hamburger

Database Management System Notes for IBPS (SO) IT Officer

By BYJU'S Exam Prep

Updated on: September 25th, 2023

A database management system is a software that is used to manage databases (DBMS). MySQL, Oracle, and other commercial DBMSs are examples of popular commercial DBMSs that are used in a variety of applications.

A DBMS is the acronym of the Data Base Management System is a collection of interrelated data and a set of programs to access those data. It is an important chapter for the upcoming IBPS SO exam 2021.

A DBMS is the acronym of the Data Base Management System is a collection of interrelated data and a set of programs to access those data. It manages a new large amount of data and supports efficient access to new large amounts of data.

  • Data (which is large and is being frequently modified)
  • Structure of data (which is small and stable in time)

You must go through the IBPS SO Exam Pattern and IBPS SO Syllabus before going forward.

Features of Database

  • Faithfulness: The design and implementation should be faithful to the requirements.
  • Avoid Redundancy: This value is important because of redundancy.
  • Simplicity: Simplicity requires that the design and implementation avoid introducing more elements than are absolutely necessary.
  • Right kind of element: Attributes are easier to implement but entity sets are relationships are necessary to ensure that the right kind of element is introduced.

You can practice Mock tests in the BYJU’S Exam Prep IBPS SO test series.

Types of Database

  • Centralized Database: All data is located at a single site.
  • Distributed Database: The database is stored on several computers.

The information contained in a database is represented on two levels:

image001

Database Management System (DBMS) provides efficient, reliable, convenient and safe multi-user storage of and access to massive amounts of persistent data.

image002

Key People Involved in a DBMS

  • DBMS Implementer: Person who builds a system
  • Database Designer: Person responsible for preparing external schemas for applications, identifying and integrating user needs into a conceptual (or community or enterprise) schema.
  • Database Application Developer: Person responsible for implementing database application programs that facilitate data access for end-users.
  • Database Administrator: Person responsible for define the internal schema, sub-schemas (with database designers) and specifying mappings between schemas, monitoring database usage and supervising DBMS functionality (e.g., access control, performance optimisation, backup and recovery policies, conflict management).
  • End Users: Users who query and update the database through fixed programs (invoked by non-programmer users) e.g., banking.

Levels of Data Abstraction

A 3-tier architecture separates its tiers from each other based on the complexity of the users and how they use the data present in the database. It is the most widely used architecture to design a DBMS.

  • Physical Level: It is the lowest level of abstraction and describes how the data are actually stored and complex low-level data structures in detail.
  • Logical Level: It is the next higher level of abstraction and describes what data are stored and what relationships exist among those data. At the logical level, each such record is described by a type definition and the interrelationship of these record types is defined as well. Database administrators usually work at this level of abstraction.
  • View Level: It is the highest level of abstraction and describes only part of the entire database and hides the details of the logical level.

Schema

A schema is also known as the database schema. It is a logical design of the database and a database instance is a snapshot of the data in the database at a given instant of time. A relational schema consists of a list of attributes and their corresponding domains.

Types of Schemas: It can be classified into three parts, according to the levels of abstraction 

  • Physical/Internal Schema: Describes the database design at the physical level.
  • Logical/Conceptual Schema/Community User View: Describes the database design at the logical level.
  • Sub-schemas /View/External Schema: Describes different views of the database views may be queried combined in queries with base relations, used to define other views in general not updated freely.

You can attend sessions by exam-qualified experts at IBPS SO Online Coaching. 

Data Model

A data model is a plan for building a database. Data models define how data is connected to each other and how they are processed and stored inside the system.

Two widely used data models are:

  • Object-based logical model
  • Record based logical model

What is Entity?

An entity may be an object with a physical existence or it may be an object with a conceptual existence. Each entity has attributes. A thing (animate or inanimate) of independent physical or conceptual existence and distinguishable. In the University database context, an individual student, faculty member, a classroom, a course are entities.

What are Attributes?

Each entity is described by a set of attributes/properties.

Types of Attributes

  • Simple Attributes: having atomic or indivisible values. example: Dept – a string Phone Number – an eight-digit number
  • Composite Attributes: having several components in the value. example: Qualification with components (Degree Name, Year, University Name)
  • Derived Attributes: Attribute value is dependent on some other attribute. example: Age depends on the Date Of Birth. So age is a derived attribute.
  • Single-valued: having only one value rather than a set of values. for instance, Place Of Birth – single string value.
  • Multi-valued: having a set of values rather than a single value. for instance, Courses Enrolled attribute for student Email Address attribute for student Previous Degree attribute for the student.
  • Attributes can be: simple single-valued, simple multi-valued, composite single-valued or composite multi-valued.

Keys

  • A super key of an entity set is a set of one or more attributes whose values uniquely determine each entity.
  • A candidate key of an entity set is a minimal Online Classroom Program key
  • Customer-id is candidate key of customer
  • Account-number is candidate key of the account
  • Although several candidate keys may exist, one of the candidate keys is selected to be the primary key.

Keys for Relationship Sets

The combination of primary keys of the participating entity sets forms an Online Classroom Program key of a relationship set.

(customer-id, account-number) is the super key of the depositor

  • NOTE: this means a pair of entity sets can have at most one relationship in a particular relationship set.
  • If we wish to track all access-dates to each account by each customer, we cannot assume a relationship for each access. We can use a multivalued attribute though.
  • Must consider the mapping cardinality of the relationship set when deciding what are the candidate keys.
  • We need to consider the semantics of the relationship set in selecting the primary key in case of more than one candidate key.

ER Modeling

The Entity-Relationship model (ER model) in software engineering is an abstract way to describe a database. Describing a database usually starts with a relational database, which stores data in tables.

Notations/Shapes in ER Modeling

image001

 

Notations/Shapes in ER Modeling: 

image002

The overall logical structure of a database can be expressed graphically by an E-R diagram. The diagram consists of the following major components.

  • Rectangles: represent entity sets.
  • Ellipses: represent attributes.
  • Diamonds: represents relationship sets.
  • Lines: links attribute set to entity set and entity set to relationship set.
  • Double ellipses: represent multi-valued attributes.
  • Dashed ellipses: denote derived attributes.
  • Double lines: represent the total participation of an entity in a relationship set.
  • Double rectangles: represent weak entity sets.

You need to solve the IBPS SO Previous year paper to understand the types of questions asked from this chapter.

Mapping Cardinalities / Cardinality Ratio / Types of Relationship

Expresses the number of entities to which another entity can be associated via a relationship set. For a binary relationship set R between entity sets A and B, the mapping cardinality must be one of the following:

  • One to One: An entity in A is associated with at most one entity in B and an entity in B is associated with at most one entity in A.
  • One to Many: An entity in A is associated with any number (zero or more) of entities; in B. An entity in B, however, can be associated with at most one entity in A.
  • Many to Many: An entity in A is associated with any number (zero or more) c entities in B and an entity B is associated with any number (zero or more) of entities in A.

image003

Specialization: Consider an entity set person with attributes name, street, and city, A person may be further classified as one of the following: Customer, and Employee. Each of these personality types is described by a set of attributes 1 includes all the attributes of entity set person plus possibly additional attributes. The process of designating subgroupings within an entity set is called specialization.

The specialization of person allows us to distinguish among persons according to whether they are employees or customers,

The refinement from an initial entity set into successive levels of entity subgroupings represents a top-down design process in which distinctions are made explicitly.

Generalization: Basically generalization is a simple inversion of specialization. Some common attributes of multiple entity sets are chosen to create a higher-level entity set. If the customer entity set and the employee entity set are having several attributes in common, then this commonality can be expressed by generalization.

Here, a person is the higher-level entity set and the customer and employee are lower-level entity sets. Higher and lower-level entity sets also may be designated by the terms superclass and subclass, respectively.

Aggregation: Aggregation is used when we have to model a relationship involving an entity set and a relationship set. Aggregation is an abstraction through which relationships are treated as higher-level entities.

Relational Algebra

A Relational model is completely based on relational algebra. It consists of a collection of operators that operate on relations. Its main objective is data retrieval. It is more operational and very much useful to represent execution plans, while relational calculus is non-operational and declarative. Here, declarative means users define queries in terms of what they want, not in terms of how to compute it.

Basic Operation in Relational Algebra

The operations in relational algebra are classified as follows.

Selection (σ): The select operation selects tuples/rows that satisfy a given predicate or condition. We use (σ) to denote selection. The predicate/condition appears as a subscript to σ.

Projection (π): It selects only required/specified columns/attributes from a given relation/table. Projection operator eliminates duplicates (i.e., duplicate rows from the result relation).

Union (): It forms a relation from rows/tuples which are appearing in either or both of the specified relations. For a union operation, R ∪ S to be valid, below two conditions must be satisfied.

  • The relations Rand S must be of the same entity i.e., they must have the same number of attributes.
  • The domains. of i th attribute of R and i th attribute of S must be the same, for all i.

Intersection (): It forms a relation of rows/ tuples which are present in both the relations R and S. Ensure that both relations are compatible for union and intersection operations.

Set Difference (-): It allows us to find tuples that are in one relation but are not in another. The expression R – S produces a relation containing those tuples in R but not in S.

Cross Product/Cartesian Product (×): Assume that we have n1 tuples in R and n2 tuples in S. Then, there are n1 * n2 ways of choosing a pair of tuples; one tuple from each relation. So, there will be (n1 * n2) tuples in result relation P if P = R × S.

y conditions to be satisfied by the data values in the relational instances so that the set of data values constitute a meaningful database.

Integrity Constraints

There are four types of Integrity constraints

Domain Constraint: The value of the attribute must be within the domain.

Key Constraint: Every relation must have a primary key.

Entity Integrity Constraint: The primary key of relation should not contain NULL values.

Referential Integrity Constraint: In the relational model, two relations are related to each other over the basis of attributes. Every value of referencing attributes must be NULL or be available in the referenced attribute.

There Schema Refinement/Normalization

Decomposition of complex records into simple records. Normalization reduces redundancy using the non-loss decomposition principle.

Decomposition

Splitting a relation R into two or more sub relation R1 and R2. A fully normalized relation must have a primary key and a set of attributes.

Decomposition should satisfy: (i) Lossless join, and (ii) Dependency preserverance

Lossless Join Decomposition

Join between the sub relations should not create any additional tuples or there should not be a case such that more number of tuples in R1 than R2

R ⊆ R1  R2 ⇒ (Lossy)

R ≡ R1  R2 ⇒ (Lossless)

Dependency Preservence: Because of decomposition, there must not be loss of any single dependency.

Functional Dependency (FD): Dependency between the attribute is known as functional dependency. Let R be the relational schema and X, Y be the non-empty sets of attributes and t1, t2, … ,tn are the tuples of relation R. X → Y {values for X functionally determine values for Y}

Trivial Functional Dependency: If X ⊇ Y, then X → Y will be trivial FD.

image002

Here, X and Y are a set of attributes of a relation R.

In trivial FD, there must be a common attribute at both sides of ‘→’ arrow.

Non-Trivial Functional Dependency: If X ∩ Y = φ (no common attributes) and X → Y satisfies FD, then it will be a non-trivial FD.

(no common attribute at either side of ‘→’ arrow)

image003

Case of semi-trivial FD

Sid → Sid Sname (semi-trivial)

Because of decomposition, we will get

Sid → Sid (trivial FD) and

Sid → Sname (non-trivial FD)

Properties of Functional Dependence (FD)

  • Reflexivity:  If X ⊇ Y, then X → Y (trivial)
  • Transitivity:  If X → Y and Y → Z, then X → Z
  • Augmentation: If X → Y, then XZ → YZ
  • Splitting or Decomposition: If X → YZ, then X → Y and X → Z
  • Union: If X → Y and X → Z, then X → YZ

Attribute Closure: Suppose R(X, Y, Z) be a relation having a set of attributes i.e., (X, Y, Z), then (X+) will be an attribute closure which functionally determines other attributes of the relation (if not all then at least itself).

Normal Forms/Normalization

In relational database design, normalization is the process for organizing data to minimize redundancy. Normalization usually involves dividing a database into two or more tables and defining the relationship between the tables. The normal forms define the status of the relation about the individuated attributes. There are five types of normal forms

First Normal Form (1NF): Relation should not contain any multivalued attributes or relation should contain atomic attributes. The main disadvantage of 1NF is high redundancy.

Second Normal Form (2NF): Relation R is in 2NF if and only if R should be in 1NF, and R should not contain any partial dependency.

Partial Dependency: Let R be the relational schema having X, Y, A, which are a non-empty set of attributes, where X = Any candidate key of the relation, Y = Proper subset of any candidate key, and A = Non-prime attribute (i.e., A doesn’t belong to any candidate key)

image004

In the above example, X → A already exists and if Y → A will exist, then it will become a partial dependency, if and only if

  • Y is a proper subset of a candidate key.
  • A should be a non-prime attribute.

If any of the above two conditions fail, then Y → A will also become a fully functional dependency.

Full Functional Dependency: A functional dependency P → Q is said to be a fully functional dependency if removal of any attribute S from P means that the dependency doesn’t hold anymore.

(Student_Name, College_Name → College_Address)

Suppose, the above functional dependency is a fully functional dependency, then we must ensure that there are no FDs as below.

(Student_Name → College_Address)

or (College_Name → Collage_Address)

Third Normal Form (3NF): Let R be a relational schema, then any non-trivial FD X → Y over R is in 3NF, if X should be a candidate key or Online Classroom Program key or Y should be a prime attribute.

  • Either both of the above conditions should be true or one of them should be true.
  • R should not contain any transitive dependency.
  • For a relation schema R to be a 3NF, it is necessary to be in 2NF.

Transitive Dependency: A FD, P → Q in a relation schema R is transitive if

  • There is a set of attributes Z that is not a subset of any key of R.
  • Both X → Z and Z → Y hold

image005

  • The above relation is in 2NF.
  • In relation to R1, C is not a candidate key and D is a non-prime attribute. Due to this, R1 fails to satisfy the 3NF condition. Transitive dependency is present here.

AB → C and C → D, then AB → D will be transitive.

Boycee Codd Normal Form (BCNF): Let R be the relation schema and X → Y be the any non-trivial FD over R is in BCNF if and only if X is the candidate key or Online Classroom Program key.

image006

If R satisfies this dependency, then of course it satisfy 2NF and 3NF.

image007

Summary of 1 NF, 2 NF and 3 NF:

image008

Fourth Normal Form (4NF): 4NF is mainly concerned with multivalued dependency A relation is in 4NF if and only if for every one of its non-trivial multivalued dependencies X →→Y, X is a Online Classroom Program key (i.e., X is either a candidate key or a superset).

Fifth Normal Form (5NF): It is also ‘known as Project Join Normal Form (PJ/NF). 5NF reduces redundancy in relational database recording multivalued facts by isolating semantically related multiple relationships. A table or relation is said to be in the 5NF, if and only if every join dependency in it is implied by the candidate keys.

What is SQL?

Structured Query language (SQL) is a language that provides an interface to relational database systems. SQL was developed by IBM in 1970, for use in system R and is a defector standard, as well as an ISO and ANSI standard.

  • To deal with the above database objects, we need a programming language and that programming language is known as SQL.

Three subordinate languages of SOL are :

Type of SQL Statement

SQL Keyword

Function

Data Definition Language(DDL)

CREATE

ALTER

DROP

TRUNCATE

Used to define change and drop the structure of a table

 

Used to remove all rows from a table

Data manipulation language(DML)

SELECT

INSERT INTO

UPDATE

DELETE  FROM

Used to enter, modify, delete and retrieve data from a table

Data Control Language (DCL)

GRANT

REVOKE

 

COMMIT

ROLLBACK

Used to provide control over the data in a database

 

Used to define the end of a transaction

Data Definition Language (DDL) : 

It includes the commands as 

  • CREATE To create tables in the database.
  • ALTER To modify the existing table structure:
  • DROP To drop the table with table structure.

Data Manipulation Language(DML) :

It is used to insert, delete, update data and perform queries on these tables. Some of the DML commands are given below.

  • INSERT To insert data into the table.
  • SELECT To retrieve data from the table.
  • UPDATE To-update existing data in the table.
  • DELETE To delete data from the table.

Data Control Language (DCL) :

It is used to control user’s access to the database objects. Some of the DCL commands are:

  • GRANT Used to grant select/insert/delete access.
  • REVOKE Used to revoke the provided access

Transaction Control Language (TCL): It is used to manage changes affecting the data.

  • COMMIT To save the work which is done, such as inserting or updating or deleting data to/from the table.
  • ROLLBACK To restore the database to the original state, since the last commit.
  • SQL Data Types SQL data types specify the type, size, and format of data/information that can be stored in columns and variables.

Constraint Types with Description

Default Constraint: It is used to insert a default value into a column if no other value is specified at the time of insertion.

Syntax

CREATE TABLE Employee

{

Emp_idint NOT NULL,

Last_Name varchar (250),

City varchar (50)OEFAULT *BANGALURU*

}

DDL Commands

  1. CREATE TABLE < Tab1e_Name>
    {
      Co1umn_name 1< data_type >,
      Column_name 2 < d’lta_type >
    }
  2. ALTER TABLE < Table_Name>
    ALTER Column < Column_Name> SET NOT NULL
  3. RENAME < object_type >object_name > to
  4. DROP TABLE

DML Commands

SELECT A1, A2, A3……,An what to return

FROM R1, R2, R3, ….., Rm relations or table

WHERE condition filter condition i.e., on what basis, we want to restrict the outcome/result.

If we want to write the above SQL script in the form of relational calculus, we use the following syntax

Comparison operators which we can use in filter condition are (=, >, <, > = , < =, < >,) ‘< >’ means not equal to.

INSERT Statement: Used to add row (s) to the tables in a database

INSERT INTO Employee (F_Name, L_Name) VALUES (‘Atal’, ‘Bihari’)

UPDATE Statement: It is used to modify/update or change existing data in a single row, group of rows or all the rows in a table.

Example:

//Updates some rows in a table.

UPDATE Employee

SET City = ‘LUCKNOW’

WHERE Emp_Id BETWEEN 9 AND 15;

//Update city column for all the rows

UPDATE Employee SET City=’LUCKNOW’;

DELETE Statement: This is used to delete rows from a table,

Example:

//Following query will delete all the rows from Employee table

DELETE Employee

Emp_Id=7;

DELETE Employee

ORDER BY Clause: This clause is used to, sort the result of a query in a specific order (ascending or descending), by default sorting order is ascending.

SELECT Emp_Id, Emp_Name, City FROM Employee

WHERE City = ‘LUCKNOW’

ORDER BY Emp_Id DESC;

GROUP BY Clause: It is used to divide the result set into groups. Grouping can be done by a column name or by the results of computed columns when using numeric data types.

  • The HAVING clause can be used to set conditions for the GROUPBY clause.
  • HAVING clause is similar to the WHERE clause, but having puts conditions on groups.
  • WHERE clause places conditions on rows.
  • WHERE clause can’t include aggregate: function while HAVING conditions can do so.

Example:

SELECT Emp_Id, AVG (Salary)

FROM Employee

GROUP BY Emp_Id

HAVING AVG (Salary) > 25000;

Aggregate Functions

Joins:  Joins are needed to retrieve data from two tables’ related rows on the basis of some condition which satisfies both the tables. Mandatory condition to join is that at least one set of column (s) should be taking values from the same domain in each table.

Inner Join: Inner join is the most common join operation used in applications and can be regarded as the default join-type. Inner join creates a new result table by combining column values of two tables (A and B) based upon the join-predicate. These may be further divided into three parts.

  1. Equi Join (satisfies equality condition)
  2. Non-Equi Join (satisfies non-equality condition)
  3. Self Join (one or more column assumes the same domain of values).

Outer Join: An outer join does not require each record in the two joined tables to have a matching record. The joined table retains each record-even if no other matching record exists.

Considers also the rows from the table (s) even if they don’t satisfy the joining condition

(i) Right outer join (ii) Left outer join (iii) Full outer join

image010

Left Outer Join: The result of a left outer join for table A and B always contains all records of the left table (A), even if the join condition does not find any matching record in the right table (B).

Result set of T1 and T2

Right Outer Join: A right outer closely resembles a left outer join, except with the treatment of the tables reversed. Every row from the right table will appear in the joined table at least once. If no matching with the left table exists, NULL will appear.

image011

Result set of T1 and T2

image012

Full Outer Join: A full outer join combines the effect of applying both left and right outer joins where records in the FULL OUTER JOIN table do not match, the result set will have NULL values for every column of the table that lacks a matching row for those records that do match, as single row will be produced in the result set.

image013

Result set of T1 and T2 (Using tables of previous example)

image014

Cross Join (Cartesian product): Cross join returns the Cartesian product of rows from tables in the join. It will produce rows that combine each row from the first table with each row from the second table.

Select * FROM T1, T2

Number of rows in result set = (Number of rows in table 1 × Number of rows in table 2)

Result set of T1 and T2 (Using previous tables T1 and T2)

image015

Structure Storage

The storage structure can be divided into two categories:

Volatile storage: As the name suggests, volatile storage cannot survive system crashes. Volatile storage devices are placed very close to the CPU; normally they are embedded onto the chipset itself. For example, the main memory and cache memory are examples of volatile storage. They are fast but can store only a small amount of information.

Non-volatile storage: These memories are made to survive system crashes. They are huge in data storage capacity, but slower in accessibility. Examples may include hard-disks, magnetic tapes, flash memory, and non-volatile (battery backed up) RAM.

File Organisation:

The database is stored as a collection of files. Each file is a sequence of records. A record is a sequence of fields. Data is usually stored in the form of records. Records usually describe entities and their attributes. e.g., an employee record represents an employee entity and each field value in the record specifies some attributes of that employee, such as Name, Birth-date, Salary or Supervisor.

Allocating File Blocks on Disk: There are several standard techniques for allocating the blocks of a file on disk

  • Contiguous Allocation: The file blocks are allocated to consecutive disk blocks. This makes reading the whole file very fast.
  • Linked Allocation: In this, each file contains a pointer to the next file block.
  • Indexed Allocation: Where one or more index blocks contain pointers to the actual file blocks.

Files of Unordered Records (Heap Files): In the simplest type of organization records are placed in the file in the order in which they are inserted, so new records are inserted at the end of the file. Such an organization is called a heap or pile file.

This organization is often used with additional access paths, such as the secondary indexes.

In this type of organization, inserting a new record is very efficient. Linear search is used to search a record.

Files of Ordered Records (Sorted Files): We can physically order the records of a file on disk based on the values of one of their fields called the ordering field. This leads to an ordered or sequential file. If the ordering field is also a key field of the file, a field guaranteed to have a unique value in each record, then the field is called the ordering key for the file. Binary searching is used to search a record.

Indexing Structures for Files: Indexing mechanisms are used to optimize certain accesses to data (records) managed in files. e.g., the author catalog in a library is a type of index. Search key (definition) an attribute or combination of attributes used to look-up records in a file.

An index file consists of records (called index entries) of the form.

image002

Index files are typically much smaller than the original file because only the values for the search key and pointer are stored. The most prevalent types of indexes are based on ordered files (single-level indexes) and tree data structures (multilevel indexes).

Types of Single Level Ordered Indexes: In an ordered index file, index entries are stored sorted by the search key value. There are several types of ordered Indexes

Primary Index: A primary index is an ordered file whose records are of fixed length with two fields. The first field is of the same data type as the ordering key field called the primary key of the data file and the second field is a pointer to a disk block (a block address).

  • There is one index entry in the index file for each block in the data file.
  • Indexes can also be characterized as dense or sparse.
  • Dense index A dense index has an index entry for every search key value in the data file.
  • Sparse index A sparse index (non-dense), on the other hand, has index entries for only some of the search values.
  • A primary index is a non-dense (sparse) index since it includes an entry for each disk block of the data file rather than for every search value.

Clustering Index: If file records are physically ordered on a non-key field which does not have a distinct value for each record that field is called the clustering field. We can create a different type of index, called a clustering index, to speed up retrieval of records that have the same value for the clustering field.

  • A clustering index is also an ordered file with two fields. The first field is of the same type as the clustering field of the data file.
  • The record field in the clustering index is a block pointer.
  • A clustering index is another example of a non-dense index.

Secondary Index: A secondary index provides a secondary means of accessing a file for which some primary access already exists. The secondary index may be on a field which is a candidate key and has a unique value in every record or a non-key with duplicate values. The index is an ordered file with two fields. The first field is of the same data type as some non-ordering field of the data file that is an indexing field. The second field is either a block pointer or a record pointer. A secondary index usually needs more storage space and longer search time than does a primary index.

Multilevel Indexes: The idea behind a multilevel index is to reduce the part of the index. A multilevel index considers the index file, which will be referred to now as the first (or base) level of a multilevel index. Therefore, we can create a primary index for the first level; this index to the first level is called the second level of the multilevel index and so on.

Dynamic Multilevel Indexes Using B-Trees and B+ -Trees: There are two multilevel indexes

B-Trees

  • When data volume is large and does not fit in memory, an extension of the binary search tree to the disk-based environment is the B-tree.
  • In fact, since the B-tree is always balanced (all leaf nodes appear at the same level), it is an extension of the balanced binary search tree.
  • The problem which the B-tree aims to solve is given a large collection of objects, each having a key and a value, design a disk-based index structure which efficiently supports query and update.
  • A B-tree of order p, when used as an access structure on a key field to search for records in a data file, can be defined as follows
    1. Each internal node in the B-tree is of the form

      where q ≤ p
      Each Pi is a tree pointer to another node in the B-tree.
      Each  is a data pointer to the record whose search key field value is equal to Kj.

    2. Within each node, K1 < K2 < …. < Kq–1
    3. Each node has at most p tree pointers.
    4. Each node, except the root and leaf nodes, has at least [(p/2)] tree pointers.
    5. A node within q tree pointers q ≤ p, has q – 1 search key field values (and hence has q –1 data pointers).
      e.g., A B-tree of order p = 3. The values were inserted in the order 8, 5, 1, 7, 3, 12, 9, 6.

B+ Trees

  • It is the variation of the B-tree data structure.
  • image005
  • In a B-tree, every value of the search field appears once at some level in the tree, along with a data pointer. In a B+-tree, data pointers are stored only at the leaf nodes of the tree. Hence, the structure of the leaf nodes differs from the structure of internal nodes.
  • The pointers in the internal nodes are tree pointers to blocks that are tree nodes whereas the pointers in leaf nodes are data pointers.
  • B+ Tree’s Structure: The structure of the B+-tree of order p is as follows
    1. Each internal node is of the form < Pl, K1, P2, K2, …. ,Pq–1, Kq–1, Pq> where, q ≤ P and each Pi is a tree pointer.
    2. Within each internal node, K1 < K2 < K3…. < Kq–1.
    3. Each internal node has at most p tree pointers and except root, has at least [(p/ 2)] tree pointers.
    4. The root node has at least two tree pointers if it is an internal node.
    5. Each leaf node is of the form:  where, q ≤ p, each is a data pointer and Pnext points to the next leaf node of the B+-trees.

Also, you can check:

IBPS SO 2021 Salary IBPS SO 2021 Selection Process
IBPS SO 2021 Eligibility Criteria IBPS SO 2021 Notification
IBPS SO Cut-off IBPS SO Exam Pattern

If you are aiming to crack IBPS SO 2021, then join the Online Classroom Program where you will get:

  • Live Courses by Top Faculty
  • Daily Study Plan 
  • Comprehensive Study Material 
  • Latest Pattern Test Series 
  • Complete Doubt Resolution 
  • Regular Assessments with Report Card

Database Management System Notes for IBPS (SO) IT Officer

Click here to access Test Series

Why BYJU’S Exam Prep Test Series?

  • Get better at every level of your preparation with unlimited test series, based on the latest exam pattern. 
  • Assess areas of weaknesses & track improvements with in-depth performance analysis

Download the BYJU’S Exam Prep App Now. 

The most comprehensive exam prep app.

#DreamStriveSucceed

IBPS-SO :: IT Officer

Our Apps Playstore
POPULAR EXAMS
SSC and Bank
Other Exams
GradeStack Learning Pvt. Ltd.Windsor IT Park, Tower - A, 2nd Floor, Sector 125, Noida, Uttar Pradesh 201303 help@byjusexamprep.com
Home Practice Test Series Premium