Chapter 1: Database Management System (DBMS) - Grade 12

Overview of Database Management System

This chapter covers fundamental concepts of database management systems, including data organization, database models, normalization techniques, and practical SQL implementation.

Database Management System

1. What is Database and DBMS? Advantages and Disadvantages of DBMS

What is a Database?

A database is an organized collection of related data that can be easily accessed, managed, and updated. It stores data in structured formats such as tables.

What is a DBMS?

A Database Management System (DBMS) is software that interacts with users and applications to capture and analyze data, and manages the database. It allows creation, updating, querying, and administration of databases.

Advantages of DBMS:

Disadvantages of DBMS:

Terminologies in Database

Some of the important terms used in database systems are:

Data Types

Data types define the kind of values that can be stored in a table field. Common examples include:

Data Dictionary

A Data Dictionary is a repository that contains information about the structure of the database. It stores metadata such as:

It acts as a reference for database administrators and developers.

2. File Processing System vs DBMS

Feature File Processing System DBMS
Data Redundancy High; same data stored in multiple files Low; integrated data storage
Data Consistency Poor; hard to keep data consistent across files High; enforced through constraints and rules
Data Sharing Limited; data locked to specific applications Efficient; multiple users and applications can share data
Data Integrity Manual enforcement; prone to errors Automatic integrity constraints (keys, rules)
Security Basic or none Robust access controls and security mechanisms

3. Types of Database Models

Database models define the logical design and structure of a database. They specify how data is stored, related, and manipulated. The following are the most common types of database models:

1. Hierarchical Model

The hierarchical model organizes data into a tree-like structure, where each parent node can have multiple child nodes, but each child has only one parent. It is suitable for representing data with a clear hierarchy, such as organizational charts or file systems.

Hierarchical Database Model

2. Network Model

The network model represents data using nodes and set structures. Unlike the hierarchical model, a child (called a member) can have multiple parents, supporting many-to-many relationships. This model is more flexible for complex applications.

Network Database Model

3. Relational Model

The relational model organizes data into tables (relations) made of rows (tuples) and columns (attributes). Relationships between tables are established using keys. This is the most widely used model in modern database systems.

Relational Database Model

4. Entity-Relational Model (ER Model)

The Entity-Relational (ER) model represents the database in terms of entities, attributes, and relationships. It is often used at the design stage to model real-world concepts before implementation in a relational database.

Entity Relational Database Model

4. What is a Relational Database? How is it Different?

A relational database stores data in tables (relations) with rows (tuples) and columns (attributes). Each table has a primary key to uniquely identify rows.

Differences from other models:

5. What is Data Redundancy? How Does DBMS Help Reduce It?

Data redundancy means the same piece of data is stored unnecessarily in multiple places. This causes inconsistency, increased storage, and maintenance issues.

How DBMS reduces redundancy:

6. Centralized vs Distributed Database Systems

Feature Centralized Database Distributed Database
Data Location Stored at a single central location Stored across multiple physical locations
Management Managed from one site Multiple sites manage their portions
Advantages Easy to manage, consistent data Improved reliability, availability, and performance
Disadvantages Single point of failure; slower access for remote users Complex management and synchronization

7. Database Administrator (DBA) and Responsibilities

Database Administrator (DBA) is the person responsible for managing and maintaining the DBMS environment.

Major Responsibilities:

Integrity Constraints and Their Types

Integrity constraints are the rules that ensure the accuracy, validity, and consistency of data in a database. They act as a safeguard to prevent invalid or inconsistent data from being entered or maintained in the system. Without integrity constraints, databases would quickly become unreliable, as users could insert contradictory or incomplete information. These rules are enforced by the DBMS automatically whenever data is inserted, updated, or deleted.

In essence, integrity constraints maintain the quality of the data and preserve the logical relationships between tables. They are one of the key reasons why databases are more reliable than traditional file systems.

Types of Integrity Constraints

1. Domain Integrity Constraint

Domain constraints define the valid set of values for an attribute (column). They restrict the type, format, and permissible range of values that can be stored in a column. For example, the column Age should only accept positive integers, and the column Email should follow the email format.


    CREATE TABLE Students (
        StudentID INT PRIMARY KEY,
        Name VARCHAR(50) NOT NULL,
        Age INT CHECK (Age > 0 AND Age < 120)
    );
    

Here, the domain constraint ensures that Age must always be between 1 and 119.

2. Entity Integrity Constraint

Entity integrity ensures that every table has a unique identifier (primary key) and that this key value is never NULL. It guarantees that each record (row) can be uniquely identified within the table. Without this, it would be impossible to distinguish between two rows of data.


    CREATE TABLE Employees (
        EmpID INT PRIMARY KEY,
        Name VARCHAR(50),
        Department VARCHAR(50)
    );
    

The primary key EmpID ensures every employee record is unique and identifiable.

3. Referential Integrity Constraint

Referential integrity maintains consistency between related tables through the use of foreign keys. It ensures that a value in a child table must correspond to an existing value in the parent table or be NULL (if allowed). This prevents the creation of "orphan" records.


    CREATE TABLE Departments (
        DeptID INT PRIMARY KEY,
        DeptName VARCHAR(50)
    );

    CREATE TABLE Employees (
        EmpID INT PRIMARY KEY,
        Name VARCHAR(50),
        DeptID INT,
        FOREIGN KEY (DeptID) REFERENCES Departments(DeptID)
    );
    

Here, every employee must be assigned to a valid department. If DeptID = 5 does not exist in Departments, it cannot be inserted into Employees.

4. Key Integrity Constraint

Key constraints enforce the uniqueness and proper identification of rows within a table. They define which attribute(s) can be used as keys and how they enforce uniqueness across records.


    CREATE TABLE Users (
        UserID INT PRIMARY KEY,
        Username VARCHAR(50) UNIQUE,
        Email VARCHAR(100) UNIQUE
    );
    

Here, UserID is the primary key, while Username and Email must be unique for each user.

Why Integrity Constraints Matter?

Integrity constraints form the backbone of reliable databases:

8. Normalization: Definition and Normal Forms

Normalization is the process of organizing data in a database to reduce redundancy and improve data integrity.

First Normal Form (1NF):

Table is in 1NF if all columns contain atomic (indivisible) values, and there are no repeating groups.

Example:

StudentID | Name     | Subjects
--------------------------------
1         | Alice    | Math, Physics
2         | Bob      | Chemistry
    

Here, Subjects is not atomic. To convert to 1NF:

StudentID | Name  | Subject
---------------------------
1         | Alice | Math
1         | Alice | Physics
2         | Bob   | Chemistry
    

Second Normal Form (2NF):

Table is in 2NF if it is in 1NF and every non-key attribute is fully functionally dependent on the entire primary key (no partial dependency).

Example:

StudentID | CourseID | CourseName | Grade
-----------------------------------------
1         | C1       | Math       | A
1         | C2       | Physics    | B
    

If CourseName depends only on CourseID, not on the full composite key (StudentID, CourseID), separate into two tables:

Courses:
CourseID | CourseName
---------------------
C1       | Math
C2       | Physics

Grades:
StudentID | CourseID | Grade
----------------------------
1         | C1       | A
1         | C2       | B
    

Third Normal Form (3NF):

Table is in 3NF if it is in 2NF and all attributes are only dependent on the primary key (no transitive dependency).

Example:

StudentID | AdvisorID | AdvisorName
-----------------------------------
1         | 10        | Dr. Smith
2         | 11        | Dr. Jones
    

AdvisorName depends on AdvisorID, not StudentID. To convert:

Students:
StudentID | AdvisorID
---------------------
1         | 10
2         | 11

Advisors:
AdvisorID | AdvisorName
-----------------------
10        | Dr. Smith
11        | Dr. Jones
    

Advantages and Disadvantages of Normalization

Normalization is the process of organizing data in a database to reduce redundancy and improve data integrity. It involves dividing large, complex tables into smaller, well-structured tables connected through relationships. While normalization improves efficiency and accuracy, it also comes with certain trade-offs. Understanding its pros and cons is essential for designing an effective database.

Advantages of Normalization

Disadvantages of Normalization

Conclusion

Normalization is a powerful technique for ensuring data accuracy and reducing redundancy, making it an essential step in designing reliable transactional databases. However, it should be applied wisely, balancing between data consistency and system performance. In practice, databases often use partial normalization (up to 3NF or BCNF) while denormalizing selectively for performance optimization.

Centralized and Distributed Database

Introduction

A Centralized Database is stored and maintained in a single location, usually a central computer or server. All users access the database through this central system. In contrast, a Distributed Database is spread across multiple locations (servers or sites), but the data appears as a single logical database to users. Both approaches have their own strengths and weaknesses depending on system requirements.

Advantages of Centralized Database

Disadvantages of Centralized Database

Advantages of Distributed Database

Disadvantages of Distributed Database

Comparison

Aspect Centralized Database Distributed Database
Storage Location Single central server Multiple servers/sites
Reliability Low – single point of failure High – continues even if one site fails
Performance May face bottlenecks with many users Faster as data is distributed closer to users
Maintenance Easy, centralized control Complex, requires coordination across sites
Cost Cheaper to implement Expensive due to multiple sites
Security Easier to enforce Harder, multiple entry points

Database Security

Introduction

Database Security refers to the collective measures used to protect the database against threats such as unauthorized access, data breaches, corruption, or loss. Since databases often store critical and sensitive information (such as financial records, personal data, and business transactions), ensuring their security is a top priority.

Challenges in Database Security

Security Measures

Effective database security is a continuous process that combines technical measures (like encryption and firewalls), administrative policies (like role-based access), and regular monitoring to safeguard valuable data against both internal and external threats.

9. Keys in DBMS

Primary Key

A unique identifier for each record in a table. It cannot be null.

Example: StudentID in a Student table.

Foreign Key

An attribute that creates a link between two tables by referring to the primary key in another table.

Example: AdvisorID in the Student table referring to AdvisorID in Advisors table.

Candidate Key

Attributes that can uniquely identify a record. One candidate key is chosen as the primary key.

Example: Both StudentID and Email could be candidate keys, but StudentID is chosen as primary key.

10. What is SQL? Components and Common Functions

SQL (Structured Query Language) is a standard programming language used to manage and manipulate relational databases.

Components of SQL:

Common SQL Functions:

Data and Information

Terminologies in Database

Database Management System (DBMS)

DBMS Architecture

Types of Database Models

Integrity Constraints

Normalization

Centralized vs Distributed Database

Feature Centralized Distributed
Data Location Single location Multiple locations
Advantages Easy to manage, consistent Faster access, fault-tolerant
Disadvantages Single point of failure Complex to manage

Database Security

Entity-Relationship (ER) Diagrams

An Entity-Relationship (ER) Diagram is a visual representation of entities, attributes, and the relationships among them in a database system. ER Diagrams are widely used during the database design phase because they provide a clear and structured way to represent data requirements. They serve as a blueprint for creating relational databases.

Entities

An Entity represents a real-world object or concept that can be identified distinctly in the database. Entities can be physical objects (like Student, Book, Employee) or abstract concepts (like Course, Department). In ER diagrams, entities are usually represented by rectangles.

Attributes

An Attribute describes a property or characteristic of an entity. For example, the entity Student may have attributes like Student_ID, Name, Age, Address. Attributes are represented by ellipses (ovals) in ER diagrams and connected to their entities.

Relationships

A Relationship shows how entities are connected to each other. For example, a Student enrolls in a Course. Relationships are represented by diamonds in ER diagrams.

Cardinality and Degree

Cardinality defines the number of instances of one entity that can be associated with instances of another entity (e.g., 1:1, 1:N, M:N). Degree of a relationship represents the number of entities involved in the relationship:

Symbols in ER Diagrams

ER Diagrams use a standard set of symbols to represent entities, attributes, and relationships. Below are the most common symbols:

Rectangle:
Represents an Entity.
Double Rectangle:
Represents a Weak Entity.
Ellipse (Oval):
Represents an Attribute.
Double Ellipse:
Represents a Multivalued Attribute.
Dashed Ellipse:
Represents a Derived Attribute.
Diamond:
Represents a Relationship.
Line:
Connects entities with relationships or attributes.

Practical Topics: DDL and DML

SQL Data Types

Data Type Description
CHAR Fixed-length character string
VARCHAR Variable-length character string
BINARY Fixed-length binary data
VARBINARY Variable-length binary data
TINYBLOB Very small BLOB (binary large object)
TINYTEXT Very small text string
TEXT Text string
LONGTEXT Large text string
ENUM String with one value from predefined list
BIT Bit-field type
TINYINT Very small integer
BOOLEAN True or false values
INTEGER Standard integer
FLOAT Single-precision floating-point
DOUBLE Double-precision floating-point
DECIMAL Fixed-point number
DATE Date value (YYYY-MM-DD)
DATETIME Date and time combination

11. Structured Query Language (SQL)

Structured Query Language (SQL) is a standardized programming language used to manage and manipulate relational databases. It is essential for defining database structures, storing and retrieving data, and controlling access to that data. SQL is supported by all major relational database management systems (RDBMS), including MySQL, PostgreSQL, Oracle, Microsoft SQL Server, and SQLite.

Key Features of SQL:

Major Components of SQL:

Importance of SQL:

SQL Commands to Perform Operations

  1. a. Create a table named students with fields: Id, Name, Class, and Marks.
    CREATE TABLE students (
      Id INT PRIMARY KEY,
      Name VARCHAR(50),
      Class VARCHAR(10),
      Marks INT
    );
      
    Output:
    Output of SQL command a SQL Generated Table
  2. b. Insert records into the students table.
    INSERT INTO students (Id, Name, Class, Marks) VALUES
      (1, 'Anita', '12', 78),
      (2, 'Bikash', '12', 82),
      (3, 'Chandra', '12', 90),
      (4, 'Dipesh', '11', 75);
      
    Output:
    Output of SQL command b SQL Generated Table
  3. c. Display all records from the students table.
    SELECT * FROM students;
      
    Output:
    Output of SQL command c
  4. d. Update the marks of the student whose name is 'Anita' to 85.
    UPDATE students
    SET Marks = 85
    WHERE Name = 'Anita';
      
    Output:
    Output of SQL command d SQL Generated Table
  5. e. Delete the record of the student with id = 3.
    DELETE FROM students
    WHERE Id = 3;
      
    Output:
    Output of SQL command e SQL Generated Table

Key Terms and Definitions

a. Data Dictionary: A data dictionary is a centralized catalog containing detailed information about the data elements within a database system. It documents attributes such as field names, data types, constraints, relationships, and access permissions. Acting as the metadata repository, it ensures consistency, enforces standards, and assists developers, DBAs, and end-users in understanding the database structure and usage.
b. Primary Key: A primary key is a column or set of columns in a relational table that uniquely identifies each record. It guarantees entity integrity by disallowing null and duplicate values. Primary keys are fundamental to relational design, serving as the anchor point for establishing relationships and enforcing uniqueness across the dataset.
c. Relationship: A relationship represents the logical association between two or more database tables. It facilitates the linking of data using keys—typically a primary key in one table and a foreign key in another. Relationships allow for scalable data organization, minimize redundancy, and enable efficient querying across interconnected records.
d. Data Manipulation Language (DML): DML is a subset of SQL that enables users to interact with and modify data within tables. It includes statements such as SELECT, INSERT, UPDATE, and DELETE. These operations empower users to retrieve, add, modify, or remove records while preserving the underlying table structure.
e. Structured Query Language (SQL): SQL is the industry-standard language for defining, managing, and querying relational databases. It encompasses multiple components—DDL, DML, DCL, and TCL—that together support everything from schema creation and data manipulation to user access control and transaction management. SQL’s declarative nature makes it versatile for both administrative and analytical tasks.
f. Data Integrity: Data integrity refers to the accuracy, consistency, and reliability of data throughout its lifecycle. It is enforced through rules, constraints, and keys—such as primary keys, foreign keys, and validation checks—to prevent anomalies and maintain trustworthiness of the stored information, especially in multi-user environments.
g. Data Definition Language (DDL): DDL is the SQL component responsible for defining and altering the structure of database objects. It includes commands like CREATE, ALTER, DROP, and TRUNCATE. These commands define tables, indexes, views, and schemas, forming the blueprint of the database system.
h. Data Security: Data security comprises policies, protocols, and technologies used to protect databases against unauthorized access, breaches, or malicious attacks. It involves encryption, access control, user authentication, and activity logging to ensure confidentiality, integrity, and availability of sensitive data assets.
i. Database System: A database system refers to the complete ecosystem that includes the database, the database management system (DBMS), and the application interfaces that interact with them. It is designed to efficiently handle data storage, access, security, backup, and multi-user operations while ensuring performance and reliability.

SQL JOIN

JOIN operations in SQL are used to combine rows from two or more tables based on a related column between them. They are essential for retrieving data that is spread across multiple tables in a relational database.

1. INNER JOIN

The INNER JOIN returns records that have matching values in both tables. Rows without matches are excluded from the result.


-- Example:
SELECT Students.StudentID, Students.Name, Courses.CourseName
FROM Students
INNER JOIN Courses
ON Students.CourseID = Courses.CourseID;
    

Explanation: This query returns only students who are enrolled in a course (matching CourseID in both tables).

2. LEFT JOIN (LEFT OUTER JOIN)

The LEFT JOIN returns all records from the left table, and the matched records from the right table. If no match is found, NULL values are returned for columns from the right table.


-- Example:
SELECT Students.StudentID, Students.Name, Courses.CourseName
FROM Students
LEFT JOIN Courses
ON Students.CourseID = Courses.CourseID;
    

Explanation: This query shows all students, even if they are not enrolled in any course. For those students, CourseName will be NULL.

3. RIGHT JOIN (RIGHT OUTER JOIN)

The RIGHT JOIN returns all records from the right table, and the matched records from the left table. If no match is found, NULL values are returned for columns from the left table.


-- Example:
SELECT Students.StudentID, Students.Name, Courses.CourseName
FROM Students
RIGHT JOIN Courses
ON Students.CourseID = Courses.CourseID;
    

Explanation: This query shows all courses, even if no student is enrolled in them. For such courses, StudentID and Name will be NULL.

4. FULL JOIN (FULL OUTER JOIN)

The FULL JOIN returns all records when there is a match in either left or right table. If there is no match, NULL values are returned from the missing side.


-- Example:
SELECT Students.StudentID, Students.Name, Courses.CourseName
FROM Students
FULL OUTER JOIN Courses
ON Students.CourseID = Courses.CourseID;
    

Explanation: This query shows all students and all courses. If a student is not enrolled in any course, CourseName will be NULL. If a course has no students, StudentID and Name will be NULL.

Conclusion

SQL JOINs allow us to query and analyze data across multiple tables efficiently. Choosing the right type of JOIN depends on whether we want only matched records (INNER JOIN) or unmatched records as well (LEFT, RIGHT, FULL JOIN).

Learning Outcomes

Frequently Asked Questions (FAQs)

Data refers to raw, unprocessed facts and figures without context or meaning, such as numbers, characters, or symbols collected from various sources. Information is the processed, organized, and meaningful interpretation of data, which helps users make decisions and understand the context. In essence, information is data that has been given relevance and purpose.

A Database Management System (DBMS) provides numerous advantages including:

  • Reduced Data Redundancy: Minimizes duplicate data storage by centralizing data management.
  • Improved Data Consistency: Ensures that all users access the same up-to-date data, preventing discrepancies.
  • Enhanced Security: Implements access controls and permissions to protect sensitive data from unauthorized users.
  • Efficient Data Sharing: Enables multiple users and applications to access and manipulate data concurrently without conflicts.
  • Backup and Recovery: Provides mechanisms for data backup and recovery to protect against data loss.

Normalization is a systematic process in database design used to organize data to minimize redundancy and dependency. It involves dividing large tables into smaller, related tables and defining relationships between them, ensuring that each piece of data is stored only once. This process helps maintain data integrity and improves database efficiency by reducing update anomalies.

A Database Administrator (DBA) is responsible for the installation, configuration, and maintenance of the database environment. Key roles include ensuring data security, managing user access, performing backup and recovery operations, optimizing database performance, and troubleshooting issues. The DBA acts as the guardian of the data, ensuring it is accurate, available, and secure for all users and applications.