A3.2.1 Describe database schemas.

A3.2.1 Describe database schemas. 
• Conceptual schema, logical schema, physical schema 
• Abstract definitions of the data structure and organization of the data at different levels

Big idea

A database schema is a formal description of how data is organized. It defines structure, relationships, constraints, and storage strategy. Schemas exist at three levels: conceptual, logical, and physical. Each level abstracts the database at a different depth, moving from real-world modeling to hardware-level implementation.

Understanding these levels is essential for applying computational thinking to database design. Another way of thinking about this is that a database schema is a layered blueprint. Each layer answers a different question:

• What data exists in the real world? (conceptual)
• How will we structure it in a database model? (logical)
• How will it actually be stored and accessed on hardware? (physical)

These three layers form an abstraction hierarchy. Each level becomes progressively more concrete and implementation-focused.

 

Conceptual Schema (High-Level Model)

The conceptual schema describes the database from a real-world perspective. It focuses on meaning, not implementation.

It answers:
“What data exists, and how are entities related?”

It includes:

• Entities (e.g., Student, Course, Teacher)
• Attributes (e.g., Student has name, ID, date_of_birth)
• Relationships (e.g., Student enrolls in Course)
• Cardinality (one-to-many, many-to-many)
• Business rules and constraints

It does not specify:

• Tables
• Data types
• Keys
• Storage mechanisms

Example (Conceptual Schema – School System)

Entities:
Student
Course
Teacher

Relationships:
Student enrolls in Course (many-to-many)
Teacher teaches Course (one-to-many)

This level is typically represented using an ERD (Entity Relationship Diagram).

Key characteristic:
Technology-independent and DBMS-independent.

 

Logical Schema (Data Model Structure)

The logical schema translates the conceptual model into a specific data model — usually the relational model.

It answers:
“How will this data be structured in tables?”

It includes:

• Tables (relations)
• Columns (attributes)
• Primary keys
• Foreign keys
• Data types
• Constraints (NOT NULL, UNIQUE, CHECK)
• Normalization decisions (1NF, 2NF, 3NF)

It still does not define how data is physically stored.

Example (Logical Schema – Relational Model)

Table: Students
student_id INT PRIMARY KEY
name VARCHAR(50)
date_of_birth DATE

Table: Courses
course_id INT PRIMARY KEY
title VARCHAR(100)

Table: Enrollments
student_id INT (FK → Students)
course_id INT (FK → Courses)
PRIMARY KEY (student_id, course_id)

Key characteristic:
DBMS-aware but hardware-independent.

 

Physical Schema (Storage Implementation)

The physical schema describes how the database is implemented on hardware. It focuses on storage structures, indexing, file organization, and performance tuning.

It answers:
“How will this data be stored and accessed efficiently?”

It includes:

• File organization (heap file, clustered file, hash file)
• Index structures (B-tree indexes, hash indexes)
• Page size and block size
• Tablespaces
• Partitioning strategy
• Compression settings
• Replication settings
• Storage engine configuration
• Disk layout
• I/O optimization

This is the level where performance engineering happens.

Example (Physical Schema – Same School Database)

Students table stored as:
• Clustered table on student_id
• B-tree index on name
• Partitioned by enrollment_year
• Stored in tablespace “academic_data”
• Page size = 8 KB

Enrollments table:
• Hash index on course_id
• Stored on SSD for faster read access

None of this changes the logical design. It only affects performance, scalability, and retrieval speed.

Key characteristic:
Hardware-dependent and performance-focused.


How the Three Levels Relate

Conceptual → Logical → Physical

Real-world meaning → Relational structure → Disk-level implementation

This layered abstraction supports data independence:

• Changes to physical storage should not affect logical structure.
• Changes to logical structure should not affect real-world conceptual meaning.

This separation is foundational in database theory.