A3.2.3 Outline the different data types used in relational databases.

A3.2.3 Outline the different data types used in relational databases. 
• The importance of data type consistency 
• The potential effects of choosing the wrong data type

 

The Big Idea

Relational databases support multiple data types to handle different kinds of information—numbers, text, dates, binary data, and structured formats. Choosing the right type helps ensure data integrity, query performance, and storage efficiency.


A. Numeric Types

These store numbers, either whole or fractional.

  • INTEGER (or INT): Stores whole numbers, e.g. 5, -100. Typically 4 bytes.
  • SMALLINT: Like INTEGER but smaller range. Uses less storage (2 bytes).
  • BIGINT: For very large integers. Uses more storage (8 bytes).
  • DECIMAL(p, s) or NUMERIC(p, s): Stores exact fractional numbers with precision.
    • p = precision (total number of digits)
    • s = scale (digits after decimal point)
    • E.g., DECIMAL(5,2) can store values like 123.45. Ideal for money.
  • FLOAT, REAL, DOUBLE PRECISION: Store approximate decimal numbers using floating-point representation.
    • Good for scientific calculations, but may introduce rounding errors.

B. Character/String Types

These store textual data like names, addresses, or descriptions.

  • CHAR(n): Fixed-length string.
    • Always uses n characters—even if the value is shorter (padded with spaces).
    • E.g., CHAR(10) stores 'dog' as 'dog '.
  • VARCHAR(n): Variable-length string, up to n characters.
    • More flexible and space-efficient than CHAR.

Note: Use CHAR only when strings are a fixed length (e.g., country codes like 'PL' or 'US').


C. Date and Time Types

Used to store calendar-based and time-related values.

  • DATE: Stores a calendar date (YYYY-MM-DD).
    • E.g., '2025-05-26'
  • TIME: Stores only the time of day (e.g., '14:30:00')
  • TIMESTAMP: Stores both date and time, often down to microseconds.
    • May include time zone information depending on the system.

These types are critical for applications that log activity, schedule events, or track transactions.


D. Boolean Type

Stores true/false values.

  • BOOLEAN: Logical value (TRUE, FALSE)
    • Often stored internally as 1 (true) or 0 (false)
    • Useful for flags like is_active, has_paid, etc.

E. Binary Types

Used for storing raw byte data—not human-readable.

  • BINARY(n): Fixed-length byte arrays.
  • VARBINARY(n): Variable-length byte arrays up to n bytes.
  • BLOB (Binary Large Object): For very large binary content like images, audio files, PDFs, etc.
    • Usually stored outside the main table and referenced internally.

F. Specialized Types (Vendor-specific)

These are advanced or non-standard types offered by specific database systems.

  • UUID / GUID: Universally Unique Identifiers, like '550e8400-e29b-41d4-a716-446655440000'
    • Often used as primary keys when distributed uniqueness is needed.
  • ARRAY: Stores a list of values of the same type.
    • E.g., PostgreSQL supports INTEGER[] or TEXT[].
  • JSON / XML: Allows storage of semi-structured data inside a relational table.
    • E.g., storing metadata, configuration settings, or nested structures.

These types support flexibility and interoperability, especially in modern web applications and APIs.


Summary Table

CategoryExample TypesTypical Use Case
NumericINT, DECIMAL, FLOATCounting, measurement, financial values
TextualCHAR, VARCHARNames, addresses, descriptions
Date and TimeDATE, TIME, TIMESTAMPScheduling, event logging
BooleanBOOLEANFlags and status indicators
BinaryBLOB, BINARY, VARBINARYMultimedia, documents, encrypted data
Specialized (vendor)UUID, JSON, XML, ARRAYAdvanced or flexible data storage

2. The Importance of Data Type Consistency

  • Data Integrity
    Enforcing consistent types prevents storing invalid values (e.g., text in a numeric column), ensuring that each column’s data adheres to its intended domain.
  • Storage Efficiency
    Fixed-length types (CHAR, INT) and appropriate precision (DECIMAL) minimize wasted space and improve cache performance.
  • Query Performance
    The database optimizer relies on type information to choose efficient execution plans; mismatched types can lead to implicit conversions and slow scans.
  • Constraint Enforcement
    Types underlie CHECK constraints, range enforcement, and arithmetic correctness.

3. Potential Effects of Choosing the Wrong Data Type

  1. Precision Loss
    • Using a FLOAT for financial data may introduce rounding errors; DECIMAL is preferable for exact arithmetic.
  2. Storage Waste or Truncation
    • Defining VARCHAR(255) for a field that never exceeds 20 characters wastes space; conversely, setting CHAR(10) for a 15-character field truncates data.
  3. Runtime Errors and Conversion Overhead
    • Inserting a string like “123ABC” into an INT column can cause failures; implicit casts during joins or filters degrade performance.
  4. Index Inefficiency
    • Indexes on wide or variable-length columns consume more resources and may be less selective.
  5. Loss of Semantic Meaning
    • Storing dates as TEXT forfeits built-in date arithmetic and validation features.

Choosing the wrong type risks precision errors, wasted space, slower queries and outright failures during data manipulation—undermining the very purpose of a relational database.