DICOM: Digital Imaging and Communications in Medicine

This article is not assessed by the IB but may be helpful to deepen your understanding. Plus, I think it's cool.

The Big Idea

In medical imaging, DICOM (Digital Imaging and Communications in Medicine) is the global standard for storing, transmitting, and interpreting images such as MRI, CT, PET, and X-ray scans. It defines not only the file format but also the network communication protocol that allows medical systems—scanners, workstations, and PACS servers—to interoperate. The technical depth of DICOM lies in its combination of image pixel data and rich metadata, which together describe exactly how the image should be rendered, interpreted, and associated with patient and study information.

 

1. Structure of a DICOM File

A DICOM file integrates binary image data (pixel values) and descriptive metadata (headers). The structure can be visualized as:

[128-byte preamble]
[4-byte "DICM" magic word]
[Data Elements]
    ├── Patient information
    ├── Study and series descriptors
    ├── Image acquisition parameters
    └── Pixel Data (binary matrix)
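  • Preamble: 128 bytes reserved for compatibility with non-DICOM systems. DICOM readers do not interpret them, and they are often all zero.
  • Magic Word ("DICM"): Marks the file as a DICOM object.
  • Data Elements: Structured as Tag, Value Representation (VR), Value Length, and Value Field. The VR is written out only in Explicit VR encodings; in Implicit VR it is looked up from the tag.
  • Pixel Data (Tag 7FE0,0010): Contains the raw image, usually a 2D or 3D array of integer pixel intensities.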

Each data element is identified by a (Group, Element) tag written as two four-digit hexadecimal numbers (for example, (0010,0010) = Patient Name). This makes the format self-describing, machine-readable, and extensible.
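The layout above can be sketched with Python's standard struct module. This is a minimal illustration, not a full parser: it assumes Explicit VR Little Endian encoding, ignores undefined-length elements, and the sample bytes are a hypothetical in-memory example that skips the File Meta Information a real file would carry right after "DICM". The function names are made up for this sketch.

```python
import struct

# In Explicit VR encoding these VRs use a 2-byte length field;
# every other VR uses 2 reserved bytes followed by a 4-byte length.
SHORT_VRS = {b"AE", b"AS", b"AT", b"CS", b"DA", b"DS", b"DT", b"FL", b"FD",
             b"IS", b"LO", b"LT", b"PN", b"SH", b"SL", b"SS", b"ST", b"TM",
             b"UI", b"UL", b"US"}

def has_dicom_magic(data: bytes) -> bool:
    """Return True if bytes 128..131 hold the 'DICM' magic word."""
    return len(data) >= 132 and data[128:132] == b"DICM"

def read_element(data: bytes, offset: int):
    """Read one Explicit VR Little Endian data element at offset.

    Returns ((group, element), vr, value_bytes, next_offset).
    """
    group, element = struct.unpack_from("<HH", data, offset)
    vr = data[offset + 4:offset + 6]
    if vr in SHORT_VRS:
        (length,) = struct.unpack_from("<H", data, offset + 6)
        start = offset + 8
    else:
        (length,) = struct.unpack_from("<I", data, offset + 8)
        start = offset + 12
    value = data[start:start + length]
    return (group, element), vr.decode("ascii"), value, start + length

# Hypothetical sample: preamble + magic word + one Patient Name element.
name = b"DOE^JOHN"
sample = (bytes(128) + b"DICM"
          + struct.pack("<HH", 0x0010, 0x0010) + b"PN"
          + struct.pack("<H", len(name)) + name)

tag, vr, value, next_offset = read_element(sample, 132)
print(has_dicom_magic(sample))
print(f"({tag[0]:04X},{tag[1]:04X}) {vr} {value.decode('ascii')}")
```

Because each element carries its own tag and length, a reader can step from one element to the next using next_offset without knowing in advance which tags the file contains.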

 

2. Pixel Representation and Image Encoding

At its core, a DICOM image is a raster of pixels representing physical measurements such as X-ray attenuation or MRI signal intensity. These are stored as unsigned or signed integers. Storage cells are typically 8, 16, or 32 bits wide, and fewer bits may actually carry data; 12 meaningful bits inside a 16-bit cell is common for CT and MRI.

  • Rows (0028,0010) and Columns (0028,0011) define the image matrix dimensions.
  • Bits Allocated (0028,0100) defines how many bits are used for each pixel.
  • Bits Stored (0028,0101) indicates how many of those bits contain meaningful information.
  • High Bit (0028,0102) marks the position of the most significant stored bit within the allocated cell (normally Bits Stored − 1).
  • Pixel Representation (0028,0103) distinguishes between unsigned (0) and signed (1) pixel values.
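To see how these attributes work together, here is a small sketch that extracts one pixel value from its storage cell. The function name and arguments are illustrative assumptions; the logic is ordinary bit masking followed by two's-complement sign extension.

```python
def decode_stored_value(word: int, bits_stored: int, high_bit: int,
                        pixel_representation: int) -> int:
    """Extract the meaningful bits from one allocated pixel cell.

    word is the raw unsigned integer read from the cell.
    """
    low_bit = high_bit - bits_stored + 1          # position of the lowest stored bit
    mask = (1 << bits_stored) - 1
    value = (word >> low_bit) & mask              # drop unused bits
    sign_bit = 1 << (bits_stored - 1)
    if pixel_representation == 1 and value & sign_bit:
        value -= 1 << bits_stored                 # two's-complement sign extension
    return value

# 12 bits stored in a 16-bit cell, high bit 11.
print(decode_stored_value(0x0FFF, 12, 11, 0))  # 4095 (unsigned)
print(decode_stored_value(0x0FFF, 12, 11, 1))  # -1 (signed)
print(decode_stored_value(0xF800, 12, 11, 1))  # -2048 (upper 4 bits ignored)
```

The same 12 bits give very different numbers depending on Pixel Representation, which is why a reader cannot skip that attribute.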

For example, a 512 × 512 CT slice using 16-bit signed integers consumes:

512 × 512 × 16 bits = 4,194,304 bits = 524,288 bytes = 0.5 MiB per slice
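The same arithmetic as a short sketch. The function name is hypothetical, and the samples_per_pixel parameter (1 for grayscale, 3 for RGB) is an addition that the example above does not need.

```python
def uncompressed_frame_bytes(rows: int, columns: int, bits_allocated: int,
                             samples_per_pixel: int = 1) -> int:
    """Bytes needed for one uncompressed frame."""
    total_bits = rows * columns * samples_per_pixel * bits_allocated
    return (total_bits + 7) // 8  # round up to whole bytes

size = uncompressed_frame_bytes(512, 512, 16)
print(size)            # 524288
print(size / 2**20)    # 0.5 (MiB)
```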

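These values are often monochrome (grayscale). For display, a viewer maps them through a Window Center (0028,1050) and Window Width (0028,1051), which select the range of values that is spread across the screen's gray levels. Changing the window lets the same data emphasize bone, soft tissue, or lung.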
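A simplified sketch of windowing is shown here. It uses a plain linear ramp between center − width/2 and center + width/2, which is an approximation of what viewers do rather than the exact formula in the standard. The window values (center 40, width 400) are illustrative, and the function name is made up.

```python
def apply_window(value: float, center: float, width: float) -> int:
    """Map a pixel value to an 8-bit display gray level (0..255)."""
    lower = center - width / 2
    upper = center + width / 2
    if value <= lower:
        return 0        # everything below the window is black
    if value >= upper:
        return 255      # everything above the window is white
    return int((value - lower) / width * 255)

for v in (-1000, -160, 40, 140, 240, 1000):
    print(v, apply_window(v, center=40, width=400))
```

A narrow window spends all 256 gray levels on a small range of values, which increases visible contrast within that range at the cost of clipping everything outside it.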

 

A raster of pixels is a two-dimensional grid (or matrix) of discrete picture elements—pixels—each representing the brightness or color of a small area of an image.

In technical terms, a raster defines how image data is spatially organized in memory:

  • Each pixel has a fixed position in a rectangular coordinate system (row, column).
  • Pixel values are typically stored in row-major order (left-to-right, top-to-bottom).
  • Each pixel’s value corresponds to intensity (for grayscale images) or to a triplet of color components (for RGB images).

For example, a 512 × 512 grayscale raster is stored as a 2D array of 262,144 integer values—each encoding the light intensity of one point in the image.
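Row-major order means a pixel's position in the flat list of values follows directly from its row and column. A minimal sketch, with hypothetical helper names:

```python
def flat_index(row: int, column: int, columns: int) -> int:
    """Position of pixel (row, column) in a row-major flat array."""
    return row * columns + column

def row_and_column(index: int, columns: int):
    """Inverse mapping: flat index back to (row, column)."""
    return divmod(index, columns)

print(flat_index(0, 0, 512))       # 0, the top-left pixel
print(flat_index(1, 0, 512))       # 512, the first pixel of the second row
print(flat_index(511, 511, 512))   # 262143, the last of 262,144 values
print(row_and_column(513, 512))    # (1, 1)
```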

In DICOM and other scientific formats, these pixel matrices represent measured physical data (like X-ray attenuation) rather than just visual color, making the raster both a visual and quantitative representation of the scanned object.

 

3. From Pixel Data to Image

Unlike common formats such as PNG or JPEG, DICOM separates how data is stored from how it is viewed.

  • The pixel matrix encodes physical intensity values (not colors).
  • A lookup transformation maps pixel intensity to brightness on-screen.
  • Optional compression (e.g., JPEG 2000, RLE) may be applied to the Pixel Data element. The encoding in use is declared in the metadata by the Transfer Syntax UID (0002,0010), so every system knows how to decompress it.
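The metadata field that records the encoding is the Transfer Syntax UID (0002,0010). As a rough sketch, a reader could look the UID up in a table before deciding how to decode the pixel data. The table below lists only a handful of well-known UIDs, and the function name is hypothetical.

```python
# UID -> (name, pixel data is compressed?)
KNOWN_TRANSFER_SYNTAXES = {
    "1.2.840.10008.1.2": ("Implicit VR Little Endian", False),
    "1.2.840.10008.1.2.1": ("Explicit VR Little Endian", False),
    "1.2.840.10008.1.2.5": ("RLE Lossless", True),
    "1.2.840.10008.1.2.4.50": ("JPEG Baseline (Process 1)", True),
    "1.2.840.10008.1.2.4.90": ("JPEG 2000 Image Compression (Lossless Only)", True),
    "1.2.840.10008.1.2.4.91": ("JPEG 2000 Image Compression", True),
}

def describe_transfer_syntax(uid: str):
    """Return (name, is_compressed), or None if the UID is not in the table."""
    cleaned = uid.strip(" \x00")  # UIDs may be padded to an even length
    return KNOWN_TRANSFER_SYNTAXES.get(cleaned)

print(describe_transfer_syntax("1.2.840.10008.1.2.1"))
print(describe_transfer_syntax("1.2.840.10008.1.2.4.90\x00"))
print(describe_transfer_syntax("9.9.9"))  # None: unknown UID
```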

When visualized, DICOM viewers use the Rescale Intercept (0028,1052) and Rescale Slope (0028,1053) to convert raw pixel values into meaningful physical units such as Hounsfield units (HU) for CT scans:

HU = PixelValue × RescaleSlope + RescaleIntercept

This step transforms raw integer matrices into clinically interpretable images.
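The rescale formula is a one-liner in code. In this sketch the slope of 1 and intercept of −1024 are illustrative values, not read from a real scan, and the function name is made up.

```python
def to_hounsfield(stored_values, slope: float, intercept: float):
    """Apply HU = PixelValue × RescaleSlope + RescaleIntercept to each value."""
    return [v * slope + intercept for v in stored_values]

# Illustrative stored values with assumed slope 1 and intercept -1024.
print(to_hounsfield([0, 1024, 2024], slope=1.0, intercept=-1024.0))
# [-1024.0, 0.0, 1000.0]
```

On the Hounsfield scale water is 0 HU and air is about −1000 HU, so with these assumed parameters a stored value of 1024 would correspond to water.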

 

4. DICOM and Multi-Frame Data

Modern DICOM supports multi-frame objects, such as MRI time series or CT volume stacks. Each frame is stored sequentially within the same file, accompanied by per-frame functional groups describing acquisition timing, orientation, and slice position.

  • Number of Frames (0028,0008) defines how many frames exist.
  • Each frame’s data is either stored contiguously (uncompressed) or encapsulated as a compressed stream such as JPEG or JPEG 2000.
  • Pixel data for multi-slice studies may exceed hundreds of megabytes.
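For the uncompressed case, splitting a multi-frame pixel buffer into frames is simple slicing, because every frame has the same size. This sketch does not handle encapsulated (compressed) data, and the function name is hypothetical.

```python
def split_frames(pixel_data: bytes, number_of_frames: int, rows: int,
                 columns: int, bits_allocated: int, samples_per_pixel: int = 1):
    """Split contiguous uncompressed pixel data into per-frame byte strings."""
    frame_bytes = rows * columns * samples_per_pixel * bits_allocated // 8
    expected = frame_bytes * number_of_frames
    if len(pixel_data) < expected:
        raise ValueError(f"expected at least {expected} bytes, got {len(pixel_data)}")
    return [pixel_data[i * frame_bytes:(i + 1) * frame_bytes]
            for i in range(number_of_frames)]

# Toy example: three 2 × 2 frames with 8 bits per pixel.
frames = split_frames(bytes(range(12)), number_of_frames=3,
                      rows=2, columns=2, bits_allocated=8)
print(len(frames))        # 3
print(list(frames[1]))    # [4, 5, 6, 7]
```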

 

5. Metadata and Interoperability

Metadata ensures interoperability and traceability. Key categories include:

Category            Example Tag    Example Value
Patient             (0010,0010)    "DOE^JOHN"
Study               (0020,000D)    UID of study instance
Series              (0020,000E)    UID of series instance
Modality            (0008,0060)    "CT"
Image Orientation   (0020,0037)    "1\0\0\0\1\0"
Pixel Spacing       (0028,0030)    "0.625\0.625" (mm)

Each tag provides precise contextual information for reconstructing the image in correct spatial, temporal, and diagnostic context.
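Several of the example values above pack more than one item into a single string: a backslash separates multiple values, and a caret separates the parts of a person's name. A minimal sketch of unpacking them follows. The helper names are hypothetical, and name strings containing "=" (alternative character-set representations) are not handled.

```python
PN_FIELDS = ("family", "given", "middle", "prefix", "suffix")

def split_values(raw: str) -> list:
    """Split a multi-valued string on the backslash delimiter."""
    return [part.strip(" \x00") for part in raw.split("\\")]

def parse_decimals(raw: str) -> list:
    """Parse a multi-valued decimal string into floats."""
    return [float(part) for part in split_values(raw)]

def parse_person_name(raw: str) -> dict:
    """Split a caret-separated name into its named components."""
    parts = raw.strip(" \x00").split("^")
    return {field: part for field, part in zip(PN_FIELDS, parts) if part}

spacing = parse_decimals("0.625\\0.625")           # row spacing, column spacing in mm
orientation = parse_decimals("1\\0\\0\\0\\1\\0")
row_direction, column_direction = orientation[:3], orientation[3:]

print(spacing)                        # [0.625, 0.625]
print(row_direction, column_direction)
print(parse_person_name("DOE^JOHN"))  # {'family': 'DOE', 'given': 'JOHN'}
```

With a pixel spacing of 0.625 mm, a 512 × 512 slice would cover 320 mm × 320 mm, which shows how metadata turns pixel counts into physical distances.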

 

6. DICOM vs. Conventional Image Formats

Feature            DICOM                                     PNG/JPEG
Pixel data         Usually 16-bit or higher grayscale        Typically 8 bits per channel (RGB)
Metadata           Embedded in standard tags                 Minimal (EXIF optional)
Compression        Optional (lossless or lossy)              Always compressed
Multi-frame        Supported                                 Limited
Medical context    Patient, study, device metadata           Absent
Network transfer   DICOM protocol (C-STORE, C-FIND, etc.)    None defined

This distinction explains why a DICOM file cannot be “opened” like a normal image without specialized software—it must interpret both the pixel matrix and its accompanying metadata to render a meaningful view.

 

7. Technical Summary

Component               Purpose                           Example
File Meta Information   Protocol and encoding info        Transfer Syntax UID
Header Elements         Patient, study, device info       Patient ID, Modality
Image Attributes        Image geometry and calibration    Rows, Columns, Pixel Spacing
Pixel Data              Actual image matrix               512×512 16-bit signed values
Compression             Optional encapsulation            JPEG 2000 (lossy/lossless)

 

8. Command Term Example

  • Describe (weaker answer): “DICOM stores medical images with patient data.”
  • Describe (good IB-level answer): “DICOM integrates pixel matrices (Tag 7FE0,0010) with structured metadata such as acquisition parameters and patient identifiers, stored in standardized data elements. This ensures interoperability between imaging systems and consistent pixel interpretation across modalities.”

 

Summary

DICOM is more than just an image format—it is a data communication ecosystem designed for medical imaging integrity. It encodes both how each pixel value should be interpreted and who/what/where it represents. Understanding DICOM’s pixel representation, metadata architecture, and transport mechanisms is essential for any computer scientist or biomedical engineer working at the intersection of digital imaging, data systems, and clinical informatics.