Inside a CPU core and a GPU core

This article is not assessed by the IB but may be helpful to deepen your understanding. Plus, I think it's cool.

The Big Idea

Both the CPU (Central Processing Unit) and the GPU (Graphics Processing Unit) are made up of cores — miniature processing engines that actually perform instructions.
Each core contains the same basic functional units:

  • Control Unit (CU) – directs the flow of data and instructions
  • Arithmetic Logic Unit (ALU) – performs calculations and logical comparisons
  • Registers – ultra-fast storage for immediate values and addresses
  • Internal Buses – connect the parts of the core to move data quickly

Where they differ is in how many of these elements they have, how they’re organized, and what they’re optimized for.

 

CPU Core Structure

A CPU core is designed for general-purpose, sequential tasks — running an operating system, managing files, performing logical decisions, and controlling overall program flow.

Inside a CPU Core

  1. Arithmetic Logic Unit (ALU)
    Performs integer arithmetic (+, −, ×, /) and logical operations (AND, OR, NOT).
  2. Control Unit (CU)
    Fetches instructions from memory, decodes them, and sends control signals to other parts of the CPU.
  3. Registers
    Store temporary values used during instruction execution. Common ones include:
    • Program Counter (PC): holds the address of the next instruction
    • Instruction Register (IR): holds the current instruction
    • Memory Address Register (MAR) and Memory Data Register (MDR): manage data transfers between memory and CPU
    • Accumulator (AC): stores intermediate arithmetic results
  4. Cache Memory
    Extremely fast memory close to the core. Cache levels (L1, L2, sometimes L3) reduce the time needed to fetch data from main memory.
  5. Pipelines and Branch Units (HL concept)
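    Modern CPU cores use pipelining: the fetch–decode–execute cycle is divided into overlapping stages, so several instructions are in different stages of execution at once. Branch units predict the outcome of conditional jumps so the pipeline can keep fetching without waiting for each decision.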
  6. Vector or SIMD Units (some CPUs)
    Handle operations on multiple data elements simultaneously — useful for multimedia or numerical work.
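
To see how the control unit, ALU, and registers listed above cooperate, here is a minimal Python sketch of the fetch–decode–execute cycle. It is a toy model, not a real processor: the four-instruction set (LOAD, ADD, STORE, HALT), the `run` function, and the memory layout are all invented for illustration.

```python
def run(memory):
    """Simulate a toy accumulator machine. The instruction set is hypothetical."""
    pc = 0  # Program Counter: address of the next instruction
    ac = 0  # Accumulator: holds intermediate arithmetic results

    while True:
        # FETCH: the address travels through the MAR, the contents through the MDR
        mar = pc
        mdr = memory[mar]
        ir = mdr   # Instruction Register: holds the current instruction
        pc += 1    # the PC now points at the next instruction

        # DECODE: the control unit splits the instruction into opcode and operand
        opcode, operand = ir

        # EXECUTE: the control unit selects the action; the ALU does the arithmetic
        if opcode == "LOAD":
            mar = operand
            mdr = memory[mar]
            ac = mdr
        elif opcode == "ADD":
            mar = operand
            mdr = memory[mar]
            ac = ac + mdr          # ALU operation
        elif opcode == "STORE":
            mar = operand
            mdr = ac
            memory[mar] = mdr
        elif opcode == "HALT":
            return ac
        else:
            raise ValueError(f"unknown opcode: {opcode}")


# Addresses 0-3 hold instructions, addresses 4-6 hold data.
program = [
    ("LOAD", 4),    # AC <- memory[4]
    ("ADD", 5),     # AC <- AC + memory[5]
    ("STORE", 6),   # memory[6] <- AC
    ("HALT", 0),
    7,              # address 4
    35,             # address 5
    0,              # address 6 (result goes here)
]

result = run(program)
print(result)      # 42
print(program[6])  # 42
```

Notice that every memory access, whether for an instruction or for data, passes through the MAR and MDR, which is why those two registers appear in both the fetch and execute steps.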

Key point:
CPU cores are individually powerful but few in number (often 4–16 per processor). Each core handles a few threads of complex, branching logic.
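
The benefit of pipelining (item 5 above) can be made concrete with a small calculation. The sketch below assumes an idealised three-stage pipeline where every stage takes exactly one clock cycle and nothing ever stalls; real pipelines have more stages and lose cycles to mispredicted branches and cache misses. The function names are made up for this example.

```python
def cycles_unpipelined(n_instructions, n_stages=3):
    """Each instruction finishes all stages before the next one starts."""
    return n_instructions * n_stages


def cycles_pipelined(n_instructions, n_stages=3):
    """After the pipeline fills, one instruction completes every cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


def pipeline_timeline(n_instructions, stages=("F", "D", "E")):
    """Return one text row per instruction; each column is one clock cycle."""
    total = cycles_pipelined(n_instructions, len(stages))
    rows = []
    for i in range(n_instructions):
        row = ["."] * total
        for s, name in enumerate(stages):
            row[i + s] = name  # instruction i enters stage s at cycle i + s
        rows.append("".join(row))
    return rows


print(cycles_unpipelined(4))  # 12
print(cycles_pipelined(4))    # 6
for row in pipeline_timeline(4):
    print(row)
# FDE...
# .FDE..
# ..FDE.
# ...FDE
```

Reading the timeline column by column shows the overlap: in the third cycle, one instruction is executing while the next is being decoded and a third is being fetched.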

 

GPU Core Structure

A GPU core is designed for massively parallel tasks — running thousands of very small, similar computations simultaneously.

Inside a GPU Core

While CPUs emphasize control, GPUs emphasize throughput. A GPU is built from hundreds or thousands of simpler cores, each capable of performing simple arithmetic quickly.

  1. Streaming Multiprocessors (SMs)
    Groups of small execution units. Each SM has:
    • Multiple ALUs (sometimes called CUDA cores or shader units)
    • A control unit shared among them
    • A small register file and shared memory
  2. ALUs Everywhere
    Each SM contains many ALUs, which allows vector or matrix operations on huge data sets (for example, every pixel in an image).
  3. Minimal Control Logic
    Because GPUs repeat the same instruction across large data sets, they use fewer, simpler control units. This saves space and power.
  4. Specialized Memory Hierarchy
    • Global memory (large, slower)
    • Shared memory within each multiprocessor (fast, for cooperation among cores)
    • Texture and constant memory optimized for graphics or AI workloads.
  5. Parallel Execution Model
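    GPUs execute the same operation on many pieces of data simultaneously, a model called SIMD (Single Instruction, Multiple Data). NVIDIA's documentation calls its thread-based variant SIMT (Single Instruction, Multiple Threads).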
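
The shared control unit and the SIMD model described above can be sketched in plain Python. This is a simulation only: Python runs the lanes one after another, and the lane width of 32, the function names, and the brightening operation are assumptions chosen for illustration. What the sketch does show accurately is how many instructions the control logic has to issue in each style.

```python
def scalar_run(data, op):
    """CPU style: the control unit issues one instruction per data element."""
    issued = 0
    out = []
    for x in data:
        issued += 1
        out.append(op(x))
    return out, issued


def simd_run(data, op, lanes=32):
    """GPU style: one issued instruction drives `lanes` ALUs in lockstep."""
    issued = 0
    out = []
    for start in range(0, len(data), lanes):
        group = data[start:start + lanes]
        issued += 1                       # a single instruction for the whole group
        out.extend(op(x) for x in group)  # conceptually simultaneous on real hardware
    return out, issued


def brighten(pixel):
    """Add 40 to a pixel value, clamped to the 8-bit maximum."""
    return min(pixel + 40, 255)


pixels = list(range(256))
scalar_out, scalar_issued = scalar_run(pixels, brighten)
simd_out, simd_issued = simd_run(pixels, brighten, lanes=32)

print(scalar_out == simd_out)  # True: same results either way
print(scalar_issued)           # 256
print(simd_issued)             # 8
```

Both versions compute identical results, but the SIMD version needs far fewer instruction issues, which is why a GPU can spend its chip area on ALUs rather than on control logic.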

 

Comparing CPU and GPU Cores

Feature          | CPU Core                      | GPU Core
Purpose          | General-purpose processing    | Highly parallel numerical computation
ALU Count        | Few, complex                  | Many, simple
Control Unit     | Sophisticated per core        | One per many cores
Memory Hierarchy | Large caches                  | Many small, fast local memories
Parallelism      | Dozens of threads             | Thousands of threads
Best for         | Logic-heavy, sequential tasks | Data-heavy, repetitive tasks (graphics, AI)
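
The "Best for" row can be illustrated with a rough back-of-the-envelope model. Every number below is hypothetical (8 fast cores versus 2048 cores that are each four times slower) and the model ignores memory transfer and scheduling costs; it is only meant to show why the same hardware wins on one kind of workload and loses on the other.

```python
import math


def finish_time(n_tasks, n_cores, time_per_task, independent):
    """Estimate total time units under a deliberately simple model."""
    if not independent:
        # Each step needs the previous result, so only one core can be busy.
        return n_tasks * time_per_task
    # Independent tasks are spread evenly across all cores.
    return math.ceil(n_tasks / n_cores) * time_per_task


# Hypothetical processors: values chosen for illustration, not measured.
CPU = {"n_cores": 8, "time_per_task": 1}
GPU = {"n_cores": 2048, "time_per_task": 4}

# A chain of 1,000 dependent steps (logic-heavy, sequential work)
cpu_chain = finish_time(1_000, independent=False, **CPU)
gpu_chain = finish_time(1_000, independent=False, **GPU)
print(cpu_chain, gpu_chain)  # 1000 4000

# 1,000,000 independent tasks (for example, one per pixel)
cpu_bulk = finish_time(1_000_000, independent=True, **CPU)
gpu_bulk = finish_time(1_000_000, independent=True, **GPU)
print(cpu_bulk, gpu_bulk)    # 125000 1956
```

Under these assumptions the CPU finishes the dependent chain four times sooner, while the GPU finishes the independent batch roughly 64 times sooner.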

 

In Summary

Inside every core — CPU or GPU — are the same basic building blocks: ALU, CU, registers, and buses.
What changes is the architecture philosophy:

  • CPUs are optimized for diverse instructions and complex control.
  • GPUs are optimized for massive data parallelism and arithmetic throughput.