Interpreters vs. Compilers – what really happens when code runs

This article is not assessed by the IB but may be helpful to deepen your understanding. Plus, I think it's cool.

Programming languages do not “run themselves”; they need some mechanism to turn human-readable source into machine instructions that a CPU can execute. Two classical mechanisms dominate:

| Mechanism | Core idea | Where execution decisions are made |
| --- | --- | --- |
| Interpreter | Reads the source (or an intermediate byte-code) statement by statement and performs the requested actions immediately. | Run time – each line or byte-code is decoded just before it executes. |
| Compiler | Translates the entire program into native machine code before any instruction executes. | Build time – the heavy lifting happens once, producing an executable file. |

Most modern runtimes blend the two (e.g., a byte-code interpreter that just-in-time compiles hot functions), but the pure forms illustrate the fundamental trade-offs.


1 Interpreter pipeline — step by step

  1. Lexing & parsing – convert source into a parse tree or byte-code.
  2. Dispatch loop – a small virtual machine fetches the next instruction, decodes it, and calls the routine that implements it (add, branch, print…).
  3. State updates – results are stored in an interpreter-managed stack or environment table.
  4. Repeat until no instructions remain.

The critical point is that each high-level instruction is decoded again every time it runs, so an instruction inside a loop pays the decoding cost on every iteration.
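The dispatch loop described above can be sketched in a few lines of Python. This is a toy virtual machine, not how any real interpreter is written: the opcode names (PUSH, ADD, PRINT) and the `run` function are invented for this illustration.

```python
# Minimal stack-based VM; the opcode names are made up for this sketch.
def run(program):
    stack = []    # interpreter-managed operand stack
    output = []   # collected instead of printed, so the result can be inspected
    pc = 0        # program counter: index of the next instruction
    while pc < len(program):
        op, arg = program[pc]        # fetch
        pc += 1
        if op == "PUSH":             # decode + dispatch
            stack.append(arg)
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "PRINT":
            output.append(stack.pop())
        else:
            raise ValueError(f"unknown opcode: {op}")
    return output


program = [
    ("PUSH", 4), ("PUSH", 7), ("ADD", None),
    ("PUSH", 2), ("ADD", None),
    ("PRINT", None),
]
print(run(program))  # [13]
```

Notice that the `if`/`elif` chain runs again for every instruction, even when the same opcode was seen a moment ago. That repeated work is the decoding cost.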


2 Compiler pipeline — step by step

  1. Front-end analysis – the compiler parses the whole program, builds an Abstract Syntax Tree, checks types, and reports errors.
  2. Optimisation – intermediate representation (IR) passes remove dead code, propagate constants, vectorise loops, etc.
  3. Code generation – IR is mapped to machine instructions for a specific CPU and operating-system ABI.
  4. Linking – object files and libraries are stitched together; the result is a native executable.

At run time the CPU reads already optimised instructions directly from memory; no further translation is needed.
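To give a feel for the optimisation step, here is a rough sketch of one classic pass, constant folding, written on top of Python's own `ast` module. It only stands in for what a real compiler does on its IR: the `ConstantFolder` class and the `fold` helper are names I made up, it handles integer +, - and * only, and it assumes Python 3.9 or later for `ast.unparse`.

```python
import ast
import operator

# Operators this sketch knows how to fold; anything else is left untouched.
_FOLDABLE = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class ConstantFolder(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the children first (bottom-up)
        fn = _FOLDABLE.get(type(node.op))
        if (fn is not None
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)):
            lv, rv = node.left.value, node.right.value
            if type(lv) is int and type(rv) is int:  # keep the sketch to plain ints
                return ast.copy_location(ast.Constant(value=fn(lv, rv)), node)
        return node


def fold(source):
    tree = ast.parse(source, mode="eval")   # front-end: source text -> AST
    tree = ConstantFolder().visit(tree)     # optimisation pass
    return ast.unparse(tree.body)           # back to text, just for display


print(fold("4 + 7 + 2"))    # 13
print(fold("x * (2 + 3)"))  # x * 5
```

The work of adding 4, 7 and 2 has moved from run time to "build time": the folded program contains only the answer.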


3 A tiny worked example – adding three integers

High-level source (same logic, two languages)

# Python (interpreted by CPython)
a, b, c = 4, 7, 2
total = a + b + c
print(total)

/* C (compiled with GCC or clang) */
#include <stdio.h>
int main(void) {
    int a = 4, b = 7, c = 2;
    int total = a + b + c;
    printf("%d\n", total);
    return 0;
}

What the interpreter does

| Step | Simplified view of the VM stack |
| --- | --- |
| 1. Load a, load b | Stack = [4, 7] |
| 2. BINARY_ADD | Pop 7 and 4 → push 11 → Stack = [11] |
| 3. Load c | Stack = [11, 2] |
| 4. BINARY_ADD | Pop 2 and 11 → push 13 → Stack = [13] |
| 5. Call print | Pop 13, call low-level I/O routine |

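Each BINARY_ADD (named BINARY_OP since CPython 3.11) is a byte-code fetched by the VM. For every such byte-code the real CPU runs the VM’s fetch, decode, and dispatch code, which typically costs many native instructions per high-level operation.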
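You can look at the real byte-code yourself with the standard-library `dis` module. The sketch below wraps the addition in a function (`add_three` is just a name chosen for this example) and collects the opcode names. The exact listing differs between CPython versions, so the code accepts either spelling of the add instruction.

```python
import dis


def add_three(a, b, c):
    return a + b + c


# Collect opcode names instead of relying on the printed listing,
# because its layout changes from one CPython version to the next.
opnames = [ins.opname for ins in dis.get_instructions(add_three)]

# Older CPython versions call the instruction BINARY_ADD, 3.11+ call it BINARY_OP.
adds = [name for name in opnames if name in ("BINARY_ADD", "BINARY_OP")]

print(opnames)
print(len(adds))
```

Calling `dis.dis(add_three)` prints a more readable listing of the same instructions.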


What the compiler produces (x86-64 excerpt)

mov    eax, 4        ; load a
add    eax, 7        ; a + b
add    eax, 2        ; (a + b) + c
mov    esi, eax      ; second printf argument: total
mov    edi, fmt_ptr  ; first printf argument: address of "%d\n"
xor    eax, eax      ; variadic call: no vector registers used
call   printf

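The two add instructions are executed directly by the processor, with no decode or dispatch layer in between. With optimisation enabled, a compiler would typically go further and fold 4 + 7 + 2 into the constant 13 at build time.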


4 Why the difference matters

| Dimension | Interpreter | Compiler |
| --- | --- | --- |
| Edit-run cycle | Near-instant (no separate build step). | Build can take seconds or minutes. |
| Start-up | No build wait, but the runtime must start and parse the source before the first statement runs. | Native binary usually loads quickly, although large binaries that use many shared libraries spend time on relocation and dynamic linking. |
| Peak speed | Limited by continual decode/dispatch overhead. | Can approach the limits of the hardware when the optimiser does its job well. |
| Portability | Ship one script; any machine with a matching interpreter can run it. | Need a separate binary per CPU/OS pair, or a cross-compiler. |
| Error time-line | Many errors appear only when the faulty line executes. | Most syntax and type errors caught before the program runs. |
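The error time-line is easy to see in Python itself. In the sketch below (the source strings and variable names are invented for the demonstration), a syntax error is reported as soon as the source is translated to byte-code, while a misspelled name goes unnoticed until the faulty line actually runs. Note the nuance: even an "interpreted" language like Python catches syntax errors before execution starts.

```python
# A syntax error is reported when the source is compiled to byte-code,
# before any statement has run.
bad_syntax = "print('start')\nif True print('oops')\n"
try:
    compile(bad_syntax, "<demo>", "exec")
    syntax_error_caught = False
except SyntaxError:
    syntax_error_caught = True

# A misspelled name is only detected when the faulty line executes.
bad_name = "events = ['start']\nevents.append(undefined_name)\n"
code = compile(bad_name, "<demo>", "exec")  # compiles without complaint
namespace = {}
try:
    exec(code, namespace)
    name_error_caught = False
except NameError:
    name_error_caught = True

# The first line of bad_name ran successfully before the second one failed.
print(syntax_error_caught, name_error_caught, namespace["events"])
# True True ['start']
```

A statically typed, compiled language would typically reject the second program at build time because `undefined_name` is never declared.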

A JIT-equipped runtime (Java, .NET, JavaScript on V8, PyPy) lands in between: it starts quickly from byte-code, then compiles the hot functions so they approach native performance after a warm-up period.
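The "interpret first, compile when hot" idea can be imitated in miniature. The sketch below is only an analogy: `HOT_THRESHOLD`, `TieredFunction`, and the tuple-based expression format are all invented here, and the "compiled" tier is an ordinary Python lambda standing in for native code, so it shows the control flow of a JIT rather than its speed.

```python
HOT_THRESHOLD = 3  # hypothetical: number of calls before a function counts as "hot"


def interpret(expr, env):
    """Slow path: walk the expression tree on every call."""
    if isinstance(expr, tuple):
        op, left, right = expr
        lhs, rhs = interpret(left, env), interpret(right, env)
        return lhs + rhs if op == "+" else lhs * rhs
    if isinstance(expr, str):
        return env[expr]   # variable lookup
    return expr            # literal number


def to_source(expr):
    """Translate the tree into Python source text (done once, when hot)."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return f"({to_source(left)} {op} {to_source(right)})"
    if isinstance(expr, str):
        return f"env[{expr!r}]"
    return repr(expr)


class TieredFunction:
    def __init__(self, expr):
        self.expr = expr
        self.calls = 0
        self.compiled = None  # filled in once the function becomes hot

    def __call__(self, env):
        if self.compiled is not None:
            return self.compiled(env)  # fast path after warm-up
        self.calls += 1
        if self.calls >= HOT_THRESHOLD:
            self.compiled = eval("lambda env: " + to_source(self.expr))
        return interpret(self.expr, env)


f = TieredFunction(("+", ("*", "x", 2), "y"))   # represents x * 2 + y
results = [f({"x": 3, "y": 1}) for _ in range(5)]
print(results, f.compiled is not None)  # [7, 7, 7, 7, 7] True
```

The first three calls go through the tree-walking interpreter; from the fourth call onward the pre-translated version is used, which mirrors the warm-up period mentioned above.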


5 Choosing the right approach

| Use-case | Best fit | Why |
| --- | --- | --- |
| Exploratory data analysis, small scripts | Interpreter | Iterate instantly, flexible REPL environment. |
| Real-time graphics engine | AOT compiler | Maximum throughput and stable frame-times. |
| Cross-platform mobile or desktop app | Byte-code + optional JIT | Ship once, run anywhere; performance improves with execution time. |

Core Concepts in Language Execution (Interpreters vs Compilers)

| Term | Definition | Key Technical Details | Common Student Misconception |
| --- | --- | --- | --- |
| Programming language execution | The process of transforming human-readable source code into machine-executable operations carried out by the CPU. | Requires at least one translation stage (interpretation, compilation, or both). CPUs do not understand high-level syntax directly. | Believing that CPUs “run” Python, Java, or C++ directly. |
| Interpreter | A runtime system that reads source code or byte-code and executes instructions one at a time during program execution. | Decoding and execution occur repeatedly inside a dispatch loop at run time. | Thinking an interpreter translates the whole program to machine code before execution. |
| Compiler | A system that translates an entire program into native machine code before execution begins. | Translation cost is paid once at build time; output is a standalone executable. | Assuming compiled programs are “not parsed” at all. |
| Hybrid runtime | A language implementation combining interpretation and compilation. | Often interprets byte-code first, then just-in-time (JIT) compiles hot paths. | Treating hybrid systems as either purely interpreted or compiled. |

Lexical and Syntactic Analysis

| Term | Definition | Key Technical Details | Common Student Misconception |
| --- | --- | --- | --- |
| Lexing (lexical analysis) | The process of converting a raw character stream into a sequence of tokens. | Tokens include identifiers, keywords, literals, operators, and delimiters. Whitespace and comments are typically discarded. | Confusing lexing with parsing. |
| Token | A classified unit of meaning produced by the lexer. | Example: int, x, =, 42, ; are distinct tokens. | Thinking tokens still contain grammar structure. |
| Parsing (syntactic analysis) | The process of analysing a token stream according to a formal grammar. | Produces a hierarchical structure representing grammatical relationships. | Assuming parsing checks types or semantics. |
| Parse tree (concrete syntax tree) | A tree representation that reflects the exact grammatical structure of the source code. | Includes every grammar rule and syntactic detail, including parentheses and punctuation. | Thinking parse trees are used directly for optimisation or execution. |
| Grammar | A formal specification (often context-free) describing valid language structure. | Typically written in BNF or EBNF form. | Believing grammar defines program meaning rather than structure. |

Backus–Naur Form (BNF)

BNF is a formal notation used to describe the syntax of a programming language as a set of recursive production rules.

A BNF grammar defines:

  • Non-terminals: abstract syntactic categories (e.g. <expression>, <statement>)
  • Terminals: literal symbols or tokens (e.g. +, if, identifier)
  • Productions: rules showing how non-terminals expand into sequences of terminals and/or non-terminals
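To make lexing, tokens, and BNF productions concrete, here is a small sketch of a lexer and a hand-written parser for a two-rule grammar. Everything in it is an assumption made for illustration: the token classes, the one-word keyword set, the grammar in the comment, and the function names `lex`, `parse_expr`, and `parse_term`. Real language front-ends are far larger.

```python
import re

KEYWORDS = {"int"}  # tiny keyword set, just enough for the example

# Token classes for this sketch, tried in order.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+*=;()]"),
    ("SKIP", r"\s+"),  # whitespace is recognised, then discarded
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def lex(text):
    """Turn a character stream into a flat list of (kind, text) tokens."""
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        kind, value = match.lastgroup, match.group()
        if kind == "IDENT" and value in KEYWORDS:
            kind = "KEYWORD"
        if kind != "SKIP":
            tokens.append((kind, value))
        pos = match.end()
    return tokens


# Grammar in BNF:
#   <expr> ::= <term> | <expr> "+" <term>
#   <term> ::= NUMBER | IDENT | "(" <expr> ")"
# The left-recursive <expr> rule is implemented as a loop.
def parse_expr(tokens, i=0):
    node, i = parse_term(tokens, i)
    while i < len(tokens) and tokens[i] == ("OP", "+"):
        right, i = parse_term(tokens, i + 1)
        node = ("+", node, right)  # left-associative
    return node, i


def parse_term(tokens, i):
    if i >= len(tokens):
        raise SyntaxError("unexpected end of input")
    kind, value = tokens[i]
    if kind in ("NUMBER", "IDENT"):
        return (kind, value), i + 1
    if (kind, value) == ("OP", "("):
        node, i = parse_expr(tokens, i + 1)
        if i >= len(tokens) or tokens[i] != ("OP", ")"):
            raise SyntaxError("expected ')'")
        return node, i + 1
    raise SyntaxError(f"unexpected token {value!r}")


print(lex("int x = 42;"))
tree, _ = parse_expr(lex("a + (b + 2)"))
print(tree)
```

The lexer output is flat (no structure), while the parser output is nested. That difference is exactly the lexing versus parsing distinction in the table.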

Abstract Representation and Analysis

| Term | Definition | Key Technical Details | Common Student Misconception |
| --- | --- | --- | --- |
| Abstract Syntax Tree (AST) | A simplified tree representation of program structure that omits unnecessary syntactic detail. | Preserves semantic meaning while discarding grammar artifacts like parentheses. | Confusing ASTs with parse trees. |
| Front-end analysis | The compiler phase that processes source code into an AST and validates correctness. | Includes lexing, parsing, name resolution, and type checking. | Thinking optimisation happens here. |
| Type checking | Verification that operations are applied to compatible data types. | Can be static (compile time) or dynamic (run time). | Assuming all languages enforce types at compile time. |
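Python exposes its own AST through the standard-library `ast` module, which makes the "parentheses disappear" point easy to check. In this short sketch the expression `(1 + 2) * 3` is my own example; the grouping survives only as the shape of the tree.

```python
import ast

tree = ast.parse("(1 + 2) * 3", mode="eval")
root = tree.body  # the expression node at the top of the tree

# The multiplication is the root; the addition is its left child.
# There is no node for the parentheses themselves.
print(type(root.op).__name__)       # Mult
print(type(root.left.op).__name__)  # Add
print(ast.dump(root))
```

The exact text printed by `ast.dump` varies a little between Python versions, but in each case the nesting of the two BinOp nodes carries the information the parentheses expressed in the source.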

Execution and Translation Pipelines

| Term | Definition | Key Technical Details | Common Student Misconception |
| --- | --- | --- | --- |
| Interpreter pipeline | The execution model used by interpreters to repeatedly decode and execute instructions. | Lex → parse (once, up front, in byte-code interpreters), then loop: dispatch → execute → update state → repeat. | Thinking the source text is re-parsed for every instruction; in a byte-code interpreter only fetch, decode, and dispatch repeat. |
| Dispatch loop | The core execution loop of an interpreter or virtual machine. | Fetches the next instruction, decodes it, and invokes the corresponding routine. | Confusing it with CPU instruction dispatch. |
| Runtime decoding cost | The overhead incurred by interpreters due to repeated instruction decoding. | Paid every iteration of loops and function calls. | Assuming interpreters are slow for all workloads. |
| Compiler pipeline | The staged process by which source code becomes native machine code. | Front-end → IR → optimisation → code generation → linking. | Thinking compilation skips intermediate representations. |

Intermediate and Machine-Level Concepts

| Term | Definition | Key Technical Details | Common Student Misconception |
| --- | --- | --- | --- |
| Intermediate Representation (IR) | A machine-independent code form used internally by compilers. | Enables optimisation and retargeting to different CPUs. | Believing IR is executed directly by hardware. |
| Optimisation | Transformations that improve performance or reduce resource usage without changing program behaviour. | Includes dead-code elimination, constant folding, and loop vectorisation. | Thinking optimisation changes program output. |
| Code generation | The process of converting IR into machine instructions. | Target-specific; respects CPU architecture and ABI. | Assuming one compiler output works on all systems. |
| Linking | The final build step that combines object files and libraries into an executable. | Resolves symbols and addresses across modules. | Confusing linking with loading. |
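As a rough picture of what an IR can look like, the sketch below lowers an arithmetic expression into "three-address code", a textbook IR style in which every instruction has at most one operator. The `lower_to_ir` function, the `t0`, `t1` temporary names, and the textual format are all invented for this example; production compilers use much richer IRs.

```python
import ast
import itertools


def lower_to_ir(source):
    """Lower an arithmetic expression to three-address instructions."""
    temps = (f"t{n}" for n in itertools.count())  # fresh temporaries: t0, t1, ...
    ir = []

    def lower(node):
        # Returns the name (or literal) that holds the value of this node.
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Mult)):
            left = lower(node.left)
            right = lower(node.right)
            dest = next(temps)
            symbol = "+" if isinstance(node.op, ast.Add) else "*"
            ir.append(f"{dest} = {left} {symbol} {right}")
            return dest
        raise NotImplementedError(ast.dump(node))  # outside this sketch's scope

    result = lower(ast.parse(source, mode="eval").body)
    ir.append(f"return {result}")
    return ir


for line in lower_to_ir("a + b + c"):
    print(line)
# t0 = a + b
# t1 = t0 + c
# return t1
```

Nothing in this output mentions a particular CPU, which is why the same IR can later be handed to different code generators.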

Take-away

  • Interpreters execute high-level constructs as they go: simpler to start and well suited to rapid iteration, but slower for long-running, compute-heavy work.
  • Compilers do the heavy work up front, paying translation cost once to deliver tight, predictable machine code.
  • Modern language runtimes often mix both ideas, compiling when it helps and interpreting when it keeps development nimble.

Understanding where translation effort lands—in the editor loop, at program start-up, or dynamically while the program runs—lets you pick the right tool chain for every project.