Hamming Distance

This article is not assessed by the IB but may be helpful to deepen your understanding. Plus, I think it's cool.

The Big Idea

Hamming distance is a measure of how many positions two strings of equal length differ in. This simple yet powerful idea is essential in error detection and correction, digital communication, and binary encoding schemes. One especially important application is in Gray code, a binary numeral system where successive values differ in only one bit—meaning their Hamming distance is exactly 1.

This minimal-distance transition is not an accident; Gray code was designed to reduce transmission errors and ambiguity in digital systems. Understanding Hamming distance is thus fundamental to understanding both error-resistant data encoding and specialized binary systems like Gray code.

 

What Is Hamming Distance?

Formally, the Hamming distance dH(x,y)d_H(x, y) between two binary strings xx and yy of equal length nn is:

dH(x,y)=i=1n[xiyi]d_H(x, y) = \sum_{i=1}^{n} [x_i \ne y_i]

That is, it counts the number of bit positions where the two strings differ.

 

Example: Basic Hamming Distance

Given:

  • x=101010x = 101010
  • y=100110y = 100110

Compare bit-by-bit:

PositionxyDifferent?
111No
200No
310Yes
401Yes
511No
600No

Hamming distance = 2 (positions 3 and 4 differ)

 

What Is Gray Code?

Gray code (or reflected binary code) is a binary numeral system where each successive number differs from the previous by exactly one bit. This is done to minimize Hamming distance between adjacent values—ensuring that only one bit changes at each step.

Why is this important?

In systems such as rotary encoders, digital sensors, and analog-to-digital converters, transitioning between values with minimal change is critical. If multiple bits changed at once (as in normal binary), partial updates could result in major errors. Gray code eliminates this by enforcing a Hamming distance of 1 between consecutive values.

 

Example: 3-bit Binary vs. Gray Code

DecimalBinaryGray CodeHamming Distance (Binary → Gray)
00000000
10010010
20100112
30110101
41001102
51011111
61101012
71111001

In Gray code, each successive value differs by a Hamming distance of 1:

  • 000 → 001 (distance 1)
  • 001 → 011 (distance 1)
  • 011 → 010 (distance 1)
  • 010 → 110 (distance 1)
  • and so on.

This consistency is what makes Gray code ideal for mechanical and electrical systems.

 

Relationship Between Gray Code and Hamming Distance

  1. Design principle: Gray code is specifically designed so that adjacent code words have a Hamming distance of 1.
  2. Error reduction: Reducing bit flips between states reduces the chance of transitional errors.
  3. Encoding/decoding:
    • To convert from binary to Gray:

      Grayi=BinaryiBinaryi+1\text{Gray}_i = \text{Binary}_i \oplus \text{Binary}_{i+1}

    • Decoding Gray to binary involves cumulative XOR operations.

 

Worked Example – IB Style

Command Term: Describe

Question:
Describe how the Hamming distance between successive values in Gray code improves reliability in a rotary encoder system.

Student Answer – Partially Correct:

"Gray code improves reliability because it has small differences between numbers."

Clarification:
The student identifies a key benefit, but lacks technical detail. A better answer includes Hamming distance and transition state clarity.

Better Answer:
Gray code ensures that each successive value differs by exactly one bit—meaning the Hamming distance is 1. This guarantees that even if a transition is read mid-change (e.g., due to electrical noise or mechanical lag), only one bit might be incorrect, reducing ambiguity and increasing robustness compared to standard binary, where multiple bits might change simultaneously.