B4.1.6 Explain the core principles of ADTs. (HL only)

B4.1.6 Explain the core principles of ADTs. 
• High-level description of data structures and their associated operations and purpose 
• The underlying mechanics of hash tables, including hashing functions, collision resolution strategies and load factors 
• The underlying mechanics of sets to store and manage data 
• HashMap and HashSet in Java; dict and set in Python

The big idea

An Abstract Data Type (ADT) defines what a data structure should do, not how it should do it.
It provides:

  • A high-level description of the structure.
  • The operations that can be performed.
  • A clear purpose for when and why to use it.

For example, a dictionary (Python) or HashMap (Java) is an ADT describing a mapping from keys to values. The ADT doesn’t say whether the mapping is implemented using a hash table, a balanced tree, or another method — just what the operations are (insert, lookup, delete, etc.).

 

1. High-level description of data structures

ADTs specify:

  • Data type (e.g., “collection of key–value pairs” for a map).
  • Operations (e.g., add, remove, contains, size).
  • Purpose (e.g., fast lookup, ordered storage, uniqueness).

Examples:

ADTPurposeCommon Operations
StackLast-in-first-out storagepush, pop, peek
QueueFirst-in-first-out storageenqueue, dequeue, peek
SetUnique unordered elementsadd, remove, contains
MapKey–value associationsput, get, remove

 

2. Underlying mechanics of hash tables

A hash table is a common implementation for sets and maps.
It stores elements in buckets indexed by a hash function applied to the key.

Hashing functions

  • Take a key (string, number, etc.) and return an integer (hash code).
  • Must distribute keys uniformly to avoid clustering.

Example in Python:

print(hash("apple"))  # Example output: -2667847081789992353

(Actual value varies by run.)

Collision resolution

When two keys hash to the same bucket:

  • Separate chaining: Store multiple items in a list at that bucket.
  • Open addressing: Find another empty bucket via probing (linear, quadratic, or double hashing).

Load factor

  • The ratio: number_of_elements / number_of_buckets
  • High load factor → more collisions → slower lookups.
  • Many hash tables resize (rehash) automatically when the load factor passes a threshold.

 

3. Underlying mechanics of sets

A set can be implemented on top of a hash table:

  • The keys are the set elements.
  • The values are irrelevant (or a constant placeholder).
  • Hashing ensures fast membership tests.

In Python:

s = {"apple", "banana", "cherry"}
print("apple" in s)  # True, O(1) average time

In Java:

Set<String> fruits = new HashSet<>();
fruits.add("apple");
System.out.println(fruits.contains("apple")); // true

 

4. HashMap / HashSet (Java) vs dict / set (Python)

FeatureJava HashMap / HashSetPython dict / set
Key–value mapHashMap<K,V>dict
Unique elementsHashSet<E>set
ImplementationHash tableHash table
Null keysOne null key allowed in HashMap; null in setsNone allowed as key or value
Thread safetyNot thread-safe by defaultNot thread-safe by default
Order guaranteeNo order guarantee (unless LinkedHashMap)Maintains insertion order (Python 3.7+)

 

5. Python examples

Dictionary (map-like ADT)

phone_book = {"Alice": "555-1234", "Bob": "555-5678"}
phone_book["Charlie"] = "555-9999"  # Insert
print(phone_book["Alice"])          # Lookup
del phone_book["Bob"]               # Delete

Set (set-like ADT)

unique_numbers = {1, 2, 3}
unique_numbers.add(4)
unique_numbers.remove(2)
print(3 in unique_numbers)  # Membership check

 

6. Why this matters

Understanding the core principles of ADTs means:

  • You can choose the right data structure for a task.
  • You can replace an implementation without breaking the rest of the program.
  • You can reason about performance from the ADT’s operations, without caring about low-level details.