B4.1.6 Explain the core principles of ADTs. (HL only)

B4.1.6 Explain the core principles of ADTs.

• High-level description of data structures and their associated operations and purpose

• The underlying mechanics of hash tables, including hashing functions, collision resolution strategies and load factors

• The underlying mechanics of sets to store and manage data

• HashMap and HashSet in Java; dict and set in Python

The big idea

An Abstract Data Type (ADT) defines what a data structure should do, not how it should do it.
It provides:

A high-level description of the structure.
The operations that can be performed.
A clear purpose for when and why to use it.

For example, a dictionary (Python) or HashMap (Java) is an ADT describing a mapping from keys to values. The ADT doesn’t say whether the mapping is implemented using a hash table, a balanced tree, or another method — just what the operations are (insert, lookup, delete, etc.).

1. High-level description of data structures

ADTs specify:

Data type (e.g., “collection of key–value pairs” for a map).
Operations (e.g., add, remove, contains, size).
Purpose (e.g., fast lookup, ordered storage, uniqueness).

Examples:

ADT	Purpose	Common Operations
Stack	Last-in-first-out storage	`push`, `pop`, `peek`
Queue	First-in-first-out storage	`enqueue`, `dequeue`, `peek`
Set	Unique unordered elements	`add`, `remove`, `contains`
Map	Key–value associations	`put`, `get`, `remove`

2. Underlying mechanics of hash tables

A hash table is a common implementation for sets and maps.
It stores elements in buckets indexed by a hash function applied to the key.

Hashing functions

Take a key (string, number, etc.) and return an integer (hash code).
Must distribute keys uniformly to avoid clustering.

Example in Python:

print(hash("apple"))  # Example output: -2667847081789992353

(Actual value varies by run.)

Collision resolution

When two keys hash to the same bucket:

Separate chaining: Store multiple items in a list at that bucket.
Open addressing: Find another empty bucket via probing (linear, quadratic, or double hashing).

Load factor

The ratio: number_of_elements / number_of_buckets
High load factor → more collisions → slower lookups.
Many hash tables resize (rehash) automatically when the load factor passes a threshold.

3. Underlying mechanics of sets

A set can be implemented on top of a hash table:

The keys are the set elements.
The values are irrelevant (or a constant placeholder).
Hashing ensures fast membership tests.

In Python:

s = {"apple", "banana", "cherry"}
print("apple" in s)  # True, O(1) average time

In Java:

Set<String> fruits = new HashSet<>();
fruits.add("apple");
System.out.println(fruits.contains("apple")); // true

4. HashMap / HashSet (Java) vs dict / set (Python)

Feature	Java HashMap / HashSet	Python dict / set
Key–value map	`HashMap<K,V>`	`dict`
Unique elements	`HashSet<E>`	`set`
Implementation	Hash table	Hash table
Null keys	One `null` key allowed in HashMap; `null` in sets	`None` allowed as key or value
Thread safety	Not thread-safe by default	Not thread-safe by default
Order guarantee	No order guarantee (unless `LinkedHashMap`)	Maintains insertion order (Python 3.7+)

5. Python examples

Dictionary (map-like ADT)

phone_book = {"Alice": "555-1234", "Bob": "555-5678"}
phone_book["Charlie"] = "555-9999"  # Insert
print(phone_book["Alice"])          # Lookup
del phone_book["Bob"]               # Delete

Set (set-like ADT)

unique_numbers = {1, 2, 3}
unique_numbers.add(4)
unique_numbers.remove(2)
print(3 in unique_numbers)  # Membership check

6. Why this matters

Understanding the core principles of ADTs means:

You can choose the right data structure for a task.
You can replace an implementation without breaking the rest of the program.
You can reason about performance from the ADT’s operations, without caring about low-level details.