B4.1.6 Explain the core principles of ADTs.
• High-level description of data structures and their associated operations and purpose
• The underlying mechanics of hash tables, including hashing functions, collision resolution strategies and load factors
• The underlying mechanics of sets to store and manage data
• HashMap and HashSet in Java; dict and set in Python
The big idea
An Abstract Data Type (ADT) defines what a data structure should do, not how it should do it.
It provides:
- A high-level description of the structure.
- The operations that can be performed.
- A clear purpose for when and why to use it.
For example, a dictionary (Python) or HashMap (Java) is an ADT describing a mapping from keys to values. The ADT doesn’t say whether the mapping is implemented using a hash table, a balanced tree, or another method — just what the operations are (insert, lookup, delete, etc.).
1. High-level description of data structures
ADTs specify:
- Data type (e.g., “collection of key–value pairs” for a map).
- Operations (e.g.,
add,remove,contains,size). - Purpose (e.g., fast lookup, ordered storage, uniqueness).
Examples:
| ADT | Purpose | Common Operations |
|---|---|---|
| Stack | Last-in-first-out storage | push, pop, peek |
| Queue | First-in-first-out storage | enqueue, dequeue, peek |
| Set | Unique unordered elements | add, remove, contains |
| Map | Key–value associations | put, get, remove |
2. Underlying mechanics of hash tables
A hash table is a common implementation for sets and maps.
It stores elements in buckets indexed by a hash function applied to the key.
Hashing functions
- Take a key (string, number, etc.) and return an integer (hash code).
- Must distribute keys uniformly to avoid clustering.
Example in Python:
print(hash("apple")) # Example output: -2667847081789992353
(Actual value varies by run.)
Collision resolution
When two keys hash to the same bucket:
- Separate chaining: Store multiple items in a list at that bucket.
- Open addressing: Find another empty bucket via probing (linear, quadratic, or double hashing).
Load factor
- The ratio:
number_of_elements / number_of_buckets - High load factor → more collisions → slower lookups.
- Many hash tables resize (rehash) automatically when the load factor passes a threshold.
3. Underlying mechanics of sets
A set can be implemented on top of a hash table:
- The keys are the set elements.
- The values are irrelevant (or a constant placeholder).
- Hashing ensures fast membership tests.
In Python:
s = {"apple", "banana", "cherry"}
print("apple" in s) # True, O(1) average time
In Java:
Set<String> fruits = new HashSet<>();
fruits.add("apple");
System.out.println(fruits.contains("apple")); // true
4. HashMap / HashSet (Java) vs dict / set (Python)
| Feature | Java HashMap / HashSet | Python dict / set |
|---|---|---|
| Key–value map | HashMap<K,V> | dict |
| Unique elements | HashSet<E> | set |
| Implementation | Hash table | Hash table |
| Null keys | One null key allowed in HashMap; null in sets | None allowed as key or value |
| Thread safety | Not thread-safe by default | Not thread-safe by default |
| Order guarantee | No order guarantee (unless LinkedHashMap) | Maintains insertion order (Python 3.7+) |
5. Python examples
Dictionary (map-like ADT)
phone_book = {"Alice": "555-1234", "Bob": "555-5678"}
phone_book["Charlie"] = "555-9999" # Insert
print(phone_book["Alice"]) # Lookup
del phone_book["Bob"] # Delete
Set (set-like ADT)
unique_numbers = {1, 2, 3}
unique_numbers.add(4)
unique_numbers.remove(2)
print(3 in unique_numbers) # Membership check
6. Why this matters
Understanding the core principles of ADTs means:
- You can choose the right data structure for a task.
- You can replace an implementation without breaking the rest of the program.
- You can reason about performance from the ADT’s operations, without caring about low-level details.