B2.4.2 Construct and trace algorithms to implement a linear search and a binary search for data retrieval.
• The differences in efficiency between different methods of linear and binary search
• Use of search technique based on efficiency requirements—for example, searching a database for a sorted/indexed list of names to find a phone number, versus searching by the number to identify the name
📚 You can find additional information in the course companion pages 380 to 385
The Big Idea
Searching is one of the most fundamental operations in computer science—used to retrieve information from collections such as lists, arrays, databases, and trees. Two of the most well-known algorithms are linear search and binary search. These differ in how they operate, what assumptions they make about data, and most importantly, how efficiently they scale as the size of data increases.
1. Linear Search
Definition
Linear search (also known as sequential search) checks each element one by one until the target is found or the end of the list is reached.
Time Complexity
- Best case:
O(1)— target is the first element. - Average case:
O(n/2)≈O(n) - Worst case:
O(n)— target is at the end or not in the list.
When to Use
- When the list is unsorted.
- When data is small or when sorting is too costly.
- When the cost of a miss is low.
Python Example
def linear_search(arr, target):
for index in range(len(arr)):
if arr[index] == target:
return index
return -1
Java Example
public static int linearSearch(int[] arr, int target) {
for (int i = 0; i < arr.length; i++) {
if (arr[i] == target) {
return i;
}
}
return -1;
}
2. Binary Search
Definition
Binary search works by repeatedly dividing the sorted list in half, comparing the middle value to the target, and narrowing the search to the correct half.
Time Complexity
- Best case:
O(1)— target is at the midpoint - Average case:
O(log n) - Worst case:
O(log n)
Requires the input array to be sorted in advance.
When to Use
- When the data is sorted or indexed.
- When the dataset is large and performance is critical.
- Common in systems with millions of records, such as databases, file systems, or search engines.
Python Example
def binary_search(arr, target):
low = 0
high = len(arr) - 1
while low <= high:
mid = (low + high) // 2
if arr[mid] == target:
return mid
elif arr[mid] < target:
low = mid + 1
else:
high = mid - 1
return -1
Java Example
public static int binarySearch(int[] arr, int target) {
int low = 0, high = arr.length - 1;
while (low <= high) {
int mid = (low + high) / 2;
if (arr[mid] == target)
return mid;
else if (arr[mid] < target)
low = mid + 1;
else
high = mid - 1;
}
return -1;
}
3. Tracing the Algorithms
Linear Search Trace Example
List: [4, 7, 2, 9, 5], Target: 9
| Step | Index Checked | Value | Match? |
|---|---|---|---|
| 1 | 0 | 4 | No |
| 2 | 1 | 7 | No |
| 3 | 2 | 2 | No |
| 4 | 3 | 9 | Yes |
Result: Found at index 3
Binary Search Trace Example
List: [1, 3, 5, 7, 9, 11, 13], Target: 9
| Step | low | high | mid | arr[mid] | Match? | Next step |
|---|---|---|---|---|---|---|
| 1 | 0 | 6 | 3 | 7 | No | 9 > 7 → search right half |
| 2 | 4 | 6 | 5 | 11 | No | 9 < 11 → search left half |
| 3 | 4 | 4 | 4 | 9 | Yes | Done |
Result: Found at index 4
4. Efficiency Comparison
| Criterion | Linear Search | Binary Search |
|---|---|---|
| Time complexity | O(n) | O(log n) |
| Data requirement | Any list | Must be sorted |
| Memory usage | Constant O(1) | Constant O(1) |
| Code simplicity | Very simple | Slightly more complex |
| Use case size | Small datasets | Large datasets |
| Predictability | Linear performance | Performance varies by position |
5. Choosing the Right Algorithm
Example Use Case 1: Searching a Phone Number by Name
- Data: Sorted list of names
- Goal: Find a name quickly
- Use: Binary search — fast lookup on sorted list
Example Use Case 2: Searching a Name by Phone Number (Unsorted)
- Data: No sort on phone numbers
- Goal: Identify caller
- Use: Linear search — no assumption about order
6. Summary
| Feature | Linear Search | Binary Search |
|---|---|---|
| Time complexity | O(n) | O(log n) |
| Sorting requirement | None | Must be pre-sorted |
| Typical use case | Small or unsorted datasets | Large and sorted datasets |
| Implementation | Simple | Requires careful logic |
| Traceable behavior | Predictable and exhaustive | Logarithmic narrowing of search space |
Final Thoughts
Choosing between linear and binary search is about balancing simplicity and performance. Linear search is easy to write and useful for small or unsorted data, while binary search is exponentially faster for large, sorted datasets. A solid grasp of both—along with how to trace and analyze them—equips you to build efficient and scalable search mechanisms in your programs.
import random
def generate_sorted_unique_list(size, lower_bound=0, upper_bound=10_000_000):
"""
Generate a sorted list of unique random integers.
:param size: Number of unique random integers to generate.
:param lower_bound: Minimum possible integer value.
:param upper_bound: Maximum possible integer value.
:return: Sorted list of unique integers.
"""
if size > (upper_bound - lower_bound + 1):
raise ValueError("Range too small for the requested number of unique values.")
numbers = random.sample(range(lower_bound, upper_bound + 1), size)
numbers.sort()
return numbers
if __name__ == "__main__":
SIZE = 4_000_000 # four million unique numbers
print("Generating list of unique random numbers...")
sorted_numbers = generate_sorted_unique_list(SIZE)
print("List generated and sorted!")
# Quick sanity check
print("First 10 numbers:", sorted_numbers[:10])
print("Last 10 numbers:", sorted_numbers[-10:])