A3.3.4 Construct calculations within a database using SQL’s aggregate functions. (HL only)
• Aggregate functions on grouped data to aid reporting and decision-making
• Aggregate commands: AVERAGE, COUNT, MAX, MIN, SUM
📚 You can find additional information related to these learning standards in the course companion pages 206 to 209.
The Big Idea
Relational databases are not just containers for storing data—they are analytical engines capable of summarizing, aggregating, and extracting insights directly from data through SQL aggregate functions. These functions operate on sets of rows rather than individual records, making them ideal for reporting, trend analysis, and decision support in large-scale systems.
In high-level (HL) database systems, aggregate functions are often paired with GROUP BY and HAVING clauses to enable multi-dimensional analysis across large datasets.
1. What Are Aggregate Functions?
Aggregate functions collapse a group of rows into a single value. They’re used to compute:
- Totals,
- Averages,
- Extremes (maximum, minimum),
- Counts.
These operations are non-scalar: they process a set of values to return a summary statistic.
2. Core SQL Aggregate Functions
| Function | Purpose | Example |
|---|---|---|
COUNT() | Number of rows | COUNT(*), COUNT(column_name) |
SUM() | Total of numeric values | SUM(order_amount) |
AVG() | Arithmetic mean | AVG(score) |
MAX() | Largest value in a column | MAX(price) |
MIN() | Smallest value in a column | MIN(salary) |
3. Basic Usage Without GROUP BY
Example: Total revenue from all orders
SELECT SUM(order_total) AS total_revenue
FROM orders;
Example: Average age of users
SELECT AVG(age) AS average_age
FROM users;
These queries return one row with an aggregate result across the entire table.
4. Grouped Aggregation Using GROUP BY
Use Case: Aggregate values per category or per entity.
SELECT department_id, AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id;
This groups employees by department_id and calculates the average salary per department.
Query Execution Logic
- Partition rows into groups based on the
GROUP BYkey. - Apply the aggregate function within each group.
- Return one row per group.
5. Filtering Grouped Data Using HAVING
WHERE filters rows before grouping; HAVING filters after aggregation.
Example: Show only departments where average salary exceeds 70,000
SELECT department_id, AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id
HAVING AVG(salary) > 70000;
This returns only groups that meet a condition after aggregation.
6. Combining Aggregates and JOINs
In real applications, aggregates are used across joined tables to perform analytical queries.
Example: Count of students per course
SELECT courses.course_name, COUNT(enrollments.student_id) AS total_students
FROM courses
JOIN enrollments ON courses.course_id = enrollments.course_id
GROUP BY courses.course_name;
7. Advanced Example: Multi-Aggregate Report
Suppose you manage a bookstore and want a report per author with:
- total number of books,
- highest price,
- lowest price,
- average price.
SELECT author_id,
COUNT(*) AS book_count,
MAX(price) AS highest_price,
MIN(price) AS lowest_price,
AVG(price) AS average_price
FROM books
GROUP BY author_id;
This is a multi-aggregate query, returning a rich summary per author.
8. Performance Considerations
- Aggregates on indexed columns can be optimized, especially for
MIN()andMAX(). - Grouping on high-cardinality fields (e.g.,
user_id) can be expensive. - Efficient use of composite indexes and query planners is critical for large data sets.
In analytical systems (OLAP), aggregates may be precomputed using materialized views or data cubes to improve query performance.
Summary
SQL’s aggregate functions—COUNT, SUM, AVG, MAX, MIN—allow you to build reporting queries directly in the database, without exporting data to external tools. When used with GROUP BY and HAVING, these functions power essential business intelligence, such as:
- Revenue reports
- Performance metrics
- Usage statistics
- Anomaly detection
For HL-level database work, understanding how to structure grouped aggregate queries—especially in multi-table joins or large datasets—is key to writing efficient, meaningful, and scalable SQL analytics.