COUNT, SUM, AVG, MIN, and MAX, then GROUP BY and HAVING to slice data any way you want.
Throughout this lesson we'll work with a single orders table that tracks purchases in an online store. Think of it like the receipt drawer at a shop: every sale gets a row.
Imagine you run a lemonade stand. Every time somebody buys something, you write it down on a piece of paper: who bought it, what they bought, how many, and how much they paid. At the end of the day you have a long list. That list is our orders table! Later in this lesson we'll learn how to answer questions like "How much money did I make today?" or "Which flavour sold the most?" from that one list.
Run the following SQL to create and populate the table:
```sql
-- Step 1: Create the orders table
CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    customer VARCHAR(100),
    product VARCHAR(100),
    category VARCHAR(50),
    amount DECIMAL(10,2),
    quantity INTEGER,
    order_date DATE
);

-- Step 2: Insert 18 sample orders
INSERT INTO orders (customer, product, category, amount, quantity, order_date) VALUES
('Alice', 'Laptop', 'Electronics', 1200.00, 1, '2024-01-15'),
('Alice', 'Wireless Mouse', 'Electronics', 25.99, 2, '2024-01-15'),
('Alice', 'Python Book', 'Books', 39.99, 1, '2024-02-10'),
('Bob', 'Headphones', 'Electronics', 79.99, 1, '2024-01-20'),
('Bob', 'Running Shoes', 'Clothing', 89.99, 1, '2024-02-05'),
('Bob', 'SQL Book', 'Books', 29.99, 1, '2024-02-14'),
('Bob', 'T-Shirt', 'Clothing', 19.99, 3, '2024-03-01'),
('Charlie', 'Keyboard', 'Electronics', 149.99, 1, '2024-01-25'),
('Charlie', 'Monitor', 'Electronics', 349.99, 2, '2024-02-18'),
('Charlie', 'Desk Lamp', 'Home', 45.00, 1, '2024-03-10'),
('Diana', 'Yoga Mat', 'Clothing', 29.99, 1, '2024-01-08'),
('Diana', 'Water Bottle', 'Home', 15.00, 4, '2024-02-22'),
('Diana', 'Backpack', 'Clothing', 59.99, 1, '2024-03-05'),
('Diana', 'Coffee Beans', 'Food', 14.99, 2, '2024-03-12'),
('Eve', 'Tablet', 'Electronics', 499.99, 1, '2024-02-01'),
('Eve', 'Phone Case', 'Electronics', 12.99, 2, '2024-02-01'),
('Eve', 'Cookbook', 'Books', NULL, 1, '2024-03-15'),
('Alice', 'Sneakers', 'Clothing', 75.00, 1, '2024-03-20');
```
Here's what our data looks like:
| id | customer | product | category | amount | qty | order_date |
|---|---|---|---|---|---|---|
| 1 | Alice | Laptop | Electronics | 1200.00 | 1 | 2024-01-15 |
| 2 | Alice | Wireless Mouse | Electronics | 25.99 | 2 | 2024-01-15 |
| 3 | Alice | Python Book | Books | 39.99 | 1 | 2024-02-10 |
| 4 | Bob | Headphones | Electronics | 79.99 | 1 | 2024-01-20 |
| 5 | Bob | Running Shoes | Clothing | 89.99 | 1 | 2024-02-05 |
| 6 | Bob | SQL Book | Books | 29.99 | 1 | 2024-02-14 |
| 7 | Bob | T-Shirt | Clothing | 19.99 | 3 | 2024-03-01 |
| 8 | Charlie | Keyboard | Electronics | 149.99 | 1 | 2024-01-25 |
| 9 | Charlie | Monitor | Electronics | 349.99 | 2 | 2024-02-18 |
| 10 | Charlie | Desk Lamp | Home | 45.00 | 1 | 2024-03-10 |
| 11 | Diana | Yoga Mat | Clothing | 29.99 | 1 | 2024-01-08 |
| 12 | Diana | Water Bottle | Home | 15.00 | 4 | 2024-02-22 |
| 13 | Diana | Backpack | Clothing | 59.99 | 1 | 2024-03-05 |
| 14 | Diana | Coffee Beans | Food | 14.99 | 2 | 2024-03-12 |
| 15 | Eve | Tablet | Electronics | 499.99 | 1 | 2024-02-01 |
| 16 | Eve | Phone Case | Electronics | 12.99 | 2 | 2024-02-01 |
| 17 | Eve | Cookbook | Books | NULL | 1 | 2024-03-15 |
| 18 | Alice | Sneakers | Clothing | 75.00 | 1 | 2024-03-20 |
Order #17 (the Cookbook) has a NULL amount: maybe it was a free promotional copy, or the price hasn't been entered yet. We deliberately included it because aggregate functions treat NULLs in surprising ways. Keep an eye on it throughout the lesson!
Your teacher asks: "How many students are in this class?" You answer "30". That's a summary: one number that describes the whole group. Then your teacher asks: "Tell me every student's name," and you have to read out 30 names. That's a list. Regular SELECT gives you the list. Aggregate functions give you the summary. They crunch many rows down into a single answer.
PostgreSQL has five core aggregate functions. Think of them as five super-powers for summarising data:
| Function | What It Does | Layman Translation |
|---|---|---|
| COUNT() | Counts rows | "How many?" |
| SUM() | Adds up values | "What's the total?" |
| AVG() | Calculates the mean | "What's the average?" |
| MIN() | Finds the smallest value | "What's the lowest?" |
| MAX() | Finds the largest value | "What's the highest?" |
Imagine a basket of apples. COUNT = "How many apples are in the basket?" SUM = "What's the total weight of all the apples?" AVG = "What's the average weight per apple?" MIN = "Which apple weighs the least?" MAX = "Which apple weighs the most?" One basket, five completely different questions, and aggregate functions answer all of them.
Let's start with the simplest one and build up.
COUNT is probably the aggregate function you'll use the most. It answers one simple question: "How many?"
The asterisk * means "everything." COUNT(*) counts every row in the result set, including rows with NULLs.
```sql
-- How many orders do we have in total?
SELECT COUNT(*) AS total_orders
FROM orders;
```
Result:
| total_orders |
|---|
| 18 |
When you give COUNT a specific column name, it counts only the rows where that column is not NULL.
```sql
-- How many orders have an amount recorded?
SELECT COUNT(amount) AS orders_with_amount
FROM orders;
```
Result:
| orders_with_amount |
|---|
| 17 |
We get 17 instead of 18 because order #17 (the Cookbook) has a NULL amount. COUNT(amount) silently skips it!
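Sometimes you want the opposite count: how many rows are *missing* a value? A quick sketch of two equivalent approaches, subtraction and PostgreSQL's FILTER clause:

```sql
-- Two ways to count the rows with a missing amount
SELECT
    COUNT(*) - COUNT(amount)               AS nulls_by_subtraction,
    COUNT(*) FILTER (WHERE amount IS NULL) AS nulls_by_filter
FROM orders;
-- Both columns return 1: only the Cookbook order has a NULL amount
```

FILTER works with any aggregate and reads more clearly once you have several conditional counts in one query.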
```sql
-- How many unique customers have placed orders?
SELECT COUNT(DISTINCT customer) AS unique_customers
FROM orders;
```
Result:
| unique_customers |
|---|
| 5 |
Even though there are 18 rows, only 5 different customers placed those orders: Alice, Bob, Charlie, Diana, and Eve.
```sql
-- How many different categories do we sell?
SELECT COUNT(DISTINCT category) AS category_count
FROM orders;
```
Result:
| category_count |
|---|
| 5 |
Five categories: Electronics, Books, Clothing, Home, and Food.
COUNT(*) counts all rows, including those with NULLs. COUNT(column) counts only rows where that column is not NULL. This difference trips up beginners constantly. When in doubt, use COUNT(*) to count rows, and COUNT(column) only when you intentionally want to skip NULLs.
```sql
-- See all three COUNTs side by side
SELECT
    COUNT(*) AS all_rows,
    COUNT(amount) AS rows_with_amount,
    COUNT(DISTINCT customer) AS unique_customers
FROM orders;
```
Result:
| all_rows | rows_with_amount | unique_customers |
|---|---|---|
| 18 | 17 | 5 |
You and your friends each bring some candy to a party. SUM answers: "If we dump all our candy into one pile, how many pieces do we have?" AVG answers: "If we split all the candy equally among everyone, how many pieces does each person get?" SUM = the whole pile. AVG = each person's fair share.
```sql
-- What is the total revenue from all orders?
SELECT SUM(amount) AS total_revenue
FROM orders;
```
Result:
| total_revenue |
|---|
| 2738.87 |
```sql
-- How many total items were ordered? (sum of all quantities)
SELECT SUM(quantity) AS total_items
FROM orders;
```
Result:
| total_items |
|---|
| 27 |
```sql
-- What is the average order amount?
SELECT AVG(amount) AS avg_order_value
FROM orders;
```
Result:
| avg_order_value |
|---|
| 161.11 |
The Cookbook order has NULL for amount. AVG silently skips it! It calculates 2738.87 ÷ 17 (not 18). This means AVG gives you the average of known values, not the average across all rows. If you want NULLs treated as 0, wrap the column: AVG(COALESCE(amount, 0)).
```sql
-- AVG with NULLs treated as 0
SELECT
    AVG(amount) AS avg_ignoring_nulls,
    AVG(COALESCE(amount, 0)) AS avg_treating_null_as_zero
FROM orders;
```
Result:
| avg_ignoring_nulls | avg_treating_null_as_zero |
|---|---|
| 161.11 | 152.16 |
See the difference? The second number is lower because it divides by 18 instead of 17.
AVG often produces long decimal numbers. Wrap it with ROUND() for clean reports: ROUND(AVG(amount), 2) gives you exactly 2 decimal places. Your boss will thank you when the report shows 161.11 instead of a long tail of decimal digits.
```sql
-- Clean, rounded averages
SELECT
    ROUND(AVG(amount), 2) AS avg_order,
    ROUND(AVG(quantity), 1) AS avg_items
FROM orders;
```
Result:
| avg_order | avg_items |
|---|---|
| 161.11 | 1.5 |
The gym teacher lines everyone up and asks: "Who is the tallest?" and "Who is the shortest?" That's MIN and MAX! They scan through all the values and pick out the absolute smallest and the absolute largest. And here's the cool part: they work on numbers, dates, and even text!
```sql
-- Cheapest and most expensive orders
SELECT
    MIN(amount) AS cheapest_order,
    MAX(amount) AS biggest_order
FROM orders;
```
Result:
| cheapest_order | biggest_order |
|---|---|
| 12.99 | 1200.00 |
Dates are compared chronologically; earlier dates are "smaller."
```sql
-- When was the first and last order placed?
SELECT
    MIN(order_date) AS first_order,
    MAX(order_date) AS latest_order
FROM orders;
```
Result:
| first_order | latest_order |
|---|---|
| 2024-01-08 | 2024-03-20 |
Text is compared alphabetically (A comes before Z, like a dictionary).
```sql
-- Alphabetically first and last customer names
SELECT
    MIN(customer) AS first_alphabetically,
    MAX(customer) AS last_alphabetically
FROM orders;
```
Result:
| first_alphabetically | last_alphabetically |
|---|---|
| Alice | Eve |
```sql
-- The complete summary dashboard
SELECT
    COUNT(*) AS total_orders,
    SUM(amount) AS total_revenue,
    ROUND(AVG(amount), 2) AS avg_order_value,
    MIN(amount) AS smallest_order,
    MAX(amount) AS largest_order
FROM orders;
```
Result:
| total_orders | total_revenue | avg_order_value | smallest_order | largest_order |
|---|---|---|---|---|
| 18 | 2738.87 | 161.11 | 12.99 | 1200.00 |
This is exactly what you'd see on a store manager's daily dashboard. Every morning they glance at one screen: "How many orders came in? What's total revenue? What's the average order? What's our cheapest sale? What was the biggest sale?" Five numbers. Five aggregate functions. One glance and you know how the business is doing.
You dump a big bag of M&Ms onto the table. Right now they're all mixed together. You separate them into piles: red pile, blue pile, green pile, yellow pile. Then you count each pile separately. "I have 12 reds, 8 blues, 15 greens, and 10 yellows." That's GROUP BY! You're splitting the data into groups and running an aggregate function on each group separately.
Without GROUP BY, aggregate functions look at all rows at once and return a single number. With GROUP BY, they look at each group independently and return one row per group.
```sql
-- How many orders in each category?
SELECT category, COUNT(*) AS order_count
FROM orders
GROUP BY category
ORDER BY order_count DESC;
```
Result:
| category | order_count |
|---|---|
| Electronics | 7 |
| Clothing | 5 |
| Books | 3 |
| Home | 2 |
| Food | 1 |
PostgreSQL split the 18 rows into 5 groups (one per category), counted each pile, and gave us a nice summary.
```sql
-- Total spending in each category
SELECT
    category,
    SUM(amount) AS total_revenue,
    ROUND(AVG(amount), 2) AS avg_order_value,
    COUNT(*) AS order_count
FROM orders
GROUP BY category
ORDER BY total_revenue DESC;
```
Result:
| category | total_revenue | avg_order_value | order_count |
|---|---|---|---|
| Electronics | 2318.94 | 331.28 | 7 |
| Clothing | 274.96 | 54.99 | 5 |
| Books | 69.98 | 34.99 | 3 |
| Home | 60.00 | 30.00 | 2 |
| Food | 14.99 | 14.99 | 1 |
```sql
-- How much has each customer spent?
SELECT
    customer,
    COUNT(*) AS total_orders,
    SUM(amount) AS total_spent,
    ROUND(AVG(amount), 2) AS avg_order
FROM orders
GROUP BY customer
ORDER BY total_spent DESC;
```
Result:
| customer | total_orders | total_spent | avg_order |
|---|---|---|---|
| Alice | 4 | 1340.98 | 335.25 |
| Charlie | 3 | 544.98 | 181.66 |
| Eve | 3 | 512.98 | 256.49 |
| Bob | 4 | 219.96 | 54.99 |
| Diana | 4 | 119.97 | 29.99 |
```sql
-- Monthly order summary
SELECT
    TO_CHAR(order_date, 'YYYY-MM') AS month,
    COUNT(*) AS order_count,
    SUM(amount) AS revenue
FROM orders
GROUP BY TO_CHAR(order_date, 'YYYY-MM')
ORDER BY month;
```
Result:
| month | order_count | revenue |
|---|---|---|
| 2024-01 | 5 | 1485.96 |
| 2024-02 | 7 | 1037.94 |
| 2024-03 | 6 | 214.97 |
Every column in your SELECT must either be inside an aggregate function (COUNT, SUM, AVG, etc.) or listed in GROUP BY. If you write SELECT customer, product, COUNT(*) FROM orders GROUP BY customer, PostgreSQL will throw an error because product is neither aggregated nor grouped. Think of it this way: if there are 4 orders for Alice, which one product should PostgreSQL show? It doesn't know, so it complains.
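To see the rule in action, here's the failing query from the tip above, plus two sketched fixes (either group by the extra column, or aggregate it):

```sql
-- WRONG: product is neither aggregated nor in GROUP BY
-- ERROR: column "orders.product" must appear in the GROUP BY clause
--        or be used in an aggregate function
SELECT customer, product, COUNT(*)
FROM orders
GROUP BY customer;

-- Fix 1: group by both columns (one row per customer + product pair)
SELECT customer, product, COUNT(*)
FROM orders
GROUP BY customer, product;

-- Fix 2: aggregate the product column instead of selecting it raw
SELECT customer, COUNT(DISTINCT product) AS distinct_products
FROM orders
GROUP BY customer;
```

Which fix is right depends on the question you're asking: per-pair detail, or a per-customer summary.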
```sql
-- Spending per customer per category
SELECT
    customer,
    category,
    COUNT(*) AS order_count,
    SUM(amount) AS total_spent
FROM orders
GROUP BY customer, category
ORDER BY customer, total_spent DESC;
```
Result (first rows):
| customer | category | order_count | total_spent |
|---|---|---|---|
| Alice | Electronics | 2 | 1225.99 |
| Alice | Clothing | 1 | 75.00 |
| Alice | Books | 1 | 39.99 |
| Bob | Clothing | 2 | 109.98 |
| Bob | Electronics | 1 | 79.99 |
| ... more rows ... | ... | ... | ... |
Grouping by multiple columns is like a school report card. You don't just want "average grade across all subjects"; you want it broken down by student AND subject. "Alice got 92 in Math, 88 in English. Bob got 75 in Math, 91 in English." Each unique combination of student + subject gets its own row.
You're the teacher dividing students into study groups. WHERE is like saying "Only students taller than 5 feet can participate": you filter individual people before forming the groups. HAVING is like saying "After the groups are formed, only keep groups with more than 3 members": you filter entire groups after they've been created. WHERE filters rows. HAVING filters groups.
```sql
-- Which customers have placed 4 or more orders?
SELECT customer, COUNT(*) AS order_count
FROM orders
GROUP BY customer
HAVING COUNT(*) >= 4
ORDER BY order_count DESC;
```
Result:
| customer | order_count |
|---|---|
| Alice | 4 |
| Bob | 4 |
| Diana | 4 |
Charlie and Eve only have 3 orders each, so HAVING filtered them out.
```sql
-- Which categories generated more than $100 total?
SELECT category, SUM(amount) AS total_revenue
FROM orders
GROUP BY category
HAVING SUM(amount) > 100
ORDER BY total_revenue DESC;
```
Result:
| category | total_revenue |
|---|---|
| Electronics | 2318.94 |
| Clothing | 274.96 |
Books ($69.98), Home ($60.00), and Food ($14.99) all fell below the $100 threshold.
This is where the real power lives: filter rows first, then filter the resulting groups.
```sql
-- Among orders placed in 2024 Q1 (Jan-Mar),
-- which customers spent more than $200 total?
SELECT customer, SUM(amount) AS q1_spending
FROM orders
WHERE order_date BETWEEN '2024-01-01' AND '2024-03-31'
GROUP BY customer
HAVING SUM(amount) > 200
ORDER BY q1_spending DESC;
```
Result:
| customer | q1_spending |
|---|---|
| Alice | 1340.98 |
| Charlie | 544.98 |
| Eve | 512.98 |
| Bob | 219.96 |
This is wrong and will throw an error:
```sql
-- WRONG: you can't use WHERE with aggregate functions!
SELECT customer, COUNT(*)
FROM orders
WHERE COUNT(*) >= 4   -- ERROR!
GROUP BY customer;
```

```sql
-- CORRECT: use HAVING for aggregate conditions
SELECT customer, COUNT(*)
FROM orders
GROUP BY customer
HAVING COUNT(*) >= 4;  -- Works!
```
Remember: WHERE runs before grouping; at that point the groups don't exist yet, so there's nothing to count. HAVING runs after grouping, when the counts are available.
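A handy mental model is the logical order in which the clauses are processed. Here's a made-up query (the date filter and LIMIT are just for illustration) annotated with that order:

```sql
SELECT customer, COUNT(*) AS order_count   -- 5. compute the output columns
FROM orders                                -- 1. pick the source table
WHERE order_date >= '2024-01-01'           -- 2. filter individual rows
GROUP BY customer                          -- 3. form the groups
HAVING COUNT(*) >= 4                       -- 4. filter whole groups
ORDER BY order_count DESC                  -- 6. sort the result
LIMIT 3;                                   -- 7. keep only the first rows
```

The written order (SELECT first) differs from the logical order (FROM first), which is exactly why WHERE can't see aggregates but HAVING can.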
We touched on COUNT(DISTINCT ...) earlier, but let's explore DISTINCT more fully inside aggregate functions.
```sql
-- List all unique categories
SELECT DISTINCT category
FROM orders
ORDER BY category;
```
Result:
| category |
|---|
| Books |
| Clothing |
| Electronics |
| Food |
| Home |
```sql
-- For each category: how many unique customers ordered?
SELECT
    category,
    COUNT(*) AS total_orders,
    COUNT(DISTINCT customer) AS unique_customers
FROM orders
GROUP BY category
ORDER BY unique_customers DESC;
```
Result:
| category | total_orders | unique_customers |
|---|---|---|
| Electronics | 7 | 4 |
| Clothing | 5 | 3 |
| Books | 3 | 3 |
| Home | 2 | 2 |
| Food | 1 | 1 |
Electronics has 7 orders but only 4 unique buyers, so some customers ordered electronics more than once.
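We can check who those repeat buyers are with a query that combines two ideas from this lesson, multi-column GROUP BY and HAVING:

```sql
-- Which customer + category pairs have more than one order?
SELECT customer, category, COUNT(*) AS orders_in_category
FROM orders
GROUP BY customer, category
HAVING COUNT(*) > 1
ORDER BY customer;
-- Returns: Alice, Charlie, and Eve each with 2 Electronics orders,
-- and Bob with 2 Clothing orders
```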
You can use DISTINCT with SUM too: SUM(DISTINCT amount) would add up only unique amounts, skipping duplicates. This is rarely useful in practice, but it's good to know it exists. The most common combo by far is COUNT(DISTINCT column).
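Our data happens to contain one duplicated amount (29.99 appears on both the SQL Book and the Yoga Mat), so you can see the effect directly:

```sql
-- SUM vs SUM(DISTINCT): the duplicate 29.99 is counted only once
SELECT
    SUM(amount)          AS sum_all,       -- 2738.87
    SUM(DISTINCT amount) AS sum_distinct   -- 2708.88 (one 29.99 dropped)
FROM orders;
```

This is also why SUM(DISTINCT ...) is rarely what you want for revenue: two different orders that happen to cost the same are both real money.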
Now let's put all the pieces together. We'll build a real business question step by step.
"Find the top 3 customers by total spending, who have ordered more than 3 times, sorted by total amount descending."
```sql
-- Step 1: group by customer and aggregate
SELECT customer, COUNT(*) AS total_orders, SUM(amount) AS total_spent
FROM orders
GROUP BY customer;
```
```sql
-- Step 2: keep only customers with more than 3 orders
SELECT customer, COUNT(*) AS total_orders, SUM(amount) AS total_spent
FROM orders
GROUP BY customer
HAVING COUNT(*) > 3;
```
```sql
-- Step 3: sort by spending and take the top 3
SELECT customer, COUNT(*) AS total_orders, SUM(amount) AS total_spent
FROM orders
GROUP BY customer
HAVING COUNT(*) > 3
ORDER BY total_spent DESC
LIMIT 3;
```
Final Result:
| customer | total_orders | total_spent |
|---|---|---|
| Alice | 4 | 1340.98 |
| Bob | 4 | 219.96 |
| Diana | 4 | 119.97 |
This is how a marketing team finds their VIP customers. They don't just want the biggest spenders โ they want people who are consistently buying (more than 3 orders) AND spending the most. That's how you decide who gets the loyalty discount or the thank-you email.
Let's solve some questions that come up in real businesses every day. These are the kinds of queries you'll write in your job.
```sql
-- Monthly sales report
SELECT
    TO_CHAR(order_date, 'Month YYYY') AS month_name,
    COUNT(*) AS orders,
    SUM(quantity) AS items_sold,
    SUM(amount) AS revenue,
    ROUND(AVG(amount), 2) AS avg_order_value
FROM orders
WHERE amount IS NOT NULL
GROUP BY TO_CHAR(order_date, 'Month YYYY'), DATE_TRUNC('month', order_date)
ORDER BY DATE_TRUNC('month', order_date);
```
Result:
| month_name | orders | items_sold | revenue | avg_order_value |
|---|---|---|---|---|
| January 2024 | 5 | 6 | 1485.96 | 297.19 |
| February 2024 | 7 | 12 | 1037.94 | 148.28 |
| March 2024 | 5 | 8 | 214.97 | 42.99 |
```sql
-- Which products sell the most units?
SELECT
    product,
    SUM(quantity) AS total_units_sold,
    SUM(amount) AS total_revenue
FROM orders
WHERE amount IS NOT NULL
GROUP BY product
ORDER BY total_units_sold DESC
LIMIT 5;
```
Result:
| product | total_units_sold | total_revenue |
|---|---|---|
| Water Bottle | 4 | 15.00 |
| T-Shirt | 3 | 19.99 |
| Wireless Mouse | 2 | 25.99 |
| Monitor | 2 | 349.99 |
| Phone Case | 2 | 12.99 |

One caveat: Coffee Beans also sold 2 units, so four products are tied at 2 and which three fill the last rows is arbitrary. Add a tie-breaker (for example, ORDER BY total_units_sold DESC, total_revenue DESC) if you need a stable result.
```sql
-- Full customer profile: spending, frequency, date range
SELECT
    customer,
    COUNT(*) AS total_orders,
    SUM(amount) AS lifetime_value,
    ROUND(AVG(amount), 2) AS avg_order_value,
    MIN(order_date) AS first_order,
    MAX(order_date) AS last_order,
    MAX(order_date) - MIN(order_date) AS days_active,
    COUNT(DISTINCT category) AS categories_shopped
FROM orders
GROUP BY customer
ORDER BY lifetime_value DESC;
```
Result:
| customer | total_orders | lifetime_value | avg_order_value | first_order | last_order | days_active | categories_shopped |
|---|---|---|---|---|---|---|---|
| Alice | 4 | 1340.98 | 335.25 | 2024-01-15 | 2024-03-20 | 65 | 3 |
| Charlie | 3 | 544.98 | 181.66 | 2024-01-25 | 2024-03-10 | 45 | 2 |
| Eve | 3 | 512.98 | 256.49 | 2024-02-01 | 2024-03-15 | 43 | 2 |
| Bob | 4 | 219.96 | 54.99 | 2024-01-20 | 2024-03-01 | 41 | 3 |
| Diana | 4 | 119.97 | 29.99 | 2024-01-08 | 2024-03-12 | 64 | 3 |
This CLV query is exactly what companies like Amazon, Netflix, and Spotify run every day. They want to know: "Which customers are most valuable? Who buys frequently? Who shops across many categories?" This single query gives you a full customer profile you could put on a dashboard.
Time to practice! Solve these yourself before peeking at the answers. Use the orders table from the beginning.
Write a query that returns the total revenue (sum of all amounts) and the total number of items sold (sum of all quantities).
```sql
SELECT
    SUM(amount) AS total_revenue,
    SUM(quantity) AS total_items_sold
FROM orders;
```
Returns: total_revenue = 2738.87, total_items_sold = 27
For each category, show the number of orders, total revenue, and average order value (rounded to 2 decimal places). Sort by total revenue descending.
```sql
SELECT
    category,
    COUNT(*) AS order_count,
    SUM(amount) AS total_revenue,
    ROUND(AVG(amount), 2) AS avg_order_value
FROM orders
GROUP BY category
ORDER BY total_revenue DESC;
```
Find customers who have spent more than $500 in total. Show their name, total orders, and total spent.
```sql
SELECT
    customer,
    COUNT(*) AS total_orders,
    SUM(amount) AS total_spent
FROM orders
GROUP BY customer
HAVING SUM(amount) > 500
ORDER BY total_spent DESC;
```
Returns: Alice ($1340.98), Charlie ($544.98), Eve ($512.98)
For each category, count the number of unique customers who ordered. Only include categories with at least 2 unique buyers.
```sql
SELECT
    category,
    COUNT(DISTINCT customer) AS unique_buyers
FROM orders
GROUP BY category
HAVING COUNT(DISTINCT customer) >= 2
ORDER BY unique_buyers DESC;
```
Returns: Electronics (4), Clothing (3), Books (3), Home (2)
Show the total revenue per customer for orders placed only in February 2024. Sort by revenue descending.
```sql
SELECT customer, SUM(amount) AS feb_revenue
FROM orders
WHERE order_date BETWEEN '2024-02-01' AND '2024-02-29'
GROUP BY customer
ORDER BY feb_revenue DESC;
```
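BETWEEN with hard-coded boundaries works here because 2024 is a leap year, but it's easy to get a month's last day wrong. A sketch of a boundary-safe alternative using a half-open date range:

```sql
-- Same February report without hard-coding the month's last day
SELECT customer, SUM(amount) AS feb_revenue
FROM orders
WHERE order_date >= DATE '2024-02-01'
  AND order_date <  DATE '2024-03-01'   -- exclusive upper bound
GROUP BY customer
ORDER BY feb_revenue DESC;
```

The `>= first day, < first day of next month` pattern works for every month and also handles timestamp columns correctly, where BETWEEN would miss rows after midnight on the last day.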
Create a comprehensive category report showing: category name, total orders, total revenue, average order value (rounded), min order, max order, and number of unique customers. Only include categories with total revenue over $50. Sort by revenue descending.
```sql
SELECT
    category,
    COUNT(*) AS total_orders,
    SUM(amount) AS total_revenue,
    ROUND(AVG(amount), 2) AS avg_order,
    MIN(amount) AS min_order,
    MAX(amount) AS max_order,
    COUNT(DISTINCT customer) AS unique_customers
FROM orders
GROUP BY category
HAVING SUM(amount) > 50
ORDER BY total_revenue DESC;
```
Quick review questions before we wrap up:

- What's the difference between COUNT(*) and COUNT(column)?
- What does AVG do with NULL values?
- What would COUNT(DISTINCT category) return for our orders table?

You've just unlocked one of the most important skill sets in SQL: the ability to summarise data instead of just listing it. Here's what you conquered:
| Concept | What It Does | Key Syntax |
|---|---|---|
| COUNT | Counts rows | COUNT(*), COUNT(col), COUNT(DISTINCT col) |
| SUM | Adds up values | SUM(amount) |
| AVG | Calculates the mean | ROUND(AVG(amount), 2) |
| MIN / MAX | Finds extremes | MIN(date), MAX(price) |
| GROUP BY | Splits into groups | GROUP BY category |
| HAVING | Filters groups | HAVING COUNT(*) > 3 |
| DISTINCT | Removes duplicates | COUNT(DISTINCT customer) |
So far, all our data has lived in one table. But real databases have many tables: customers in one, orders in another, products in a third. In the next module you'll learn how to combine tables using JOINs. It's like connecting puzzle pieces to see the full picture. Imagine being able to say: "Show me each customer's name, their order details, AND the product reviews, all from three different tables in one query." That's the power of JOINs!