COUNT, SUM, AVG, MIN, and MAX, then GROUP BY and HAVING to slice data any way you want.
Throughout this lesson we'll work with a single orders table that tracks purchases in an online store. Think of it like the receipt drawer at a shop: every sale gets a row.
Imagine you run a lemonade stand. Every time somebody buys something, you write it down on a piece of paper: who bought it, what they bought, how many, and how much they paid. At the end of the day you have a long list. That list is our orders table! Later in this lesson we'll learn how to answer questions like "How much money did I make today?" or "Which flavour sold the most?" from that one list.
Run the following SQL to create and populate the table:
```sql
-- Step 1: Create the orders table
CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    customer VARCHAR(100),
    product VARCHAR(100),
    category VARCHAR(50),
    amount DECIMAL(10,2),
    quantity INTEGER,
    order_date DATE
);

-- Step 2: Insert 18 sample orders
INSERT INTO orders (customer, product, category, amount, quantity, order_date) VALUES
('Alice', 'Laptop', 'Electronics', 1200.00, 1, '2024-01-15'),
('Alice', 'Wireless Mouse', 'Electronics', 25.99, 2, '2024-01-15'),
('Alice', 'Python Book', 'Books', 39.99, 1, '2024-02-10'),
('Bob', 'Headphones', 'Electronics', 79.99, 1, '2024-01-20'),
('Bob', 'Running Shoes', 'Clothing', 89.99, 1, '2024-02-05'),
('Bob', 'SQL Book', 'Books', 29.99, 1, '2024-02-14'),
('Bob', 'T-Shirt', 'Clothing', 19.99, 3, '2024-03-01'),
('Charlie', 'Keyboard', 'Electronics', 149.99, 1, '2024-01-25'),
('Charlie', 'Monitor', 'Electronics', 349.99, 2, '2024-02-18'),
('Charlie', 'Desk Lamp', 'Home', 45.00, 1, '2024-03-10'),
('Diana', 'Yoga Mat', 'Clothing', 29.99, 1, '2024-01-08'),
('Diana', 'Water Bottle', 'Home', 15.00, 4, '2024-02-22'),
('Diana', 'Backpack', 'Clothing', 59.99, 1, '2024-03-05'),
('Diana', 'Coffee Beans', 'Food', 14.99, 2, '2024-03-12'),
('Eve', 'Tablet', 'Electronics', 499.99, 1, '2024-02-01'),
('Eve', 'Phone Case', 'Electronics', 12.99, 2, '2024-02-01'),
('Eve', 'Cookbook', 'Books', NULL, 1, '2024-03-15'),
('Alice', 'Sneakers', 'Clothing', 75.00, 1, '2024-03-20');
```
Here's what our data looks like:
| id | customer | product | category | amount | qty | order_date |
|---|---|---|---|---|---|---|
| 1 | Alice | Laptop | Electronics | 1200.00 | 1 | 2024-01-15 |
| 2 | Alice | Wireless Mouse | Electronics | 25.99 | 2 | 2024-01-15 |
| 3 | Alice | Python Book | Books | 39.99 | 1 | 2024-02-10 |
| 4 | Bob | Headphones | Electronics | 79.99 | 1 | 2024-01-20 |
| 5 | Bob | Running Shoes | Clothing | 89.99 | 1 | 2024-02-05 |
| 6 | Bob | SQL Book | Books | 29.99 | 1 | 2024-02-14 |
| 7 | Bob | T-Shirt | Clothing | 19.99 | 3 | 2024-03-01 |
| 8 | Charlie | Keyboard | Electronics | 149.99 | 1 | 2024-01-25 |
| 9 | Charlie | Monitor | Electronics | 349.99 | 2 | 2024-02-18 |
| 10 | Charlie | Desk Lamp | Home | 45.00 | 1 | 2024-03-10 |
| 11 | Diana | Yoga Mat | Clothing | 29.99 | 1 | 2024-01-08 |
| 12 | Diana | Water Bottle | Home | 15.00 | 4 | 2024-02-22 |
| 13 | Diana | Backpack | Clothing | 59.99 | 1 | 2024-03-05 |
| 14 | Diana | Coffee Beans | Food | 14.99 | 2 | 2024-03-12 |
| 15 | Eve | Tablet | Electronics | 499.99 | 1 | 2024-02-01 |
| 16 | Eve | Phone Case | Electronics | 12.99 | 2 | 2024-02-01 |
| 17 | Eve | Cookbook | Books | NULL | 1 | 2024-03-15 |
| 18 | Alice | Sneakers | Clothing | 75.00 | 1 | 2024-03-20 |
Order #17 (the Cookbook) has a NULL amount: maybe it was a free promotional copy, or the price hasn't been entered yet. We deliberately included it because aggregate functions treat NULLs in surprising ways. Keep an eye on it throughout the lesson!
Your teacher asks: "How many students are in this class?" You answer "30". That's a summary: one number that describes the whole group. Then your teacher asks: "Tell me every student's name," and you have to read out 30 names. That's a list. Regular SELECT gives you the list. Aggregate functions give you the summary. They crunch many rows down into a single answer.
PostgreSQL has five core aggregate functions. Think of them as five super-powers for summarising data:
| Function | What It Does | Layman Translation |
|---|---|---|
| COUNT() | Counts rows | "How many?" |
| SUM() | Adds up values | "What's the total?" |
| AVG() | Calculates the mean | "What's the average?" |
| MIN() | Finds the smallest value | "What's the lowest?" |
| MAX() | Finds the largest value | "What's the highest?" |
Imagine a basket of apples. COUNT = "How many apples are in the basket?" SUM = "What's the total weight of all the apples?" AVG = "What's the average weight per apple?" MIN = "Which apple weighs the least?" MAX = "Which apple weighs the most?" One basket, five completely different questions, and aggregate functions answer all of them.
Let's start with the simplest one and build up.
COUNT is probably the aggregate function you'll use the most. It answers one simple question: "How many?"
The asterisk * means "everything." COUNT(*) counts every row in the result set, including rows with NULLs.
```sql
-- How many orders do we have in total?
SELECT COUNT(*) AS total_orders
FROM orders;
```
Result:
| total_orders |
|---|
| 18 |
When you give COUNT a specific column name, it counts only the rows where that column is not NULL.
```sql
-- How many orders have an amount recorded?
SELECT COUNT(amount) AS orders_with_amount
FROM orders;
```
Result:
| orders_with_amount |
|---|
| 17 |
We get 17 instead of 18 because order #17 (the Cookbook) has a NULL amount. COUNT(amount) silently skips it!
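Sometimes you want the opposite count: how many rows are *missing* a value? A quick sketch of two equivalent approaches, subtraction and PostgreSQL's FILTER clause:

```sql
-- Two ways to count the rows with a missing amount
SELECT
    COUNT(*) - COUNT(amount)               AS nulls_by_subtraction,
    COUNT(*) FILTER (WHERE amount IS NULL) AS nulls_by_filter
FROM orders;
-- Both columns return 1: only the Cookbook order has a NULL amount
```

FILTER works with any aggregate and reads more clearly once you have several conditional counts in one query.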
```sql
-- How many unique customers have placed orders?
SELECT COUNT(DISTINCT customer) AS unique_customers
FROM orders;
```
Result:
| unique_customers |
|---|
| 5 |
Even though there are 18 rows, only 5 different customers placed those orders: Alice, Bob, Charlie, Diana, and Eve.
```sql
-- How many different categories do we sell?
SELECT COUNT(DISTINCT category) AS category_count
FROM orders;
```
Result:
| category_count |
|---|
| 5 |
Five categories: Electronics, Books, Clothing, Home, and Food.
COUNT(*) counts all rows, including those with NULLs. COUNT(column) counts only rows where that column is not NULL. This difference trips up beginners constantly. When in doubt, use COUNT(*) to count rows, and COUNT(column) only when you intentionally want to skip NULLs.
```sql
-- See all three COUNTs side by side
SELECT
    COUNT(*) AS all_rows,
    COUNT(amount) AS rows_with_amount,
    COUNT(DISTINCT customer) AS unique_customers
FROM orders;
```
Result:
| all_rows | rows_with_amount | unique_customers |
|---|---|---|
| 18 | 17 | 5 |
You and your friends each bring some candy to a party. SUM answers: "If we dump all our candy into one pile, how many pieces do we have?" AVG answers: "If we split all the candy equally among everyone, how many pieces does each person get?" SUM = the whole pile. AVG = each person's fair share.
```sql
-- What is the total revenue from all orders?
SELECT SUM(amount) AS total_revenue
FROM orders;
```
Result:
| total_revenue |
|---|
| 2738.87 |
```sql
-- How many total items were ordered? (sum of all quantities)
SELECT SUM(quantity) AS total_items
FROM orders;
```
Result:
| total_items |
|---|
| 27 |
```sql
-- What is the average order amount?
SELECT AVG(amount) AS avg_order_value
FROM orders;
```
Result:
| avg_order_value |
|---|
| 161.11 |
The Cookbook order has NULL for amount. AVG silently skips it! It calculates 2738.87 ÷ 17 (not 18). This means AVG gives you the average of known values, not the average across all rows. If you want NULLs treated as 0, wrap the column: AVG(COALESCE(amount, 0)).
```sql
-- AVG with NULLs treated as 0
SELECT
    AVG(amount) AS avg_ignoring_nulls,
    AVG(COALESCE(amount, 0)) AS avg_treating_null_as_zero
FROM orders;
```
Result:
| avg_ignoring_nulls | avg_treating_null_as_zero |
|---|---|
| 161.11 | 152.16 |
See the difference? The second number is lower because it divides by 18 instead of 17.
AVG often produces long decimal numbers. Wrap it with ROUND() for clean reports: ROUND(AVG(amount), 2) gives you exactly 2 decimal places. Your boss will thank you when the report shows 161.11 instead of a long tail of decimal digits.
```sql
-- Clean, rounded averages
SELECT
    ROUND(AVG(amount), 2) AS avg_order,
    ROUND(AVG(quantity), 1) AS avg_items
FROM orders;
```
Result:
| avg_order | avg_items |
|---|---|
| 161.11 | 1.5 |
The gym teacher lines everyone up and asks: "Who is the tallest?" and "Who is the shortest?" That's MIN and MAX! They scan through all the values and pick out the absolute smallest and the absolute largest. And here's the cool part: they work on numbers, dates, and even text!
```sql
-- Cheapest and most expensive orders
SELECT
    MIN(amount) AS cheapest_order,
    MAX(amount) AS biggest_order
FROM orders;
```
Result:
| cheapest_order | biggest_order |
|---|---|
| 12.99 | 1200.00 |
Dates are compared chronologically; earlier dates are "smaller."
```sql
-- When was the first and last order placed?
SELECT
    MIN(order_date) AS first_order,
    MAX(order_date) AS latest_order
FROM orders;
```
Result:
| first_order | latest_order |
|---|---|
| 2024-01-08 | 2024-03-20 |
Text is compared alphabetically (A comes before Z, like a dictionary).
```sql
-- Alphabetically first and last customer names
SELECT
    MIN(customer) AS first_alphabetically,
    MAX(customer) AS last_alphabetically
FROM orders;
```
Result:
| first_alphabetically | last_alphabetically |
|---|---|
| Alice | Eve |
```sql
-- The complete summary dashboard
SELECT
    COUNT(*) AS total_orders,
    SUM(amount) AS total_revenue,
    ROUND(AVG(amount), 2) AS avg_order_value,
    MIN(amount) AS smallest_order,
    MAX(amount) AS largest_order
FROM orders;
```
Result:
| total_orders | total_revenue | avg_order_value | smallest_order | largest_order |
|---|---|---|---|---|
| 18 | 2738.87 | 161.11 | 12.99 | 1200.00 |
This is exactly what you'd see on a store manager's daily dashboard. Every morning they glance at one screen: "How many orders came in? What's total revenue? What's the average order? What's our cheapest sale? What was the biggest sale?" Five numbers. Five aggregate functions. One glance and you know how the business is doing.
You dump a big bag of M&Ms onto the table. Right now they're all mixed together. You separate them into piles: red pile, blue pile, green pile, yellow pile. Then you count each pile separately. "I have 12 reds, 8 blues, 15 greens, and 10 yellows." That's GROUP BY! You're splitting the data into groups and running an aggregate function on each group separately.
Without GROUP BY, aggregate functions look at all rows at once and return a single number. With GROUP BY, they look at each group independently and return one row per group.
```sql
-- How many orders in each category?
SELECT category, COUNT(*) AS order_count
FROM orders
GROUP BY category
ORDER BY order_count DESC;
```
Result:
| category | order_count |
|---|---|
| Electronics | 7 |
| Clothing | 5 |
| Books | 3 |
| Home | 2 |
| Food | 1 |
PostgreSQL split the 18 rows into 5 groups (one per category), counted each pile, and gave us a nice summary.
```sql
-- Total spending in each category
SELECT
    category,
    SUM(amount) AS total_revenue,
    ROUND(AVG(amount), 2) AS avg_order_value,
    COUNT(*) AS order_count
FROM orders
GROUP BY category
ORDER BY total_revenue DESC;
```
Result:
| category | total_revenue | avg_order_value | order_count |
|---|---|---|---|
| Electronics | 2318.94 | 331.28 | 7 |
| Clothing | 274.96 | 54.99 | 5 |
| Books | 69.98 | 34.99 | 3 |
| Home | 60.00 | 30.00 | 2 |
| Food | 14.99 | 14.99 | 1 |
```sql
-- How much has each customer spent?
SELECT
    customer,
    COUNT(*) AS total_orders,
    SUM(amount) AS total_spent,
    ROUND(AVG(amount), 2) AS avg_order
FROM orders
GROUP BY customer
ORDER BY total_spent DESC;
```
Result:
| customer | total_orders | total_spent | avg_order |
|---|---|---|---|
| Alice | 4 | 1340.98 | 335.25 |
| Charlie | 3 | 544.98 | 181.66 |
| Eve | 3 | 512.98 | 256.49 |
| Bob | 4 | 219.96 | 54.99 |
| Diana | 4 | 119.97 | 29.99 |
```sql
-- Monthly order summary
SELECT
    TO_CHAR(order_date, 'YYYY-MM') AS month,
    COUNT(*) AS order_count,
    SUM(amount) AS revenue
FROM orders
GROUP BY TO_CHAR(order_date, 'YYYY-MM')
ORDER BY month;
```
Result:
| month | order_count | revenue |
|---|---|---|
| 2024-01 | 5 | 1485.96 |
| 2024-02 | 7 | 1037.94 |
| 2024-03 | 6 | 214.97 |
Every column in your SELECT must either be inside an aggregate function (COUNT, SUM, AVG, etc.) or listed in GROUP BY. If you write SELECT customer, product, COUNT(*) FROM orders GROUP BY customer, PostgreSQL will throw an error because product is neither aggregated nor grouped. Think of it this way: if there are 4 orders for Alice, which one product should PostgreSQL show? It doesn't know, so it complains.
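To see the rule in action, here's the failing query from the tip above, plus two sketched fixes (either group by the extra column, or aggregate it):

```sql
-- WRONG: product is neither aggregated nor in GROUP BY
-- ERROR: column "orders.product" must appear in the GROUP BY clause
--        or be used in an aggregate function
SELECT customer, product, COUNT(*)
FROM orders
GROUP BY customer;

-- Fix 1: group by both columns (one row per customer + product pair)
SELECT customer, product, COUNT(*)
FROM orders
GROUP BY customer, product;

-- Fix 2: aggregate the product column instead of selecting it raw
SELECT customer, COUNT(DISTINCT product) AS distinct_products
FROM orders
GROUP BY customer;
```

Which fix is right depends on the question you're asking: per-pair detail, or a per-customer summary.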
```sql
-- Spending per customer per category
SELECT
    customer,
    category,
    COUNT(*) AS order_count,
    SUM(amount) AS total_spent
FROM orders
GROUP BY customer, category
ORDER BY customer, total_spent DESC;
```
Result (first rows):
| customer | category | order_count | total_spent |
|---|---|---|---|
| Alice | Electronics | 2 | 1225.99 |
| Alice | Clothing | 1 | 75.00 |
| Alice | Books | 1 | 39.99 |
| Bob | Clothing | 2 | 109.98 |
| Bob | Electronics | 1 | 79.99 |
| ... more rows ... | ... | ... | ... |
Grouping by multiple columns is like a school report card. You don't just want "average grade across all subjects"; you want it broken down by student AND subject. "Alice got 92 in Math, 88 in English. Bob got 75 in Math, 91 in English." Each unique combination of student + subject gets its own row.
You're the teacher dividing students into study groups. WHERE is like saying "Only students taller than 5 feet can participate": you filter individual people before forming the groups. HAVING is like saying "After the groups are formed, only keep groups with more than 3 members": you filter entire groups after they've been created. WHERE filters rows. HAVING filters groups.
```sql
-- Which customers have placed 4 or more orders?
SELECT customer, COUNT(*) AS order_count
FROM orders
GROUP BY customer
HAVING COUNT(*) >= 4
ORDER BY order_count DESC;
```
Result:
| customer | order_count |
|---|---|
| Alice | 4 |
| Bob | 4 |
| Diana | 4 |
Charlie and Eve only have 3 orders each, so HAVING filtered them out.
```sql
-- Which categories generated more than $100 total?
SELECT category, SUM(amount) AS total_revenue
FROM orders
GROUP BY category
HAVING SUM(amount) > 100
ORDER BY total_revenue DESC;
```
Result:
| category | total_revenue |
|---|---|
| Electronics | 2318.94 |
| Clothing | 274.96 |
Books ($69.98), Home ($60.00), and Food ($14.99) all fell below the $100 threshold.
This is where the real power lives: filter rows first, then filter the resulting groups.
```sql
-- Among orders placed in 2024 Q1 (Jan-Mar),
-- which customers spent more than $200 total?
SELECT customer, SUM(amount) AS q1_spending
FROM orders
WHERE order_date BETWEEN '2024-01-01' AND '2024-03-31'
GROUP BY customer
HAVING SUM(amount) > 200
ORDER BY q1_spending DESC;
```
Result:
| customer | q1_spending |
|---|---|
| Alice | 1340.98 |
| Charlie | 544.98 |
| Eve | 512.98 |
| Bob | 219.96 |
This is wrong and will throw an error:
```sql
-- WRONG: you can't use WHERE with aggregate functions!
SELECT customer, COUNT(*)
FROM orders
WHERE COUNT(*) >= 4   -- ERROR!
GROUP BY customer;
```

```sql
-- CORRECT: use HAVING for aggregate conditions
SELECT customer, COUNT(*)
FROM orders
GROUP BY customer
HAVING COUNT(*) >= 4;  -- Works!
```
Remember: WHERE runs before grouping; at that point the groups don't exist yet, so there's nothing to count. HAVING runs after grouping, when the counts are available.
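A handy mental model is the logical order in which the clauses are processed. Here's a made-up query (the date filter and LIMIT are just for illustration) annotated with that order:

```sql
SELECT customer, COUNT(*) AS order_count   -- 5. compute the output columns
FROM orders                                -- 1. pick the source table
WHERE order_date >= '2024-01-01'           -- 2. filter individual rows
GROUP BY customer                          -- 3. form the groups
HAVING COUNT(*) >= 4                       -- 4. filter whole groups
ORDER BY order_count DESC                  -- 6. sort the result
LIMIT 3;                                   -- 7. keep only the first rows
```

The written order (SELECT first) differs from the logical order (FROM first), which is exactly why WHERE can't see aggregates but HAVING can.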
We touched on COUNT(DISTINCT ...) earlier, but let's explore DISTINCT more fully inside aggregate functions.
```sql
-- List all unique categories
SELECT DISTINCT category
FROM orders
ORDER BY category;
```
Result:
| category |
|---|
| Books |
| Clothing |
| Electronics |
| Food |
| Home |
```sql
-- For each category: how many unique customers ordered?
SELECT
    category,
    COUNT(*) AS total_orders,
    COUNT(DISTINCT customer) AS unique_customers
FROM orders
GROUP BY category
ORDER BY unique_customers DESC;
```
Result:
| category | total_orders | unique_customers |
|---|---|---|
| Electronics | 7 | 4 |
| Clothing | 5 | 3 |
| Books | 3 | 3 |
| Home | 2 | 2 |
| Food | 1 | 1 |
Electronics has 7 orders but only 4 unique buyers, so some customers ordered electronics more than once.
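We can check who those repeat buyers are with a query that combines two ideas from this lesson, multi-column GROUP BY and HAVING:

```sql
-- Which customer + category pairs have more than one order?
SELECT customer, category, COUNT(*) AS orders_in_category
FROM orders
GROUP BY customer, category
HAVING COUNT(*) > 1
ORDER BY customer;
-- Returns: Alice, Charlie, and Eve each with 2 Electronics orders,
-- and Bob with 2 Clothing orders
```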
You can use DISTINCT with SUM too: SUM(DISTINCT amount) would add up only unique amounts, skipping duplicates. This is rarely useful in practice, but it's good to know it exists. The most common combo by far is COUNT(DISTINCT column).
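Our data happens to contain one duplicated amount (29.99 appears on both the SQL Book and the Yoga Mat), so you can see the effect directly:

```sql
-- SUM vs SUM(DISTINCT): the duplicate 29.99 is counted only once
SELECT
    SUM(amount)          AS sum_all,       -- 2738.87
    SUM(DISTINCT amount) AS sum_distinct   -- 2708.88 (one 29.99 dropped)
FROM orders;
```

This is also why SUM(DISTINCT ...) is rarely what you want for revenue: two different orders that happen to cost the same are both real money.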
Now let's put all the pieces together. We'll build a real business question step by step.
"Find the top 3 customers by total spending, who have ordered more than 3 times, sorted by total amount descending."
```sql
-- Step 1: group by customer and aggregate
SELECT customer, COUNT(*) AS total_orders, SUM(amount) AS total_spent
FROM orders
GROUP BY customer;
```
```sql
-- Step 2: keep only customers with more than 3 orders
SELECT customer, COUNT(*) AS total_orders, SUM(amount) AS total_spent
FROM orders
GROUP BY customer
HAVING COUNT(*) > 3;
```
```sql
-- Step 3: sort by spending and take the top 3
SELECT customer, COUNT(*) AS total_orders, SUM(amount) AS total_spent
FROM orders
GROUP BY customer
HAVING COUNT(*) > 3
ORDER BY total_spent DESC
LIMIT 3;
```
Final Result:
| customer | total_orders | total_spent |
|---|---|---|
| Alice | 4 | 1340.98 |
| Bob | 4 | 219.96 |
| Diana | 4 | 119.97 |
This is how a marketing team finds their VIP customers. They don't just want the biggest spenders โ they want people who are consistently buying (more than 3 orders) AND spending the most. That's how you decide who gets the loyalty discount or the thank-you email.
Let's solve some questions that come up in real businesses every day. These are the kinds of queries you'll write in your job.
```sql
-- Monthly sales report
SELECT
    TO_CHAR(order_date, 'Month YYYY') AS month_name,
    COUNT(*) AS orders,
    SUM(quantity) AS items_sold,
    SUM(amount) AS revenue,
    ROUND(AVG(amount), 2) AS avg_order_value
FROM orders
WHERE amount IS NOT NULL
GROUP BY TO_CHAR(order_date, 'Month YYYY'), DATE_TRUNC('month', order_date)
ORDER BY DATE_TRUNC('month', order_date);
```
Result:
| month_name | orders | items_sold | revenue | avg_order_value |
|---|---|---|---|---|
| January 2024 | 5 | 6 | 1485.96 | 297.19 |
| February 2024 | 7 | 12 | 1037.94 | 148.28 |
| March 2024 | 5 | 8 | 214.97 | 42.99 |
```sql
-- Which products sell the most units?
SELECT
    product,
    SUM(quantity) AS total_units_sold,
    SUM(amount) AS total_revenue
FROM orders
WHERE amount IS NOT NULL
GROUP BY product
ORDER BY total_units_sold DESC
LIMIT 5;
```
Result:
| product | total_units_sold | total_revenue |
|---|---|---|
| Water Bottle | 4 | 15.00 |
| T-Shirt | 3 | 19.99 |
| Wireless Mouse | 2 | 25.99 |
| Monitor | 2 | 349.99 |
| Phone Case | 2 | 12.99 |

One caveat: Coffee Beans also sold 2 units, so four products are tied at 2 and which three fill the last rows is arbitrary. Add a tie-breaker (for example, ORDER BY total_units_sold DESC, total_revenue DESC) if you need a stable result.
```sql
-- Full customer profile: spending, frequency, date range
SELECT
    customer,
    COUNT(*) AS total_orders,
    SUM(amount) AS lifetime_value,
    ROUND(AVG(amount), 2) AS avg_order_value,
    MIN(order_date) AS first_order,
    MAX(order_date) AS last_order,
    MAX(order_date) - MIN(order_date) AS days_active,
    COUNT(DISTINCT category) AS categories_shopped
FROM orders
GROUP BY customer
ORDER BY lifetime_value DESC;
```
Result:
| customer | total_orders | lifetime_value | avg_order_value | first_order | last_order | days_active | categories_shopped |
|---|---|---|---|---|---|---|---|
| Alice | 4 | 1340.98 | 335.25 | 2024-01-15 | 2024-03-20 | 65 | 3 |
| Charlie | 3 | 544.98 | 181.66 | 2024-01-25 | 2024-03-10 | 45 | 2 |
| Eve | 3 | 512.98 | 256.49 | 2024-02-01 | 2024-03-15 | 43 | 2 |
| Bob | 4 | 219.96 | 54.99 | 2024-01-20 | 2024-03-01 | 41 | 3 |
| Diana | 4 | 119.97 | 29.99 | 2024-01-08 | 2024-03-12 | 64 | 3 |
This CLV query is exactly what companies like Amazon, Netflix, and Spotify run every day. They want to know: "Which customers are most valuable? Who buys frequently? Who shops across many categories?" This single query gives you a full customer profile you could put on a dashboard.
Time to practice! Solve these yourself before peeking at the answers. Use the orders table from the beginning.
Write a query that returns the total revenue (sum of all amounts) and the total number of items sold (sum of all quantities).
```sql
SELECT
    SUM(amount) AS total_revenue,
    SUM(quantity) AS total_items_sold
FROM orders;
```
Returns: total_revenue = 2738.87, total_items_sold = 27
For each category, show the number of orders, total revenue, and average order value (rounded to 2 decimal places). Sort by total revenue descending.
```sql
SELECT
    category,
    COUNT(*) AS order_count,
    SUM(amount) AS total_revenue,
    ROUND(AVG(amount), 2) AS avg_order_value
FROM orders
GROUP BY category
ORDER BY total_revenue DESC;
```
Find customers who have spent more than $500 in total. Show their name, total orders, and total spent.
```sql
SELECT
    customer,
    COUNT(*) AS total_orders,
    SUM(amount) AS total_spent
FROM orders
GROUP BY customer
HAVING SUM(amount) > 500
ORDER BY total_spent DESC;
```
Returns: Alice ($1340.98), Charlie ($544.98), Eve ($512.98)
For each category, count the number of unique customers who ordered. Only include categories with at least 2 unique buyers.
```sql
SELECT
    category,
    COUNT(DISTINCT customer) AS unique_buyers
FROM orders
GROUP BY category
HAVING COUNT(DISTINCT customer) >= 2
ORDER BY unique_buyers DESC;
```
Returns: Electronics (4), Clothing (3), Books (3), Home (2)
Show the total revenue per customer for orders placed only in February 2024. Sort by revenue descending.
```sql
SELECT customer, SUM(amount) AS feb_revenue
FROM orders
WHERE order_date BETWEEN '2024-02-01' AND '2024-02-29'
GROUP BY customer
ORDER BY feb_revenue DESC;
```
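BETWEEN with hard-coded boundaries works here because 2024 is a leap year, but it's easy to get a month's last day wrong. A sketch of a boundary-safe alternative using a half-open date range:

```sql
-- Same February report without hard-coding the month's last day
SELECT customer, SUM(amount) AS feb_revenue
FROM orders
WHERE order_date >= DATE '2024-02-01'
  AND order_date <  DATE '2024-03-01'   -- exclusive upper bound
GROUP BY customer
ORDER BY feb_revenue DESC;
```

The `>= first day, < first day of next month` pattern works for every month and also handles timestamp columns correctly, where BETWEEN would miss rows after midnight on the last day.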
Create a comprehensive category report showing: category name, total orders, total revenue, average order value (rounded), min order, max order, and number of unique customers. Only include categories with total revenue over $50. Sort by revenue descending.
```sql
SELECT
    category,
    COUNT(*) AS total_orders,
    SUM(amount) AS total_revenue,
    ROUND(AVG(amount), 2) AS avg_order,
    MIN(amount) AS min_order,
    MAX(amount) AS max_order,
    COUNT(DISTINCT customer) AS unique_customers
FROM orders
GROUP BY category
HAVING SUM(amount) > 50
ORDER BY total_revenue DESC;
```
Quick review questions before we wrap up:

- What's the difference between COUNT(*) and COUNT(column)?
- What does AVG do with NULL values?
- What would COUNT(DISTINCT category) return for our orders table?

You've just unlocked one of the most important skill sets in SQL: the ability to summarise data instead of just listing it. Here's what you conquered:
| Concept | What It Does | Key Syntax |
|---|---|---|
| COUNT | Counts rows | COUNT(*), COUNT(col), COUNT(DISTINCT col) |
| SUM | Adds up values | SUM(amount) |
| AVG | Calculates the mean | ROUND(AVG(amount), 2) |
| MIN / MAX | Finds extremes | MIN(date), MAX(price) |
| GROUP BY | Splits into groups | GROUP BY category |
| HAVING | Filters groups | HAVING COUNT(*) > 3 |
| DISTINCT | Removes duplicates | COUNT(DISTINCT customer) |
So far, all our data has lived in one table. But real databases have many tables: customers in one, orders in another, products in a third. In the next module you'll learn how to combine tables using JOINs. It's like connecting puzzle pieces to see the full picture. Imagine being able to say: "Show me each customer's name, their order details, AND the product reviews, all from three different tables in one query." That's the power of JOINs!