Take your database to the real world! VACUUM, WAL, backup, replication, security, and performance tuning.
Imagine you live with a roommate who never throws anything away. Every time they update their shopping list, they don’t erase the old one — they write a brand new list and leave the old one on the table. After a few months, you have thousands of old shopping lists piling up everywhere.
That’s PostgreSQL with MVCC! Remember from Module 11: when you UPDATE or DELETE a row, the old version isn’t immediately removed. It stays there as a “dead tuple” (dead row) taking up space. Over time, these dead rows pile up, making your tables bloated and slow.
VACUUM is the spring cleaning service. It walks through the table, identifies dead rows that no active transaction can see anymore, and marks that space as reusable. AUTOVACUUM is the robot vacuum that runs automatically in the background so you don’t have to clean manually.
```sql
-- See how many dead tuples (dead rows) each table has
SELECT
    relname AS table_name,
    n_live_tup AS live_rows,
    n_dead_tup AS dead_rows,
    ROUND(n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0) * 100, 2) AS dead_pct,
    last_vacuum,
    last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;

-- Result:
-- table_name | live_rows | dead_rows | dead_pct | last_vacuum | last_autovacuum
-- orders     |   1000000 |    250000 |    20.00 | NULL        | 2025-03-18 03:00
-- products   |     50000 |      1200 |     2.34 | NULL        | 2025-03-18 03:15
```
If a table has more than 10-20% dead rows, autovacuum might not be keeping up. This causes table bloat (wasted disk space) and slower queries (PostgreSQL has to skip over dead rows). Check your autovacuum settings!
```sql
-- Regular VACUUM: marks dead space as reusable, doesn't lock the table
VACUUM orders;

-- VACUUM VERBOSE: shows detailed stats about what it did
VACUUM VERBOSE orders;
-- INFO: "orders": removed 250000 dead row versions in 1847 pages
-- INFO: "orders": found 250000 removable, 1000000 nonremovable row versions

-- VACUUM ANALYZE: vacuum + update statistics (helps query planner)
VACUUM ANALYZE orders;
```
| Feature | VACUUM (Regular) | VACUUM FULL |
|---|---|---|
| Locks the table? | No (runs alongside queries) | Yes! (exclusive lock) |
| Reclaims disk space to OS? | No (only marks reusable) | Yes (rewrites entire table) |
| Speed | Fast | Slow (rewrites everything) |
| Downtime required? | No | Effectively yes |
| When to use | Routine maintenance | Emergency: severe bloat only |
Use regular VACUUM 99.9% of the time (or just let autovacuum handle it). Only use VACUUM FULL as a last resort when a table has extreme bloat (e.g., after a massive DELETE of millions of rows) and you need to reclaim disk space to the operating system. It’s like tearing down your house and rebuilding it vs. just tidying up the rooms.
```sql
-- Check if autovacuum is enabled (it should be!)
SHOW autovacuum;
-- Result: on

-- Key autovacuum settings
SHOW autovacuum_vacuum_threshold;     -- Default: 50 (minimum dead rows before vacuum kicks in)
SHOW autovacuum_vacuum_scale_factor;  -- Default: 0.2 (vacuum when 20% of rows are dead)

-- The formula: vacuum triggers when dead rows > threshold + (scale_factor × total rows)
-- For a 1M-row table: 50 + (0.2 × 1,000,000) = 200,050 dead rows triggers vacuum

-- For high-write tables, you can tune per-table:
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.05,  -- trigger at 5% dead
    autovacuum_vacuum_threshold = 1000      -- minimum 1000 dead rows
);
```
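The trigger formula is easy to sanity-check yourself. A minimal Python sketch (the function name is mine, not a PostgreSQL API):

```python
def autovacuum_trigger(total_rows, threshold=50, scale_factor=0.2):
    """Dead-row count at which autovacuum kicks in for a table.

    Mirrors PostgreSQL's formula: threshold + scale_factor * total_rows
    """
    return int(threshold + scale_factor * total_rows)

# Default settings on a 1M-row table: ~200k dead rows pile up before vacuum
print(autovacuum_trigger(1_000_000))              # 200050

# With the tuned per-table settings from the ALTER TABLE above
print(autovacuum_trigger(1_000_000, 1000, 0.05))  # 51000
```

This is why big tables need a smaller scale factor: with the defaults, a billion-row table would accumulate two hundred million dead rows before autovacuum even starts.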
Some people disable autovacuum to “improve performance.” This is a terrible idea. Without autovacuum, dead rows pile up endlessly, your tables bloat, queries slow to a crawl, and eventually you might hit transaction ID wraparound — a catastrophic situation where PostgreSQL refuses to accept any new writes until you vacuum. Just leave it on and tune the settings if needed.
You know the “black box” on an airplane? It records every single thing that happens during a flight. If the plane crashes, investigators can replay the black box recording to understand exactly what happened and reconstruct the sequence of events.
The Write-Ahead Log (WAL) is PostgreSQL’s black box. Before PostgreSQL writes any change to the actual data files on disk, it first writes a record of that change to the WAL. This is the “write-ahead” part — the log entry is written ahead of (before) the actual data change.
If PostgreSQL crashes, it simply replays the WAL from the last checkpoint to recover all committed transactions. No data loss. It’s what makes the “D” in ACID (Durability) possible.
1. **You run a command.**
```sql
UPDATE accounts SET balance = 500 WHERE id = 1;
```
PostgreSQL receives your command and figures out what data needs to change.
2. **The change is logged first.** A description of the change (“set row 1 balance to 500”) is appended to the WAL file on disk. This is a sequential write — very fast!
3. **The data changes in memory.** The actual table data is modified in RAM. The disk version of the table is NOT updated yet — it’s now “dirty” (out of sync with disk).
4. **COMMIT is acknowledged.** PostgreSQL tells the client “COMMIT successful” as soon as the WAL record is safely on disk. Even if a crash happens now, the WAL has the record.
5. **Dirty pages are flushed later.** The bgwriter and checkpointer processes write the dirty data from RAM to the actual table files on disk, at their own pace.
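The write-ahead idea can be condensed into a toy model. This is not PostgreSQL's actual WAL format, just the core logic: log the change first, apply it second, and recover by replaying the log:

```python
class ToyWAL:
    """A toy write-ahead log: a Python list stands in for the on-disk log file.

    Not PostgreSQL's real WAL format, just the core idea:
    log first, apply second, recover by replaying.
    """

    def __init__(self):
        self.log = []    # append-only log (imagine this file is on disk)
        self.table = {}  # table data in RAM (lost if we "crash")

    def update(self, key, value):
        self.log.append(("set", key, value))  # 1. write-ahead: record the change
        self.table[key] = value               # 2. then modify the data in RAM

    def recover(self):
        """Crash recovery: rebuild table state by replaying the log."""
        self.table = {}
        for op, key, value in self.log:
            if op == "set":
                self.table[key] = value
        return self.table


wal = ToyWAL()
wal.update("account:1", 500)
wal.update("account:2", 900)

wal.table = {}        # simulate a crash: everything in RAM is gone
print(wal.recover())  # {'account:1': 500, 'account:2': 900}
```

The real WAL logs physical page changes rather than key/value pairs, and only replays from the last checkpoint, but the recovery guarantee is the same: anything that reached the log survives the crash.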
```sql
-- Check the current WAL level
SHOW wal_level;
-- Default: replica (supports streaming replication)
-- Options: minimal, replica, logical

-- See where WAL files are stored
SHOW data_directory;
-- WAL files live in: {data_directory}/pg_wal/

-- Current WAL position (useful for replication monitoring)
SELECT pg_current_wal_lsn();
-- Result: 0/16B3748 (Log Sequence Number)

-- How much WAL has been generated?
SELECT pg_size_pretty(pg_wal_lsn_diff(
    pg_current_wal_lsn(),
    '0/0'
)) AS total_wal_generated;
```
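If you're curious what pg_wal_lsn_diff actually computes: an LSN like 0/16B3748 is two hex numbers, the high and low 32 bits of a 64-bit byte offset into the WAL stream. A small sketch (function names are mine):

```python
def lsn_to_bytes(lsn):
    """Convert a PostgreSQL LSN like '0/16B3748' to an absolute byte position.

    The part before the slash is the high 32 bits of a 64-bit byte offset,
    the part after is the low 32 bits.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)

def lsn_diff(a, b):
    """Byte distance between two LSNs, like pg_wal_lsn_diff(a, b)."""
    return lsn_to_bytes(a) - lsn_to_bytes(b)

# ~23 MB of WAL generated since the very start of the stream
print(lsn_diff("0/16B3748", "0/0"))  # 23803720
```

The same arithmetic is what replication lag in bytes means: the difference between the primary's sent_lsn and a replica's replay_lsn.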
```sql
-- A checkpoint flushes all dirty data from RAM to disk
-- After a checkpoint, all WAL before that point can be recycled
SHOW checkpoint_timeout;  -- Default: 5min (checkpoint at least every 5 minutes)
SHOW max_wal_size;        -- Default: 1GB (force checkpoint if WAL exceeds this)

-- Force a manual checkpoint (rarely needed)
CHECKPOINT;
```
WAL is the foundation of three critical features: crash recovery (replay WAL after restart), replication (stream WAL to replicas), and point-in-time recovery (replay WAL up to a specific moment to undo a mistake). If you understand WAL, you understand how PostgreSQL stays reliable, how backups work, and how replication is even possible.
Imagine your diary contains every important memory of your life. What if your house catches fire? You’d want a photocopy stored somewhere safe. Database backups are exactly that — copies of your data stored somewhere safe so you can recover from disasters like hardware failure, accidental deletions, or ransomware attacks.
PostgreSQL gives you two types of backups: logical backups (like photocopying each page — produces SQL scripts) and physical backups (like cloning the entire diary including the binding — copies raw data files).
```bash
# Backup a single database to a SQL file
pg_dump -U postgres -d mystore > mystore_backup.sql

# Backup in custom format (compressed, supports selective restore)
pg_dump -U postgres -d mystore -Fc -f mystore_backup.dump

# Backup only specific tables
pg_dump -U postgres -d mystore -t orders -t products -Fc -f orders_products.dump

# Backup schema only (no data — just table definitions)
pg_dump -U postgres -d mystore --schema-only -f mystore_schema.sql

# Backup data only (no table definitions)
pg_dump -U postgres -d mystore --data-only -f mystore_data.sql
```
| Format | Flag | Compressed? | Selective Restore? | Best For |
|---|---|---|---|---|
| Plain SQL | (default) | No | No | Simple, readable backups |
| Custom | -Fc | Yes | Yes | Production backups (recommended) |
| Directory | -Fd | Yes | Yes | Parallel backup/restore |
| Tar | -Ft | No | Yes | Archive-compatible format |
```bash
# Restore a plain SQL backup
psql -U postgres -d mystore_new < mystore_backup.sql

# Restore a custom-format backup
pg_restore -U postgres -d mystore_new mystore_backup.dump

# Restore only specific tables from a custom backup
pg_restore -U postgres -d mystore_new -t orders mystore_backup.dump

# Restore with --clean (drops objects before recreating)
pg_restore -U postgres -d mystore_new --clean mystore_backup.dump

# List contents of a backup without restoring
pg_restore -l mystore_backup.dump
```
```bash
# Backup ALL databases, roles, and tablespaces
pg_dumpall -U postgres > full_cluster_backup.sql

# Backup only roles (users & permissions)
pg_dumpall -U postgres --roles-only > roles_backup.sql
```
```bash
# Physical backup: copies the entire data directory
# This is the foundation for PITR and setting up replicas
pg_basebackup -U replicator -h localhost -D /backups/base_backup -Fp -Xs -P

# Flags explained:
# -D  : destination directory
# -Fp : plain format (raw files)
# -Xs : stream WAL during backup
# -P  : show progress
```
```bash
#!/bin/bash
# Daily backup script — add to crontab
# crontab: 0 2 * * * /scripts/daily_backup.sh

BACKUP_DIR="/backups/postgres"
DATE=$(date +%Y%m%d_%H%M%S)
DB_NAME="mystore"
RETENTION_DAYS=7

# Create backup
pg_dump -U postgres -d $DB_NAME -Fc -f "$BACKUP_DIR/${DB_NAME}_${DATE}.dump"

# Delete backups older than retention period
find $BACKUP_DIR -name "*.dump" -mtime +$RETENTION_DAYS -delete

# Log the result
echo "$(date): Backup completed — ${DB_NAME}_${DATE}.dump" >> /var/log/pg_backup.log
```
A backup that you’ve never tested restoring is NOT a backup. It’s a hope and a prayer. Regularly restore your backups to a test server to verify they actually work. Many teams have learned this the hard way — their “backup” files were corrupt or incomplete, and they only discovered this during an actual disaster.
1. **Take a base backup.** This is your starting point — a full copy of the data directory.
2. **Archive WAL continuously.** Configure archive_mode = on and archive_command to copy WAL files to a safe location as they’re completed.
3. **Disaster strikes.** Don’t panic. You have a base backup and all the WAL files since that backup.
4. **Recover to a point in time.** Set recovery_target_time = '2025-03-18 15:41:00' and PostgreSQL replays every change up to one minute before the disaster. Your data is back!
For production databases, use a 3-2-1 backup strategy: 3 copies of your data, on 2 different storage types, with 1 copy offsite (cloud storage, different data center). Combine daily pg_dump for quick table restores with continuous WAL archiving for point-in-time recovery.
Imagine you’re a teacher with 30 students and only one textbook. Everyone has to wait their turn to read it — very slow! But what if you made exact copies of the textbook and gave one to every table? Now everyone can read at the same time. That’s replication.
You have one primary (the original textbook) where all changes happen. Then you have one or more replicas (copies) that automatically stay in sync. Reads can go to any copy, but writes always go to the primary. This lets you handle 10x more read traffic without the primary breaking a sweat.
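The read/write split is usually handled by a driver, proxy, or framework, but the routing logic itself is simple. A toy sketch (host names invented for illustration) — note that a real router must also send SELECT ... FOR UPDATE and any statement inside a write transaction to the primary:

```python
import itertools

class ReplicaRouter:
    """Toy read/write splitter: writes go to the primary,
    reads round-robin across the replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas)  # endless round-robin

    def route(self, sql):
        """Pick a server for this statement (a naive check on the verb)."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replicas) if is_read else self.primary

router = ReplicaRouter("db-primary", ["db-replica-1", "db-replica-2"])
print(router.route("SELECT * FROM orders"))            # db-replica-1
print(router.route("SELECT * FROM products"))          # db-replica-2
print(router.route("UPDATE orders SET status = 'x'"))  # db-primary
```

With asynchronous replication there's one more wrinkle: a read routed to a replica immediately after a write may not see that write yet (replication lag), which is why some apps pin a user's reads to the primary for a few seconds after they write.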
| Type | How It Works | Use Case |
|---|---|---|
| Streaming Replication | Primary streams WAL to replicas in real-time | High availability, read scaling |
| Logical Replication | Replicates specific tables using publish/subscribe | Selective sync, cross-version migration |
| Synchronous | Primary waits for replica to confirm write | Zero data loss (slower writes) |
| Asynchronous | Primary doesn’t wait (default) | Better performance (tiny lag OK) |
```ini
# In postgresql.conf on the PRIMARY:
wal_level = replica
max_wal_senders = 5     # how many replicas can connect
wal_keep_size = '1GB'   # keep WAL around for slow replicas
```

```sql
-- Create a replication user
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'secure_password_here';
```
```
# In pg_hba.conf on the PRIMARY:
# TYPE  DATABASE     USER        ADDRESS         METHOD
host    replication  replicator  192.168.1.0/24  scram-sha-256
```
```bash
# On the REPLICA server:
pg_basebackup -h primary-server -U replicator \
  -D /var/lib/postgresql/16/main -Fp -Xs -P -R

# The -R flag creates standby.signal and sets primary_conninfo
# This tells the replica "I'm a replica, connect to this primary"
```
```bash
# Start PostgreSQL on the replica
sudo systemctl start postgresql
```

```sql
-- On the PRIMARY — check connected replicas:
SELECT client_addr, state, sent_lsn, write_lsn, replay_lsn
FROM pg_stat_replication;
```
```sql
-- On the PRIMARY: check how far behind each replica is
SELECT
    client_addr,
    state,
    pg_size_pretty(pg_wal_lsn_diff(sent_lsn, replay_lsn)) AS replay_lag,
    NOW() - backend_start AS connection_duration
FROM pg_stat_replication;

-- On the REPLICA: check lag from the replica's perspective
SELECT NOW() - pg_last_xact_replay_timestamp() AS replication_lag;
```
```sql
-- On the PRIMARY: create a publication
CREATE PUBLICATION orders_pub FOR TABLE orders, products;

-- On the SUBSCRIBER (another PostgreSQL instance):
CREATE SUBSCRIPTION orders_sub
    CONNECTION 'host=primary-server dbname=mystore user=replicator password=...'
    PUBLICATION orders_pub;
```
Read replicas for scaling: Your web app sends all writes to the primary, but reads (which are usually 80-90% of traffic) go to replicas. This is how Netflix, Instagram, and Uber handle billions of database queries — lots of read replicas behind a load balancer.
High availability: If the primary dies, a replica can be promoted to become the new primary in seconds. Tools like Patroni, repmgr, and pg_auto_failover automate this failover process.
In a hotel, you get a key card that opens only your room. You can’t open other guests’ rooms, the manager’s office, or the server room. The front desk (admin) can program any key card to open any door. Housekeeping gets a card that opens all rooms but can’t access the safe. That’s exactly how PostgreSQL roles and permissions work.
Each person (or application) gets a role with specific permissions. The principle is simple: give the minimum permissions needed (principle of least privilege). Your app doesn’t need to DROP tables in production, so don’t give it that power!
```sql
-- Create a role (user) with a password and login ability
CREATE ROLE app_user WITH LOGIN PASSWORD 'strong_password_here';

-- Create a read-only role (no login — it's a group)
CREATE ROLE readonly_group;

-- Create a read-write role (group)
CREATE ROLE readwrite_group;

-- Create an admin role with CREATEDB ability
CREATE ROLE db_admin WITH LOGIN PASSWORD 'admin_pass' CREATEDB;
```
```sql
-- Grant read-only access to the readonly_group
GRANT CONNECT ON DATABASE mystore TO readonly_group;
GRANT USAGE ON SCHEMA public TO readonly_group;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly_group;

-- Automatically grant SELECT on future tables too
ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT SELECT ON TABLES TO readonly_group;

-- Grant read-write access
GRANT CONNECT ON DATABASE mystore TO readwrite_group;
GRANT USAGE ON SCHEMA public TO readwrite_group;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO readwrite_group;
GRANT USAGE ON ALL SEQUENCES IN SCHEMA public TO readwrite_group;

-- Add a user to a group
GRANT readonly_group TO app_user;

-- Revoke a permission
REVOKE DELETE ON orders FROM readwrite_group;
```
| Role | Can SELECT? | Can INSERT/UPDATE? | Can DELETE? | Can DROP? |
|---|---|---|---|---|
| readonly_group | ✔ | ✗ | ✗ | ✗ |
| readwrite_group | ✔ | ✔ | ✔ | ✗ |
| db_admin | ✔ | ✔ | ✔ | ✔ |
| analytics_team | ✔ | ✗ | ✗ | ✗ |
```sql
-- RLS lets you control which ROWS a user can see, not just which tables

-- Enable RLS on the orders table
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;

-- Policy: users can only see their own orders
CREATE POLICY user_sees_own_orders ON orders
    FOR SELECT
    USING (user_id = current_setting('app.current_user_id')::INTEGER);

-- Policy: users can only insert orders for themselves
CREATE POLICY user_inserts_own_orders ON orders
    FOR INSERT
    WITH CHECK (user_id = current_setting('app.current_user_id')::INTEGER);

-- Admin policy: admins can see everything
CREATE POLICY admin_full_access ON orders
    FOR ALL
    TO db_admin
    USING (TRUE);
```
If you use Supabase (the popular open-source Firebase alternative), you’re using RLS every day! Supabase’s security model is built entirely on PostgreSQL’s Row Level Security. When a user logs in and queries their data through the Supabase API, RLS policies ensure they can only see and modify their own data — even though they’re hitting the database directly.
```
# pg_hba.conf controls WHO can connect and HOW they authenticate
# TYPE  DATABASE     USER        ADDRESS         METHOD

# Local connections (Unix socket) — use peer auth
local   all          postgres                    peer

# Local network — password authentication
host    all          all         127.0.0.1/32    scram-sha-256
host    all          all         ::1/128         scram-sha-256

# Application servers on the internal network
host    mystore      app_user    192.168.1.0/24  scram-sha-256

# Replication connections
host    replication  replicator  192.168.1.0/24  scram-sha-256

# DANGER: never do this in production!
# host  all          all         0.0.0.0/0       trust
```
Security checklist:

- Never use the postgres superuser for application connections
- Use scram-sha-256 authentication (not md5 or trust)
- Enable SSL for remote connections (ssl = on in postgresql.conf)

Out of the box, PostgreSQL is configured for a small laptop — like driving a Ferrari in first gear. The default settings are extremely conservative to work on any machine. For production, you need to adjust the settings to match your server’s actual RAM, CPU, and storage. It’s like adjusting the seat, mirrors, and gear ratios for the actual driver and track.
| Setting | Default | What It Does | Recommended |
|---|---|---|---|
| shared_buffers | 128MB | RAM for caching data pages | 25% of total RAM |
| work_mem | 4MB | RAM per sort/hash operation | 50-256MB (depends on connections) |
| maintenance_work_mem | 64MB | RAM for VACUUM, CREATE INDEX | 512MB – 2GB |
| effective_cache_size | 4GB | Planner’s estimate of available cache | 50-75% of total RAM |
| max_connections | 100 | Maximum concurrent connections | 100-300 (use pooling!) |
```ini
# postgresql.conf tuning for a 32GB RAM, SSD server

# Memory
shared_buffers = '8GB'          # 25% of 32GB
work_mem = '128MB'              # per sort/hash operation
maintenance_work_mem = '1GB'    # for VACUUM and CREATE INDEX
effective_cache_size = '24GB'   # 75% of 32GB

# Connections
max_connections = 200           # use connection pooling!

# WAL
wal_buffers = '64MB'            # WAL buffer size
min_wal_size = '1GB'
max_wal_size = '4GB'
checkpoint_completion_target = 0.9  # spread checkpoint over 90% of interval

# Planner
random_page_cost = 1.1          # SSD (default 4.0 is for spinning disks)
effective_io_concurrency = 200  # SSD can handle parallel IO

# Logging
log_min_duration_statement = 1000   # log queries slower than 1 second
log_line_prefix = '%t [%p] %u@%d '  # timestamp, pid, user, database
```
Don’t guess! Use PGTune — a free online tool. Enter your server’s RAM, CPU count, storage type, and application type. It generates optimal postgresql.conf settings instantly. It’s what most PostgreSQL professionals use as a starting point.
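The rules of thumb from the table above can be sketched as a tiny calculator. This is a toy version: the real PGTune also weighs CPU count, storage type, and workload, so treat the output as a starting point only:

```python
def tune(ram_gb, max_connections=200):
    """Rough postgresql.conf suggestions from the rules of thumb above.

    A toy sketch of PGTune-style math, not the real tool.
    """
    ram_mb = ram_gb * 1024
    return {
        "shared_buffers": f"{ram_mb // 4}MB",            # 25% of RAM
        "effective_cache_size": f"{ram_mb * 3 // 4}MB",  # 75% of RAM
        "maintenance_work_mem": f"{min(ram_mb // 16, 2048)}MB",  # capped at 2GB
        # conservative: divide a quarter of RAM across all connections,
        # since every connection may run a sort at once
        "work_mem": f"{max(4, ram_mb // 4 // max_connections)}MB",
    }

for key, value in tune(32).items():
    print(f"{key} = '{value}'")
# shared_buffers = '8192MB'
# effective_cache_size = '24576MB'
# maintenance_work_mem = '2048MB'
# work_mem = '40MB'
```

Note how shared_buffers (8192MB) and effective_cache_size (24576MB) match the 32GB example config above; work_mem comes out lower than the example's 128MB because the sketch budgets for every connection sorting simultaneously.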
```sql
-- Enable the extension (add to postgresql.conf first):
-- shared_preload_libraries = 'pg_stat_statements'
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Top 10 slowest queries (by total time)
SELECT
    ROUND(total_exec_time::numeric, 2) AS total_ms,
    calls,
    ROUND(mean_exec_time::numeric, 2) AS avg_ms,
    ROUND(rows::numeric / NULLIF(calls, 0), 0) AS avg_rows,
    LEFT(query, 80) AS query_preview
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

-- Top 10 most frequently called queries
SELECT
    calls,
    ROUND(mean_exec_time::numeric, 2) AS avg_ms,
    LEFT(query, 80) AS query_preview
FROM pg_stat_statements
ORDER BY calls DESC
LIMIT 10;

-- Reset statistics (do this periodically)
SELECT pg_stat_statements_reset();
```
The problem: each PostgreSQL connection uses ~5-10MB of RAM, so 200 connections × 10MB = 2GB just for connections! The solution: PgBouncer sits between your app and PostgreSQL. Your app opens 200 connections to PgBouncer, but PgBouncer uses only 20 connections to PostgreSQL — connections are reused and shared, like a carpool.

```ini
# pgbouncer.ini (simplified)
[databases]
mystore = host=127.0.0.1 port=5432 dbname=mystore

[pgbouncer]
listen_port = 6432
listen_addr = 0.0.0.0
auth_type = scram-sha-256
pool_mode = transaction   ; release connection after each transaction
default_pool_size = 20    ; 20 actual PostgreSQL connections
max_client_conn = 500     ; 500 app connections allowed
```
Without PgBouncer, a spike to 500 simultaneous users means 500 PostgreSQL connections, eating up 5GB of RAM and causing the database to crawl. With PgBouncer in “transaction mode,” those 500 users share just 20-50 actual connections, because most connections are idle between queries. Amazon RDS, Supabase, and Railway all use connection pooling by default.
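The arithmetic behind that claim, as a quick sketch (the 10MB-per-connection figure is the rough estimate used above — real memory per backend varies):

```python
MB_PER_CONNECTION = 10  # rough per-backend estimate used in this module

def connection_ram_mb(clients, pool_size=None):
    """RAM spent on PostgreSQL backends, with or without a pooler.

    Without pooling, every client gets its own backend process.
    With PgBouncer in transaction mode, clients share pool_size
    real connections, so backends never exceed the pool size.
    """
    backends = clients if pool_size is None else min(clients, pool_size)
    return backends * MB_PER_CONNECTION

print(connection_ram_mb(500))                # 5000  -> 5GB, no pooling
print(connection_ram_mb(500, pool_size=20))  # 200   -> 0.2GB with PgBouncer
```

The saving comes from idle time: at any instant most client connections are between queries, so a small pool of busy backends serves them all.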
When you drive a car, you have a dashboard showing speed, fuel level, engine temperature, and warning lights. You don’t need to be a mechanic to notice when the temperature gauge is in the red zone. PostgreSQL has its own dashboard — system views that tell you what’s happening right now, what’s slow, and what needs attention.
```sql
-- See all active connections and what they're doing
SELECT
    pid,
    usename AS user,
    datname AS database,
    state,
    NOW() - query_start AS duration,
    wait_event_type,
    LEFT(query, 60) AS query_preview
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY query_start;
```
```sql
-- Queries running for more than 5 minutes
SELECT
    pid,
    usename,
    NOW() - query_start AS duration,
    state,
    query
FROM pg_stat_activity
WHERE state = 'active'
  AND query_start < NOW() - INTERVAL '5 minutes'
ORDER BY query_start;

-- Kill a specific runaway query (use with caution!)
SELECT pg_cancel_backend(12345);     -- graceful: cancel the query
SELECT pg_terminate_backend(12345);  -- forceful: terminate the connection
```
```sql
-- Table health overview: reads, writes, dead rows, vacuums
SELECT
    relname AS table_name,
    seq_scan AS full_scans,
    idx_scan AS index_scans,
    n_tup_ins AS inserts,
    n_tup_upd AS updates,
    n_tup_del AS deletes,
    n_dead_tup AS dead_rows,
    last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
```
If a table shows far more sequential scans (full table scans) than index scans, it means queries are reading the entire table instead of using indexes. This is the #1 indicator that you need to add an index. For example, if orders has 500,000 seq_scans and only 1,000 idx_scans, something is very wrong.
```sql
-- Find blocked queries (waiting for locks)
SELECT
    blocked.pid AS blocked_pid,
    blocked.usename AS blocked_user,
    LEFT(blocked.query, 50) AS blocked_query,
    blocking.pid AS blocking_pid,
    blocking.usename AS blocking_user,
    LEFT(blocking.query, 50) AS blocking_query
FROM pg_stat_activity blocked
JOIN pg_locks bl ON bl.pid = blocked.pid
JOIN pg_locks lk ON lk.locktype = bl.locktype
    AND lk.database = bl.database
    AND lk.relation = bl.relation
    AND lk.pid != bl.pid
JOIN pg_stat_activity blocking ON blocking.pid = lk.pid
WHERE bl.granted = FALSE;
```
```sql
-- Total database size
SELECT pg_size_pretty(pg_database_size('mystore')) AS db_size;
-- Result: 2.4 GB

-- Size of each table (including indexes and toast)
SELECT
    relname AS table_name,
    pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
    pg_size_pretty(pg_relation_size(relid)) AS table_size,
    pg_size_pretty(pg_indexes_size(relid)) AS index_size
FROM pg_stat_user_tables
ORDER BY pg_total_relation_size(relid) DESC;

-- Unused indexes (candidates for removal)
SELECT
    indexrelname AS index_name,
    relname AS table_name,
    idx_scan AS times_used,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
```
Set up alerts for: (1) Connections approaching max_connections, (2) Queries running longer than 5 minutes, (3) Replication lag exceeding 1 minute, (4) Disk usage above 80%, (5) Dead rows exceeding 20% on any table. Tools like pgAdmin, Datadog, Grafana + Prometheus, and pg_stat_monitor make this easy.
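A monitoring agent largely boils down to comparing a metrics snapshot against thresholds. A sketch of the five alerts above (the metric names here are made up for illustration — in practice they would come from pg_stat_activity, pg_stat_replication, pg_stat_user_tables, and OS disk stats):

```python
def check_alerts(metrics):
    """Evaluate the alert thresholds listed above against one snapshot."""
    alerts = []
    if metrics["connections"] > 0.9 * metrics["max_connections"]:
        alerts.append("connections approaching max_connections")
    if metrics["longest_query_sec"] > 300:
        alerts.append("query running longer than 5 minutes")
    if metrics["replication_lag_sec"] > 60:
        alerts.append("replication lag exceeding 1 minute")
    if metrics["disk_used_pct"] > 80:
        alerts.append("disk usage above 80%")
    if metrics["max_dead_row_pct"] > 20:
        alerts.append("dead rows exceeding 20% on a table")
    return alerts

snapshot = {
    "connections": 190, "max_connections": 200,
    "longest_query_sec": 12, "replication_lag_sec": 95,
    "disk_used_pct": 55, "max_dead_row_pct": 8,
}
print(check_alerts(snapshot))
# ['connections approaching max_connections', 'replication lag exceeding 1 minute']
```

A real setup would run a loop like this on a schedule and push the firing alerts to Slack, PagerDuty, or Grafana rather than printing them.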
Write a query using pg_stat_user_tables to find tables with more than 10% dead rows. Show table name, live rows, dead rows, and dead percentage.
```sql
SELECT
    relname AS table_name,
    n_live_tup AS live_rows,
    n_dead_tup AS dead_rows,
    ROUND(
        n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0) * 100, 2
    ) AS dead_pct
FROM pg_stat_user_tables
WHERE n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0) > 0.1
ORDER BY dead_pct DESC;
```
Write the pg_dump command to back up a database called shopdb in custom format to /backups/shopdb.dump. Then write the command to restore it to a new database called shopdb_restored.
```bash
# Backup
pg_dump -U postgres -d shopdb -Fc -f /backups/shopdb.dump

# Create the target database first
createdb -U postgres shopdb_restored

# Restore
pg_restore -U postgres -d shopdb_restored /backups/shopdb.dump
```
Create a role called analyst that can only SELECT from all tables in the public schema. Create another role called api_service that can SELECT, INSERT, and UPDATE (but NOT DELETE).
```sql
-- Analyst role (read-only)
CREATE ROLE analyst WITH LOGIN PASSWORD 'analyst_pass';
GRANT CONNECT ON DATABASE mystore TO analyst;
GRANT USAGE ON SCHEMA public TO analyst;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO analyst;

-- API service role (read-write, no delete)
CREATE ROLE api_service WITH LOGIN PASSWORD 'api_pass';
GRANT CONNECT ON DATABASE mystore TO api_service;
GRANT USAGE ON SCHEMA public TO api_service;
GRANT SELECT, INSERT, UPDATE ON ALL TABLES IN SCHEMA public TO api_service;
GRANT USAGE ON ALL SEQUENCES IN SCHEMA public TO api_service;
```
Write a query using pg_stat_statements to find the top 5 queries by average execution time that have been called more than 100 times.
```sql
SELECT
    calls,
    ROUND(mean_exec_time::numeric, 2) AS avg_ms,
    ROUND(total_exec_time::numeric, 2) AS total_ms,
    LEFT(query, 100) AS query_preview
FROM pg_stat_statements
WHERE calls > 100
ORDER BY mean_exec_time DESC
LIMIT 5;
```
Congratulations! You’ve gone from “What’s a database?” to understanding production-grade PostgreSQL. Here’s everything you’ve learned across all 13 modules:
| Module | What You Learned |
|---|---|
| 1. Meet PostgreSQL | What databases are, why PostgreSQL, who uses it |
| 2. Installation | Setting up PostgreSQL and DBeaver on any platform |
| 3. SQL Basics | CREATE, INSERT, SELECT, UPDATE, DELETE — the CRUD operations |
| 4. Data Types | INTEGER, VARCHAR, TIMESTAMP, BOOLEAN, constraints (PK, FK, UNIQUE, CHECK) |
| 5. Querying | WHERE, ORDER BY, LIMIT, LIKE, BETWEEN, IN, NULL handling |
| 6. Aggregations | COUNT, SUM, AVG, GROUP BY, HAVING |
| 7. JOINs | INNER, LEFT, RIGHT, FULL OUTER, CROSS, Self JOIN |
| 8. Subqueries & CTEs | Nested queries, WITH clause, recursive CTEs |
| 9. Indexes | B-tree, GIN, GiST, EXPLAIN ANALYZE, query optimization |
| 10. Functions | Built-in functions, CREATE FUNCTION, stored procedures, triggers |
| 11. Transactions | ACID, BEGIN/COMMIT/ROLLBACK, isolation levels, MVCC, deadlocks |
| 12. JSON & Arrays | JSONB operators, GIN indexes, arrays, hybrid data modeling |
| 13. Production | VACUUM, WAL, backup, replication, security, tuning, monitoring |
Practice: Build a real project — an e-commerce database, a blog platform, or a task management system. Nothing beats hands-on experience.
Advanced topics to explore: Window functions, partitioning, full-text search, PostGIS (geospatial), foreign data wrappers, pgvector (AI embeddings), and database design patterns.
Certifications: Consider the EDB PostgreSQL certification to validate your skills professionally.
Community: Join the PostgreSQL mailing lists, attend PGConf events, and contribute to the ecosystem. The PostgreSQL community is one of the most welcoming in all of tech.
Databases are the backbone of every application you use — from Instagram to banking to healthcare. By mastering PostgreSQL, you’ve learned a skill that will serve you throughout your entire career, no matter which programming language, framework, or cloud provider you work with. PostgreSQL isn’t going anywhere — it’s been around since 1986 and it’s only getting better. You’ve made an excellent investment in yourself. Now go build something amazing! 🚀