The bank transfer problem — and how PostgreSQL keeps your data safe when 1000 users hit the database at once.
Imagine you want to send $500 from your savings account to your friend’s account. The bank needs to do two things:

1. Subtract $500 from your savings account.
2. Add $500 to your friend’s account.
Now here’s the scary question: what if the power goes out after step 1 but before step 2? Your money was taken out but never arrived! $500 just vanished into thin air! 😱
A transaction prevents this nightmare. It bundles both steps into a single "all-or-nothing" package. Either both steps succeed, or neither happens. If the power goes out mid-transfer, PostgreSQL automatically reverses step 1. Your money is safe.
Wedding vows: "Do you take this person to be your spouse?" Both people must say "I do." If one says no, the wedding is off — nobody is married. Both must commit, or it’s rolled back.
Online shopping checkout: The system must (1) charge your credit card, (2) reduce stock, and (3) create the order. If the credit card is declined at step 1, it must NOT reduce stock or create an order. All three steps are one transaction.
```sql
-- DANGEROUS! Two separate statements with no transaction
UPDATE accounts SET balance = balance - 500 WHERE id = 1;

-- ⚡ POWER OUTAGE HERE ⚡

UPDATE accounts SET balance = balance + 500 WHERE id = 2;
-- This second statement never runs. $500 is LOST.
```
```sql
BEGIN;
UPDATE accounts SET balance = balance - 500 WHERE id = 1;
UPDATE accounts SET balance = balance + 500 WHERE id = 2;
COMMIT;
-- If ANYTHING fails between BEGIN and COMMIT, ALL changes are undone automatically.
```
Think of BEGIN as picking up a pencil instead of a pen. Everything you write (SQL statements) between BEGIN and COMMIT is written in pencil — it can be erased. When you say COMMIT, you trace over everything with a permanent marker — now it’s saved forever. If you say ROLLBACK, you grab an eraser and wipe everything clean, as if you never wrote anything.
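The pencil-and-pen analogy can be sketched as a toy in-memory model. This is plain Python invented purely for illustration — the `ToyDatabase` class is not a real driver and nothing like PostgreSQL’s actual internals:

```python
# A toy model of BEGIN / COMMIT / ROLLBACK on an in-memory dict.
# Not how PostgreSQL works internally -- just the pencil/pen analogy in code.

class ToyDatabase:
    def __init__(self):
        self.data = {}        # "pen": committed, permanent values
        self.pending = None   # "pencil": uncommitted changes, erasable

    def begin(self):
        self.pending = {}     # pick up the pencil

    def write(self, key, value):
        if self.pending is None:
            self.data[key] = value     # auto-commit mode: straight to pen
        else:
            self.pending[key] = value  # written in pencil

    def read(self, key):
        # Inside a transaction you see your own uncommitted writes
        if self.pending is not None and key in self.pending:
            return self.pending[key]
        return self.data.get(key)

    def commit(self):
        self.data.update(self.pending)  # trace over with permanent marker
        self.pending = None

    def rollback(self):
        self.pending = None             # erase everything


db = ToyDatabase()
db.write("balance:1", 1000)   # no BEGIN, so auto-committed
db.begin()
db.write("balance:1", 500)
db.rollback()                 # erased -- the change never happened
print(db.read("balance:1"))   # 1000
```

Notice how the auto-commit branch in `write` mirrors PostgreSQL’s default behavior: without an explicit `begin()`, every write is immediately permanent.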
```sql
-- Start a transaction
BEGIN;

-- Do some work
INSERT INTO orders (customer_id, amount) VALUES (42, 299.99);
UPDATE inventory SET stock = stock - 1 WHERE product_id = 101;

-- Everything looks good? Save it!
COMMIT;
```
```sql
BEGIN;
UPDATE employees SET salary = 0;
-- OH NO! I forgot the WHERE clause! Everyone's salary is now 0!
ROLLBACK;
-- Phew! Nothing happened. All salaries are back to normal.
```
By default, PostgreSQL runs in auto-commit mode. This means every individual SQL statement is automatically wrapped in its own invisible BEGIN...COMMIT. So when you type UPDATE employees SET salary = 0; without an explicit BEGIN, it's immediately committed — no take-backs! Always use explicit transactions for dangerous operations.
In a video game, you hit save points along the way. If you die at the boss fight, you don’t restart the entire game — you go back to the last save point. SAVEPOINT works the same way: it creates a checkpoint within your transaction so you can roll back to that specific point without losing all your work.
```sql
BEGIN;

INSERT INTO orders (customer_id, amount) VALUES (1, 100);
-- ✔ Order created successfully

SAVEPOINT before_discount;

UPDATE orders SET amount = amount * 0.5 WHERE customer_id = 1;
-- Hmm, 50% discount is too much. Let's undo just THIS part.

ROLLBACK TO before_discount;
-- The order INSERT is still intact! Only the discount was undone.

UPDATE orders SET amount = amount * 0.9 WHERE customer_id = 1;
-- 10% discount instead. Much better.

COMMIT;
-- Both the order and the 10% discount are now permanent.
```
SAVEPOINTs are incredibly useful in application code. If a batch of 100 operations mostly succeeds but one fails, you can roll back just the failed one and continue processing the rest — instead of throwing away all 99 successful operations.
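The keep-the-good-rows idea can be sketched in application terms. This is a toy Python model, not database code — the `validate` rule stands in for a hypothetical CHECK constraint, and `applied` stands in for work done inside one open transaction:

```python
# Toy sketch of the batch-with-savepoints pattern: keep every row that
# succeeds, discard only the one that fails. Pure Python illustration.

def validate(order):
    # Hypothetical rule, like a CHECK (amount > 0) constraint
    if order["amount"] <= 0:
        raise ValueError(f"bad amount: {order['amount']}")

def process_batch(orders):
    applied = []                      # work done inside the transaction
    for order in orders:
        savepoint = len(applied)      # SAVEPOINT: remember where we were
        try:
            applied.append(order)     # the INSERT
            validate(order)           # may fail, like a constraint violation
        except ValueError:
            del applied[savepoint:]   # ROLLBACK TO SAVEPOINT: undo this order only
    return applied                    # COMMIT: everything that survived

batch = [
    {"id": 1, "amount": 100},
    {"id": 2, "amount": -50},   # this one fails validation
    {"id": 3, "amount": 200},
]
kept = process_batch(batch)
print([o["id"] for o in kept])  # [1, 3]
```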
ACID is like the safety standards for a database, just like how a car has seatbelts, airbags, ABS brakes, and a roll cage. Each letter stands for a different type of protection for your data. Without ACID, using a database would be like driving a car with no brakes on a highway — terrifying and guaranteed to end badly.
"Atom" comes from the Greek word meaning "indivisible." An atomic transaction is indivisible — it either happens completely or not at all. There’s no in-between. Like a light switch: it’s either ON or OFF, never stuck halfway.
Bank transfer example: Deducting $500 from account A and adding $500 to account B is ONE atomic operation. If the deduction succeeds but the addition fails, the deduction is automatically reversed.
A transaction takes the database from one valid state to another valid state. If your database has a rule saying "account balance can never be negative," then no transaction can ever leave an account with -$100 — even if the code tries. The transaction will be rejected.
Constraints like NOT NULL, UNIQUE, CHECK, and FOREIGN KEY are the rulebook. Consistency means these rules are enforced before, during, and after every transaction.
Imagine 30 students taking an exam at the same time. Each student works independently — they can’t see each other’s answer sheets. Even though everyone is working simultaneously, each student’s experience is as if they’re the only one in the room. That’s isolation.
Database version: If User A is in the middle of a bank transfer (deducted from account A but hasn’t yet added to account B), User B querying the database should NOT see the intermediate state where $500 is missing from both accounts.
Once you COMMIT a transaction, it’s like getting a tattoo — it’s permanent. Even if the power goes out one millisecond after you press COMMIT, when the server comes back up, your data is still there. PostgreSQL writes committed data to the WAL (Write-Ahead Log) on disk before confirming the commit, so it survives crashes.
Imagine if Amazon’s database was not ACID-compliant: you could be charged for an order that was never created (no Atomicity), your account balance could go negative (no Consistency), you could see someone else’s half-processed order (no Isolation), or a confirmed order could vanish after a server restart (no Durability). ACID is what makes databases trustworthy.
Think of hotel room privacy. At the cheapest level, housekeeping barges in anytime (you see everyone’s mess). At the most expensive level, you have a private villa with soundproof walls (total isolation). PostgreSQL lets you choose how much "privacy" (isolation) your transaction gets. More privacy = safer but potentially slower.
| Level | Dirty Reads? | Non-Repeatable Reads? | Phantom Reads? |
|---|---|---|---|
| Read Uncommitted | Yes | Yes | Yes |
| Read Committed ★ | No | Yes | Yes |
| Repeatable Read | No | No | Yes* |
| Serializable | No | No | No |
★ = PostgreSQL default. *PostgreSQL’s Repeatable Read actually prevents phantom reads too (it’s stricter than the SQL standard requires).
Transaction A changes a salary to $100,000 but hasn’t committed yet. Transaction B reads $100,000. Then Transaction A does a ROLLBACK. Transaction B now has data that never actually existed. It’s like reading a draft of someone’s email before they decided not to send it.
PostgreSQL never allows dirty reads — even at Read Uncommitted level, PostgreSQL silently upgrades it to Read Committed.
Transaction B reads Alice’s salary: $85,000. Then Transaction A commits a raise to $92,000. Transaction B reads Alice’s salary again: $92,000. The same query returned different results within the same transaction! It’s like checking the price of a flight, going to the bathroom, coming back, and the price changed.
Transaction B counts employees in Engineering: 4 people. Then Transaction A inserts a new engineer and commits. Transaction B counts again: 5 people. A "phantom" row appeared! It’s like counting people in a room, turning around, and suddenly there’s one more person.
```sql
-- Set for the current transaction only
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- ... your queries here ...
COMMIT;

-- Set for the entire session
SET default_transaction_isolation = 'serializable';
```
| Level | Best For | Trade-off |
|---|---|---|
| Read Committed | Most applications (default) | Good balance of safety and performance |
| Repeatable Read | Reports that need consistent data throughout | May get serialization errors on conflicts |
| Serializable | Financial transactions, inventory systems | Slowest — more serialization errors |
At Repeatable Read and Serializable levels, PostgreSQL may throw ERROR: could not serialize access if two transactions conflict. Your application must be ready to catch this error and retry the transaction. This is normal and expected — it’s how PostgreSQL maintains isolation without locking everything.
Imagine a magic notebook where every time someone changes a word, the old version doesn’t disappear — instead, a new version is written on a fresh page. Different people reading the notebook at different times see different versions, depending on when they started reading. Nobody has to wait for someone else to finish writing because they’re looking at their own snapshot of the notebook.
That’s Multi-Version Concurrency Control (MVCC). PostgreSQL keeps multiple versions of each row. Readers see a consistent snapshot of the data without blocking writers, and writers don’t block readers.
This is PostgreSQL’s superpower. In some other systems (for example, MySQL’s legacy MyISAM storage engine), a writer must lock the whole table, making all readers wait. PostgreSQL doesn’t have this problem.
Every row in PostgreSQL secretly has two hidden columns:
| Hidden Column | Meaning |
|---|---|
| xmin | The transaction ID that created (inserted) this row version |
| xmax | The transaction ID that deleted or updated this row version (0 if still alive) |
```sql
-- You can actually SEE these hidden columns!
SELECT xmin, xmax, id, name, salary FROM employees LIMIT 3;
```
| xmin | xmax | id | name | salary |
|---|---|---|---|---|
| 100 | 0 | 1 | Alice Johnson | 85000 |
| 100 | 0 | 2 | Bob Smith | 62000 |
| 105 | 0 | 3 | Carol Davis | 91000 |
PostgreSQL doesn’t overwrite the old row. Instead, it:

1. Sets xmax on the old row version to the current transaction ID.
2. Inserts a new physical row with the updated values and a new xmin.
Any transaction that started before the update continues to see the old row. New transactions see the updated row.
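The visibility rule above can be sketched in a few lines. This is a deliberately simplified model — real PostgreSQL visibility also consults commit status, snapshot contents, and hint bits, none of which appear here:

```python
# Simplified MVCC visibility check: a row version is visible to a snapshot
# if it was created at or before that snapshot's transaction ID and was not
# deleted by then. (Real PostgreSQL also checks whether the creating and
# deleting transactions actually committed.)

def visible(row, snapshot_xid):
    created_before = row["xmin"] <= snapshot_xid
    not_deleted = row["xmax"] == 0 or row["xmax"] > snapshot_xid
    return created_before and not_deleted

# An UPDATE as insert-a-new-version: the old version gets xmax stamped,
# the new version gets a fresh xmin.
versions = [
    {"xmin": 100, "xmax": 110, "salary": 85000},  # old version, updated by txn 110
    {"xmin": 110, "xmax": 0,   "salary": 92000},  # current version
]

old_reader = [v["salary"] for v in versions if visible(v, 105)]  # started before the update
new_reader = [v["salary"] for v in versions if visible(v, 115)]  # started after the update
print(old_reader, new_reader)  # [85000] [92000]
```

Each reader sees exactly one version of the row, with no locking anywhere: that is MVCC in miniature.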
Imagine a reporting query that takes 5 minutes to run. Without MVCC, every INSERT/UPDATE/DELETE during those 5 minutes would either (a) be blocked, killing your application’s performance, or (b) produce inconsistent results where some rows are from before the changes and some from after. With MVCC, the report sees a consistent snapshot from the moment it started, while other users continue writing freely.
Since PostgreSQL keeps old row versions around, they accumulate over time. VACUUM is the cleanup crew that removes expired row versions that no active transaction can see anymore.
```sql
-- Manual vacuum (rarely needed — autovacuum handles it)
VACUUM employees;

-- Vacuum with extra info
VACUUM VERBOSE employees;

-- VACUUM FULL reclaims disk space (locks the table — use sparingly!)
VACUUM FULL employees;
```
PostgreSQL runs autovacuum in the background by default. It monitors tables and vacuums them when dead tuple counts get high. In 99% of cases, you don’t need to run VACUUM manually. If you’re experiencing bloat, tune autovacuum settings — don’t disable it!
Even in the most freedom-loving household, the bathroom has a lock. When you’re using it, nobody else can come in — they have to wait. Database locks work the same way: sometimes you NEED exclusive access to a resource to prevent conflicts. But just like bathroom locks, you should hold them for the shortest time possible so others aren’t waiting forever.
```sql
-- SELECT FOR UPDATE: Lock specific rows for modification
BEGIN;

SELECT * FROM products WHERE id = 42 FOR UPDATE;
-- This row is now LOCKED. Nobody else can UPDATE or DELETE it
-- until we COMMIT or ROLLBACK.

-- Check stock and process the order safely
UPDATE products SET stock = stock - 1 WHERE id = 42;

COMMIT; -- Lock released!
```
| Lock Type | What It Does | When to Use |
|---|---|---|
| FOR UPDATE | Blocks other UPDATEs and DELETEs on the row | When you're going to modify the row |
| FOR NO KEY UPDATE | Like FOR UPDATE but allows foreign key checks | Most UPDATE scenarios |
| FOR SHARE | Blocks UPDATEs/DELETEs but allows other FOR SHARE | When you need to ensure the row doesn’t change while you read it |
| FOR KEY SHARE | Lightest lock — only blocks DELETE and key changes | Foreign key references |
```sql
-- Job queue pattern: grab the next available task, skip locked ones
BEGIN;

SELECT * FROM tasks
WHERE status = 'pending'
ORDER BY created_at
LIMIT 1
FOR UPDATE SKIP LOCKED;
-- If the first pending task is locked by another worker, skip it
-- and grab the next one. No waiting!

UPDATE tasks SET status = 'processing' WHERE id = 123;

COMMIT;
```
```sql
-- Explicitly lock an entire table (very rare)
BEGIN;
LOCK TABLE accounts IN EXCLUSIVE MODE;
-- Nobody else can write to accounts until we COMMIT
-- Use extremely sparingly!
COMMIT;
```
```sql
-- Advisory locks are custom, application-defined locks.
-- They don't lock any table or row — they lock a number/key.

-- Acquire an advisory lock (blocks until available)
SELECT pg_advisory_lock(12345);

-- Do critical work that only one process should do at a time

SELECT pg_advisory_unlock(12345);

-- Try to acquire without blocking (returns true/false)
SELECT pg_try_advisory_lock(12345);
-- Returns: true (got the lock) or false (someone else has it)
```
Imagine you have a daily report that takes 10 minutes to generate. Multiple app servers might try to run it at the same time. Use an advisory lock so only ONE server generates the report. Others check the lock, see it's taken, and skip. No wasted computation!
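The "only one worker generates the report" pattern can be mimicked in plain Python with a non-blocking lock acquire. This is a single-process stand-in for pg_try_advisory_lock, useful only to show the shape of the pattern — it does not coordinate across machines the way the real advisory lock does:

```python
import threading

# Toy stand-in for pg_try_advisory_lock(key): one lock object per key.
# Within one process, acquire(blocking=False) behaves like the try-lock:
# True means you got it, False means another worker holds it.
_locks = {}

def try_advisory_lock(key):
    lock = _locks.setdefault(key, threading.Lock())
    return lock.acquire(blocking=False)

def advisory_unlock(key):
    _locks[key].release()

def generate_report():
    if not try_advisory_lock(12345):
        # Another worker is already on it -- skip, don't wait
        return "skipped: another worker is on it"
    try:
        return "report generated"
    finally:
        advisory_unlock(12345)  # always release, even on failure

print(generate_report())  # report generated
```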
Imagine two people in a narrow hallway, walking toward each other. Person A says "I won’t move until you move." Person B says "I won’t move until YOU move." Neither moves. They’re stuck forever. That’s a deadlock.
In database terms: Transaction A holds a lock on Row 1 and waits for Row 2. Transaction B holds a lock on Row 2 and waits for Row 1. Each is waiting for the other to finish. Neither can proceed.
```sql
-- Transaction A (running in one session)
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;  -- Locks row 1
-- ... pause ...
UPDATE accounts SET balance = balance + 100 WHERE id = 2;  -- WAITS for row 2!

-- Transaction B (running in another session at the same time)
BEGIN;
UPDATE accounts SET balance = balance - 50 WHERE id = 2;   -- Locks row 2
-- ... pause ...
UPDATE accounts SET balance = balance + 50 WHERE id = 1;   -- WAITS for row 1!

-- DEADLOCK! A waits for B, B waits for A.
-- PostgreSQL detects this within ~1 second and kills one transaction:
-- ERROR: deadlock detected
```
PostgreSQL automatically detects deadlocks: once a transaction has waited on a lock longer than deadlock_timeout (1 second by default), it checks for a circular wait. When it finds one, it cancels one of the deadlocked transactions so the others can proceed. The cancelled transaction gets an error that your application should catch and retry.
The #1 rule: always lock resources in the same order. If every transaction that needs accounts 1 and 2 always locks account 1 first, then account 2, deadlocks are impossible. This is called consistent lock ordering.
```sql
-- SAFE: Always lock the lower ID first
CREATE PROCEDURE safe_transfer(
    from_id INTEGER,
    to_id INTEGER,
    amount NUMERIC
)
LANGUAGE plpgsql
AS $$
DECLARE
    first_id INTEGER := LEAST(from_id, to_id);
    second_id INTEGER := GREATEST(from_id, to_id);
BEGIN
    -- Lock both rows in a consistent order (lowest ID first)
    PERFORM * FROM accounts
    WHERE id IN (first_id, second_id)
    ORDER BY id
    FOR UPDATE;

    UPDATE accounts SET balance = balance - amount WHERE id = from_id;
    UPDATE accounts SET balance = balance + amount WHERE id = to_id;
END;
$$;
```
| Tip | Why It Helps |
|---|---|
| Keep transactions short | Less time holding locks = less chance of conflict |
| Lock in consistent order | Prevents circular wait (the root cause of deadlocks) |
| Use NOWAIT | Fail immediately instead of waiting — SELECT ... FOR UPDATE NOWAIT |
| Avoid locking in user-facing paths | Don't hold locks while waiting for user input |
| Use retry logic in your app | If a deadlock kills your transaction, just try again |
Now that you understand the theory, let’s look at patterns that professional developers use every day. These are like cooking recipes that experienced chefs swear by — they’ve been tested in real production systems handling millions of transactions.
"Insert this row. If it already exists, update it instead." This is incredibly common.
```sql
-- Insert a new user, or update their email if they already exist
INSERT INTO users (username, email, last_login)
VALUES ('john_doe', 'john@example.com', NOW())
ON CONFLICT (username)
DO UPDATE SET
    email = EXCLUDED.email,
    last_login = EXCLUDED.last_login;
```
EXCLUDED refers to the row that was proposed for insertion but couldn’t be inserted because of a conflict. So EXCLUDED.email is the email from the VALUES clause. It’s like saying: "Use the new values I was trying to insert."
An operation is idempotent if running it multiple times produces the same result as running it once. This is critical when retrying failed transactions.
```sql
-- BAD: Not idempotent — each run adds another $100
UPDATE accounts SET balance = balance + 100 WHERE id = 1;

-- GOOD: Idempotent — uses a unique transaction reference
INSERT INTO transactions (id, account_id, amount)
VALUES ('txn-abc-123', 1, 100)
ON CONFLICT (id) DO NOTHING;
-- If this transaction was already processed, it's silently skipped
```
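The same idea in application code, as a toy sketch: a `processed` set stands in for the transactions table’s unique key, so replaying the same payment ID is a no-op instead of a double charge. All the names here are invented for illustration:

```python
# Idempotent processing sketch: remember which transaction IDs were applied,
# so retrying 'txn-abc-123' does nothing instead of charging twice.
processed = set()
balances = {"acct-1": 0}

def apply_payment(txn_id, account, amount):
    if txn_id in processed:       # ON CONFLICT (id) DO NOTHING, in spirit
        return False              # already applied -- silently skip
    processed.add(txn_id)
    balances[account] += amount
    return True

apply_payment("txn-abc-123", "acct-1", 100)
apply_payment("txn-abc-123", "acct-1", 100)  # retry: skipped
print(balances["acct-1"])  # 100
```

In a real system the set and the balance update would live in the same database transaction, so the "already seen" check and the write commit atomically.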
```sql
-- Application-side retry pattern (Python-like pseudocode)
-- max_retries = 3
-- for attempt in range(max_retries):
--     try:
--         BEGIN
--         ... do work ...
--         COMMIT
--         break  # Success! Exit the loop
--     except SerializationError:
--         ROLLBACK
--         sleep(random(0.01, 0.1))  # Brief random delay
--         continue  # Try again
```
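Here is that retry loop made concrete in runnable Python. `SerializationError`, `do_work`, and the backoff numbers are all assumptions for the sketch; in real code you would catch your driver’s serialization-failure exception (SQLSTATE 40001 in PostgreSQL) around actual BEGIN/COMMIT calls:

```python
import random
import time

class SerializationError(Exception):
    """Stand-in for the driver's 'could not serialize access' error."""

attempts = {"n": 0}

def do_work():
    # Hypothetical workload: fails once with a serialization conflict,
    # then succeeds on the retry -- a common real-world pattern.
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise SerializationError()
    return "committed"

def run_with_retry(max_retries=3):
    for attempt in range(max_retries):
        try:
            return do_work()  # BEGIN ... work ... COMMIT
        except SerializationError:
            # ROLLBACK happened; brief random delay reduces the chance
            # that two retrying transactions collide again immediately
            time.sleep(random.uniform(0.01, 0.1))
    raise RuntimeError("gave up after retries")

print(run_with_retry())  # committed
```

The random jitter before retrying matters: if both conflicting transactions retry at the same instant, they can keep colliding forever.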
```sql
-- Check stock AND decrement in one atomic query
UPDATE products
SET stock = stock - 1
WHERE id = 42 AND stock > 0
RETURNING id, name, stock;
-- If stock was already 0, no rows are updated
-- RETURNING tells you what happened — empty = out of stock
```
RETURNING is a PostgreSQL extension that returns the affected rows after an INSERT, UPDATE, or DELETE. It’s like doing the operation and a SELECT in one statement. Extremely useful to confirm what changed without a second query.
```sql
BEGIN;

-- Process each order in a batch, skipping failures
SAVEPOINT order_1;
INSERT INTO orders VALUES (1, 100, NOW());   -- success!

SAVEPOINT order_2;
INSERT INTO orders VALUES (2, -50, NOW());   -- fails! (maybe a CHECK constraint on positive amounts)
ROLLBACK TO order_2;

SAVEPOINT order_3;
INSERT INTO orders VALUES (3, 200, NOW());   -- success!

COMMIT;
-- Orders 1 and 3 are saved. Order 2 was rolled back.
```
Write a transaction that transfers $250 from account 1 to account 3. Make sure both updates are in a single transaction. Add a SAVEPOINT before the second update.
```sql
BEGIN;
UPDATE accounts SET balance = balance - 250 WHERE id = 1;
SAVEPOINT before_credit;
UPDATE accounts SET balance = balance + 250 WHERE id = 3;
COMMIT;
```
Write an INSERT that adds a product with id=1, name='Laptop', price=999. If id 1 already exists, update only the price to 999.
```sql
INSERT INTO products (id, name, price)
VALUES (1, 'Laptop', 999)
ON CONFLICT (id)
DO UPDATE SET price = EXCLUDED.price;
```
Write a query that decreases the stock of product 42 by 1, but ONLY if stock is greater than 0. Use RETURNING to see the new stock.
```sql
UPDATE products
SET stock = stock - 1
WHERE id = 42 AND stock > 0
RETURNING id, name, stock;
```
Write the SQL to (a) check your current isolation level, and (b) start a transaction at SERIALIZABLE level.
```sql
-- (a) Check current level
SHOW transaction_isolation;

-- (b) Start a SERIALIZABLE transaction
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- ... your queries ...
COMMIT;
```
Quick review questions:

- What happens if you run UPDATE employees SET salary = 0; inside a BEGIN block and then run ROLLBACK?
- What does ON CONFLICT DO UPDATE do?

You now understand how PostgreSQL keeps data safe under the most challenging conditions. Here’s a recap:
| Topic | Key Takeaway |
|---|---|
| Transactions | BEGIN...COMMIT bundles operations into an all-or-nothing unit |
| ROLLBACK & SAVEPOINT | Undo everything or roll back to a specific checkpoint |
| ACID | Atomicity + Consistency + Isolation + Durability = trustworthy data |
| Isolation Levels | Read Committed (default), Repeatable Read, Serializable — trade-off between safety and speed |
| MVCC | PostgreSQL’s multi-version system: readers and writers never block each other |
| Locks | FOR UPDATE, SKIP LOCKED, advisory locks — explicit control when needed |
| Deadlocks | Circular waits; prevent with consistent lock ordering |
| UPSERT & Patterns | ON CONFLICT, idempotent ops, retry logic, safe stock decrement |
PostgreSQL is unique among relational databases because it has first-class support for JSON and array data types. In the next module, you’ll learn how to store, query, and manipulate JSON documents right inside PostgreSQL — getting the best of both relational AND document database worlds. Think of it as PostgreSQL’s answer to MongoDB, but with ACID transactions and SQL power!