Amazon Redshift Interview | Data For Dummies


What is Amazon Redshift and how does it differ from RDS? (Difficulty: Normal)

Redshift is AWS’s managed, petabyte-scale data warehouse for analytics. It uses columnar storage and Massively Parallel Processing (MPP): a leader node parses and plans queries, and compute nodes execute them in parallel. It is optimized for complex analytical (OLAP) queries over large datasets. RDS is a managed relational database service for transactional (OLTP) workloads: row-oriented storage, ACID transactions, and typically a single primary instance with optional read replicas. Use Redshift for BI, reporting, and analytics that can extend to data-lake data in S3 through Redshift Spectrum; use RDS for application backends and operational databases.
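To make the columnar-versus-row distinction concrete, here is a minimal Python sketch. It is a toy model only, not how Redshift or RDS store data internally; the `rows` sample data and the `to_columns` helper are invented for illustration. It shows why an aggregate over one column touches far fewer values in a column-oriented layout.

```python
# Toy model: hypothetical sample data, not real Redshift internals.
rows = [
    {"order_id": 1, "customer": "a", "amount": 10.0},
    {"order_id": 2, "customer": "b", "amount": 25.5},
    {"order_id": 3, "customer": "a", "amount": 4.5},
]


def to_columns(row_list):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in row_list:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row-oriented layout: computing SUM(amount) reads whole rows,
# so every field of every row is touched.
values_read_row_store = sum(len(row) for row in rows)
total_row = sum(row["amount"] for row in rows)

# Column-oriented layout: only the "amount" column is read.
values_read_column_store = len(columns["amount"])
total_col = sum(columns["amount"])

print(total_row, values_read_row_store)     # 40.0 9
print(total_col, values_read_column_store)  # 40.0 3
```

Both layouts return the same total, but the columnar version reads a third of the values here. The gap grows with the number of columns in the table, which is why wide analytical tables favor columnar storage.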

What is a distribution key (DISTKEY) and why does it matter? (Difficulty: Normal)

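The distribution key (DISTKEY) is the column whose hashed value determines which slice, and therefore which node, stores each row. Rows with the same key value are co-located on the same slice. If both tables in a JOIN are distributed on the join column, matching rows already sit together, so the join runs locally without redistributing data across the network (a collocated join). DISTKEY applies to the KEY distribution style. The full set of styles is KEY (hash on one column, typical for large fact tables), EVEN (round-robin), ALL (a full copy on every node, suited to small dimension tables), and AUTO (Redshift chooses the style). A poorly chosen DISTKEY causes data movement at query time or skew, where a few slices hold most of the rows, so the choice is critical for query performance.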
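The effect of distribution style on a join can be sketched with a small simulation. This is an illustrative model under stated assumptions, not Redshift's implementation: `NUM_SLICES`, the CRC32 stand-in hash, the sample `orders` and `customers` tables, and all helper names are hypothetical, and each slice is treated as its own node for simplicity.

```python
import zlib
from collections import defaultdict

NUM_SLICES = 4  # hypothetical cluster size


def slice_for_key(value, num_slices=NUM_SLICES):
    """Stand-in hash; Redshift's real hash function is internal."""
    return zlib.crc32(str(value).encode("utf-8")) % num_slices


def distribute_by_key(rows, key):
    """KEY style: rows with the same key value land on the same slice."""
    slices = defaultdict(list)
    for row in rows:
        slices[slice_for_key(row[key])].append(row)
    return slices


def distribute_even(rows, num_slices=NUM_SLICES):
    """EVEN style: round-robin, ignoring column values."""
    slices = defaultdict(list)
    for i, row in enumerate(rows):
        slices[i % num_slices].append(row)
    return slices


def distribute_all(rows, num_slices=NUM_SLICES):
    """ALL style: a full copy everywhere (one slice per node in this toy)."""
    return {s: list(rows) for s in range(num_slices)}


def rows_needing_movement(left_slices, right_slices, key):
    """Count left rows whose join partner is not on the same slice."""
    moved = 0
    for slice_id, left_rows in left_slices.items():
        local_keys = {r[key] for r in right_slices.get(slice_id, [])}
        moved += sum(1 for r in left_rows if r[key] not in local_keys)
    return moved


orders = [{"order_id": i, "customer_id": i % 5} for i in range(20)]
customers = [{"customer_id": c, "name": f"customer-{c}"} for c in range(5)]

customers_key = distribute_by_key(customers, "customer_id")

# Both tables KEY-distributed on the join column: collocated join.
moved_key = rows_needing_movement(
    distribute_by_key(orders, "customer_id"), customers_key, "customer_id")

# Fact table EVEN, dimension KEY: most rows must be redistributed.
moved_even = rows_needing_movement(
    distribute_even(orders), customers_key, "customer_id")

# Fact table EVEN, dimension ALL: every slice already has all customers.
moved_all = rows_needing_movement(
    distribute_even(orders), distribute_all(customers), "customer_id")

print(moved_key, moved_even, moved_all)  # 0 15 0
```

With this sample data, the KEY/KEY and EVEN/ALL combinations need no data movement, while EVEN against a KEY-distributed dimension forces 15 of the 20 order rows to find their join partner on another slice. That is the cost a well-chosen DISTKEY avoids.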

Architecture & design