dbt (Data Build Tool) lets you write SQL that turns raw data into clean, tested tables. It runs inside your warehouse, tracks lineage, and generates documentation automatically. Think of it as a magical robot chef that follows your SQL recipes, checks quality, and keeps a perfect cookbook.
What is dbt, the restaurant kitchen analogy, ETL vs ELT, the modern data stack, and why you should learn dbt.
BEGINNER: From Drew Banin's frustration to dbt Labs' $4.2B valuation. The full story of how dbt changed data engineering.
Install dbt Core or Cloud, connect to your warehouse, configure profiles.yml, and run your first dbt command.
BEGINNER: Every file and folder explained (models/, macros/, seeds/, snapshots/, tests/) and the layered architecture.
Views, tables, incremental, ephemeral, and snapshots. The decision tree for choosing materializations. The 3-layer architecture.
CORE: source() and ref() explained. The DAG (dependency graph). Building your first lineage. Cross-project refs.
Schema tests (unique, not_null, relationships), custom SQL tests, source freshness, and dbt_expectations.
INTERMEDIATE: schema.yml, doc blocks, dbt docs generate, the lineage graph, column-level lineage, and hosting docs.
Jinja syntax, variables, filters, if/else, for loops, writing macros, and built-in dbt macros.
INTERMEDIATE: dbt_utils, dbt_expectations, installing packages, CSV seeds, and when to use seeds vs sources.
dbt Cloud jobs, GitHub Actions, Airflow integration, environment variables, monitoring, and blue-green deployments.
ADVANCED: Hooks, exposures, dbt Mesh, Semantic Layer, performance tuning, enterprise patterns, and career advice.
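dbt init my_project  # Create a new project skeleton
dbt debug  # Check the project setup and the warehouse connection
dbt deps  # Install the packages listed in packages.yml
dbt seed  # Load CSV seed files into the warehouse
dbt run  # Build all models
dbt run --select model_name  # Build one model
dbt run --select +model_name  # Build the model and everything upstream of it
dbt run --select model_name+  # Build the model and everything downstream of it
dbt run --full-refresh  # Rebuild incremental models from scratch
dbt test  # Run all tests
dbt test --select model_name  # Run the tests attached to one model
dbt build  # Run seeds, models, snapshots, and tests in DAG order
dbt snapshot  # Run snapshots
dbt docs generate  # Generate the documentation site
dbt docs serve  # Serve the documentation site locally
dbt source freshness  # Check how fresh the source data is
dbt compile  # Compile models to plain SQL without running them
dbt clean  # Delete the folders listed in clean-targets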
| Type | Stored? | Best For | Rebuild |
|---|---|---|---|
| View | No (query only) | Staging models, light transforms | Every query |
| Table | Yes (full copy) | Mart models, heavy aggregations | Full rebuild each run |
| Incremental | Yes (append/merge) | Large fact tables, event data | Only new/changed rows |
| Ephemeral | No (CTE only) | Intermediate calculations | Inlined into parent |
| Snapshot | Yes (historical) | Slowly changing dimensions | Tracks changes over time |
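As a rough illustration of the Incremental row above, here is a minimal sketch of an incremental model. The file path, the `stg_events` model, and the column names are hypothetical; `config()`, `ref()`, `is_incremental()`, and `{{ this }}` are standard dbt Jinja functions.

```sql
-- models/marts/fct_events.sql (hypothetical path and model name)
{{ config(materialized='incremental', unique_key='event_id') }}

select
    event_id,
    user_id,
    event_type,
    occurred_at
from {{ ref('stg_events') }}  -- hypothetical upstream staging model

{% if is_incremental() %}
  -- Applied only on incremental runs: the target table already exists
  -- and --full-refresh was not passed. {{ this }} is the target table.
  where occurred_at > (
      select coalesce(max(occurred_at), '1900-01-01') from {{ this }}
  )
{% endif %}
```

On the first run, or with `dbt run --full-refresh`, the filter is skipped and the whole table is built. With `unique_key` set, later runs update rows whose `event_id` already exists instead of duplicating them; the exact merge behavior depends on the warehouse adapter.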
| Layer | Prefix | Purpose | Materialization |
|---|---|---|---|
| Staging | stg_ | Clean raw data (rename, cast, trim) | View |
| Intermediate | int_ | Business logic, joins, aggregations | View or Ephemeral |
| Marts | fct_ / dim_ | Final tables for dashboards & analysts | Table or Incremental |
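To make the Staging row concrete, a staging model might look something like the sketch below. The source name `raw`, the table `customers`, and the column names are assumptions for illustration, and the source would need to be declared in a YAML file before `source()` can resolve it.

```sql
-- models/staging/stg_customers.sql (hypothetical path and model name)
{{ config(materialized='view') }}

with source as (

    -- 'raw' and 'customers' are assumed source and table names
    select * from {{ source('raw', 'customers') }}

),

renamed as (

    select
        id as customer_id,                         -- rename
        trim(full_name) as customer_name,          -- trim
        cast(created_at as timestamp) as created_at  -- cast
    from source

)

select * from renamed
```

Keeping staging models as views means they add no storage and always reflect the current raw data, while the heavier joins and aggregations are left to the intermediate and mart layers.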
| Test | What It Checks | Example Use |
|---|---|---|
| unique | No duplicate values | Primary keys |
| not_null | No NULL values | Required fields |
| accepted_values | Only allowed values | Status columns |
| relationships | Foreign key exists | order.customer_id → customer.id |
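The same checks can also be written by hand as custom SQL tests. The sketch below is roughly what a relationships check does, written as a singular test: a file in tests/ that passes when the query returns zero rows. The model names `orders` and `customers` and the `order_id` column are hypothetical.

```sql
-- tests/assert_orders_have_customers.sql (hypothetical file name)
-- Returns every order whose customer_id has no matching customer.
-- dbt treats any returned row as a test failure.
select
    o.order_id,
    o.customer_id
from {{ ref('orders') }} as o
left join {{ ref('customers') }} as c
    on o.customer_id = c.id
where o.customer_id is not null  -- NULL foreign keys are left to not_null
  and c.id is null
```

Running `dbt test --select assert_orders_have_customers` would execute just this test. For standard cases the built-in `relationships` test is usually the simpler choice; a singular test is more useful when the rule needs extra conditions.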
docs.getdbt.com: The complete dbt reference
getdbt.com/community: 50,000+ members
hub.getdbt.com: Browse community packages
courses.getdbt.com: Free official courses