Cloud Warehouse Weekly #6 Slowly Changing Dimensions: What They Are and Why They Still Matter
Understand SCD Types 1, 2, and 3 — and When to Use Each in Modern Cloud Warehousing
Hey friends 👋
Wherever you are in the world, thank you for being here.
This week, we’re exploring a foundational yet often misunderstood topic in data warehousing: slowly changing dimensions (SCDs).
If you’ve ever had to explain why a customer’s country changed in your dashboard or why last quarter’s revenue doesn’t match what you saw last month, this one’s for you.
Let’s break it down simply.
What Are Slowly Changing Dimensions?
A dimension is just a descriptive table. Think customer
, product
, region
. It gives meaning to your facts (like sales, revenue, or visits).
A slowly changing dimension changes over time but not rapidly. A customer might change their name, a product might be rebranded, or a salesperson might switch regions.
Now here’s the catch: do you want to preserve those changes or not? That’s what SCDs are all about.
If you always overwrite the latest value, you might break historical reports.
If you always preserve history, you might bloat your tables and confuse joins.
That’s why we have SCD strategies to help decide what to do when dimension data changes.
The Main SCD Types (The Ones You’ll Actually Use)
There are six SCD types out there, but in practice, you’ll mostly encounter these:
Type 1: Overwrite with No History
You update the record with the new value.
Old data is gone.
Great for: Fixing typos, correcting bad data, and things you never care to track historically.
Example: Updating "Jonh" to "John" in the customer_name
column.
Type 2: Add a New Record (Keep Full History)
You insert a new row for every change.
Add
effective_date
,expiry_date
, oris_current
flags.
Great for: Anything where historical analysis matters, sales attribution, compliance reports, customer churn.
Example: A customer moves from California to Texas. You want reports from last year to still show them in California.
Type 3: Keep Just the Previous Value
Add a column like
previous_region
orold_manager
.Limited to one historical version.
Great for: Rare edge cases where you only care about one-level change.
Example: HR data where you care about current and last job title, but nothing else.
Why It Matters (And Where It Bites You)
Data does not exist in a vacuum. Reports, models, and KPIs all depend on your dimensions being consistent or traceable.
Choose the wrong SCD strategy and:
Month-end dashboards won’t match
Data scientists train on stale or misaligned data
Executives lose trust in the warehouse
If you’ve ever been in a "data doesn’t match" meeting, chances are, an SCD problem was lurking.
How the Cloud Changed the Game
In the on-prem world, Type 2 was expensive. Lots of data, complex ETL, slow queries.
But cloud warehouses made it easier:
Storage is cheap
Compute is elastic
Time travel and zero-copy cloning make versioning safer
Tools like dbt made the Type 2 approachable. You can build dim_customer
models with versioning logic, use is_latest
flags, and control historical joins.
Suddenly, tracking changes doesn’t feel painful.
But here’s the catch: you still need to choose the right approach. The warehouse won’t do it for you.
A Simple Analogy
Think of a grocery store’s inventory system.
Type 1: You update the product price every time. No record of old prices.
Type 2: You keep every price change with effective dates.
Type 3: You store the current price and the last one. That’s it.
Different needs, different strategies.
TL;DR Cheat Sheet
Coming Next Week:
Data Modeling 101 - From Star Schema to ELT
✉️ Like these quick, no-BS breakdowns of data engineering concepts?
Subscribe to Cloud Warehouse Weekly, delivered fresh every Thursday. Easy to read. Hard to forget.
See you next week,
— Raja 🙌
Let me know how you handle slowly changing dimensions in your data warehouse.