How to - use incremental models to detect regressions in historic metrics #1615
graciegoheen
announced in
Archive
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
Background
Imagine that you have expected outputs for a historic metric (total revenue) as described below:
How can you test that these historical metrics do not change in dbt?
This is an alternative solution to the snapshot-based one proposed here.
Step 1
Let's say that you have a
fct_orders
table, which has one row for each order:First, you should create a model that sums the amount for each year excluding the current one (for simplicity, we're assuming you have no costs). Let's call this
fct_revenue_summary
:Step 2
Next, create an incremental model on top of
fct_revenue_summary
which captures the historical view of revenue outputs. Let's call thisfct_revenue_summary_history
:Step 3
Finally, create a test on
fct_revenue_summary_history
to check that each year has a single source of truth fortotal_revenue
:When you run a
dbt build
, you will get an error if you ever output a newtotal_revenue
value for a historic year that differs from the original. For example, if your originalfct_revenue_summary_history
looks like this:But then, you introduce a breaking change such that dbt now calculates the
total_revenue
for 2019 as 0.8 million,fct_revenue_summary_history
will now look like this:And the uniqueness test on the year column will fail.
This allows you to detect regressions in historic metrics!
Beta Was this translation helpful? Give feedback.
All reactions