-
Notifications
You must be signed in to change notification settings - Fork 535
Description
Describe the bug
Using equal_rowcount
between 2 different tables with the same groupby columns results in a false positive test
Steps to reproduce
- Create 2 different tables, with the same group by column. Let's call this
gb
.gb
can have different values between the 2 tables (but same type). For example: 1 table could be (select 'x' as gb union all (select 'y' as gb)) and the other could be (select 'z' as gb) - Apply
equal_rowcount
between the 2 table usinggb
as agroup_by_columns
- Run the dbt data test
Expected results
This test should fail and users should be alerted
Actual results
This test passes without any problem
Screenshots and log output
Result of dbt_internal_test
if using above example
Because we have coalesce functions in the test evaluation, all null diff_count
result as 0 failure, while it should be something else
select
sum(coalesce(diff_count, 0)) as failures,
sum(coalesce(diff_count, 0)) != 0 as should_warn,
sum(coalesce(diff_count, 0)) != 0 as should_error
from dbt_internal_test
System information
The contents of your packages.yml
file:
packages:
- package: dbt-labs/dbt_utils
version: 1.3.0
Which database are you using dbt with?
- postgres
- redshift
- bigquery
- snowflake
- other (specify: ____________)
The output of dbt --version
:
I'm using dbt cloud so I suppose it's versionless
Additional context
I think it's line 63 of equal_rowcount.sql. Instead of abs(count_a - count_b) as diff_count
it should be abs(coalesce(count_a, 0) - coalesce(count_b, 0))
If this is applied, diff_count
would no longer be null and coalesce() is no longer needed inside the sum() functions above. It could be moved to outside the sum() function like #974 which resolves #973
Are you interested in contributing the fix?
Yes I can contribute