Skip to content

Functional Dependency Test Macro #989

@edejong-dbc

Description

@edejong-dbc

Describe the feature

It would be valuable to have a functional_dependency generic test in dbt-utils that ensures one or more "determinant" columns uniquely determine another "dependent" column. The test should fail if any determinant combination maps to more than one distinct dependent value. An optional where clause would allow users to scope the test to a subset of records.

This is also especially helpful for hierarchical relationships. For instance:

  • If region → country, ensuring each region only has one corresponding country.
  • If (country, state) → city, ensuring each (country, state) pair only maps to one city.
  • Ensuring each (parent_id, child_id) pair in a hierarchy leads to exactly one “child name” or other dependent attributes.

Describe alternatives you've considered

  • Manual SQL tests: Users currently write custom SQL queries or macros to check for duplicate mappings, but this adds boilerplate and inconsistency across projects.
  • Existing uniqueness tests: While dbt-utils has unique_combination_of_columns, it doesn’t directly verify that those columns determine the value of a separate column.

Additional context

  • This feature is database-agnostic since it primarily involves standard SQL grouping and counting.
  • The test should align with existing dbt-utils generic test conventions, including adapter dispatch and lower-case SQL syntax.

Who will this benefit?

  • Data modelers needing to validate referential integrity, e.g. ensuring (order_id, product_id) always maps to a unique quantity.
  • Teams maintaining dimensional models who want to confirm a dimension’s natural key always leads to a single record.
  • Organizations checking hierarchical or parent-child relationships, where each parent or parent-child pair must map to only one child or child property.

Are you interested in contributing this feature?

Yes, we already have a working macro and example seeds demonstrating passing and failing scenarios. We’d be happy to raise a PR and contribute the code, along with documentation.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions