Skip to content

[Bug] MetricFlow fails to join dimensions requiring entity traversal when relationships are defined in multiple models #1824

@dstein-st

Description

@dstein-st

Is this a new bug in metricflow?

  • I believe this is a new bug in metricflow
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

When executing a dimension-only query (no metrics) that requires traversing to dimensions through different foreign entity relationships, and those relationships are defined in separate semantic models, MetricFlow throws an UnableToSatisfyQueryError during dataflow plan building.

The error occurs specifically when:

  1. Query has no metrics (dimension-only)
  2. Multiple dimensions require entity traversal
  3. The required entity relationships are defined in different semantic models

Example failing query:

Consider an entity job that appears as the primary entity in two separate semantic models:

  • In fact_job_assignments: job (primary) has a foreign key relationship to customer
  • In fact_job_campaigns: job (primary) has a foreign key relationship to campaign

When attempting to query dimensions from both relationships in a dimension-only query:

mf query --explain --metrics '' --group-by job,customer__customer_name,campaign__campaign_name

(Note: The alternative query with "fully qualified paths" mf query --explain --metrics '' --group-by job,job__customer__customer_name,job__campaign__campaign_name describes the intention more directly, but this also does not work and produces a different error message: "The given input does not match any of the available group-by-items for a distinct values query without metrics.")

Expected Behavior

MetricFlow should successfully generate SQL for dimension-only queries even when the required entity traversals are defined across multiple semantic models. The query planner should be able to combine the entity relationships from different models to construct the necessary joins.

Steps To Reproduce

  1. Create a star schema with the following semantic models:
semantic_models:
  - name: dim_customers
    description: Customer dimension semantic model
    model: ref('dim_customers')
    entities:
      - name: customer
        type: primary
        expr: customer_id
    dimensions:
      - name: customer_name
        type: categorical
      - name: customer_segment
        type: categorical
        
  - name: dim_campaigns
    description: Campaign dimension semantic model
    model: ref('dim_campaigns')
    entities:
      - name: campaign
        type: primary
        expr: campaign_id
    dimensions:
      - name: campaign_name
        type: categorical
      - name: campaign_type
        type: categorical
        
  - name: fact_job_assignments
    description: Job assignments fact table semantic model
    model: ref('fact_job_assignments')
    defaults:
      agg_time_dimension: assignment_date
    entities:
      - name: job
        type: primary
        expr: job_id
      - name: customer
        type: foreign
        expr: customer_id
    measures:
      - name: assignment_count
        agg: count
        expr: 1
    dimensions:
      - name: assignment_date
        type: time
        type_params:
          time_granularity: day
          
  - name: fact_job_campaigns
    description: Job campaigns fact table semantic model
    model: ref('fact_job_campaigns')
    defaults:
      agg_time_dimension: participation_date
    entities:
      - name: job
        type: primary
        expr: job_id
      - name: campaign
        type: foreign
        expr: campaign_id
    measures:
      - name: campaign_participation_count
        agg: sum
        expr: participation_count
    dimensions:
      - name: participation_date
        type: time
        type_params:
          time_granularity: day
  1. Run dimension-only query requiring traversals from both fact tables:
mf query --explain --metrics '' --group-by job,customer__customer_name,campaign__campaign_name
  1. Observe the error

What DOES work with the same schema:

# Works: Query with a metric
mf query --metrics total_assignments --group-by job,customer__customer_name,campaign__campaign_name

# Works: Dimension-only within single fact
mf query --metrics '' --group-by job,customer__customer_name

# Works: Different dimension-only within single fact  
mf query --metrics '' --group-by job,campaign__campaign_name

Only the dimension-only query requiring traversal across BOTH fact tables fails.

Relevant log output

2025-08-27 14:47:54,953 ERROR dataflow_plan_builder.py:1267 [MainThread] - No recipe could be constructed.
2025-08-27 14:47:54,954 ERROR utils.py:143 [MainThread] - Logging exception handled by the CLI exception handler
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/dbt_metricflow/cli/utils.py", line 140, in wrapper
    func(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/metricflow/telemetry/reporter.py", line 150, in wrapped
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/dbt_metricflow/cli/main.py", line 225, in query
    explain_result = cfg.mf.explain(mf_request=mf_request)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/metricflow/telemetry/reporter.py", line 150, in wrapped
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/metricflow/engine/metricflow_engine.py", line 549, in explain
    return self._create_execution_plan(mf_request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/metricflow/engine/metricflow_engine.py", line 522, in _create_execution_plan
    dataflow_plan = self._dataflow_plan_builder.build_plan_for_distinct_values(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/metricflow/dataflow/builder/dataflow_plan_builder.py", line 814, in build_plan_for_distinct_values
    return self._build_plan_for_distinct_values(query_spec, optimizations=optimizations)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/metricflow_semantics/mf_logging/runtime.py", line 37, in _inner
    result = wrapped_function(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/metricflow/dataflow/builder/dataflow_plan_builder.py", line 849, in _build_plan_for_distinct_values
    raise UnableToSatisfyQueryError(f"Unable to join all items in request: {required_linkable_specs}")
metricflow_semantics.errors.error_classes.UnableToSatisfyQueryError: Unable To Satisfy Query Error: Unable to join all items in request: LinkableSpecSet(dimension_specs=(DimensionSpec(element_name='customer_name', entity_links=(EntityReference(element_name='customer'),)), DimensionSpec(element_name='campaign_name', entity_links=(EntityReference(element_name='campaign'),))), time_dimension_specs=(), entity_specs=(EntitySpec(element_name='job', entity_links=()),), group_by_metric_specs=())

Environment

- OS: macOS 15.6
- Python: 3.11.9
- dbt: 1.10.5
- metricflow: Latest from main branch

Which database are you using?

other (mention it in "Additional Context")

Additional Context

This seems unrelated to the database being used; I observe it when using Snowflake and also when using duckdb for local testing.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workinglinear

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions