-
Notifications
You must be signed in to change notification settings - Fork 114
Description
Is this a new bug in metricflow?
- I believe this is a new bug in metricflow
- I have searched the existing issues, and I could not find an existing issue for this bug
Current Behavior
When executing a dimension-only query (no metrics) that requires traversing to dimensions through different foreign entity relationships, and those relationships are defined in separate semantic models, MetricFlow throws an UnableToSatisfyQueryError
during dataflow plan building.
The error occurs specifically when:
- Query has no metrics (dimension-only)
- Multiple dimensions require entity traversal
- The required entity relationships are defined in different semantic models
Example failing query:
Consider an entity job
that appears as the primary entity in two separate semantic models:
- In
fact_job_assignments
:job
(primary) has a foreign key relationship tocustomer
- In
fact_job_campaigns
:job
(primary) has a foreign key relationship tocampaign
When attempting to query dimensions from both relationships in a dimension-only query:
mf query --explain --metrics '' --group-by job,customer__customer_name,campaign__campaign_name
(Note: The alternative query with "fully qualified paths" mf query --explain --metrics '' --group-by job,job__customer__customer_name,job__campaign__campaign_name
describes the intention more directly, but this also does not work and produces a different error message: "The given input does not match any of the available group-by-items for a distinct values query without metrics.")
Expected Behavior
MetricFlow should successfully generate SQL for dimension-only queries even when the required entity traversals are defined across multiple semantic models. The query planner should be able to combine the entity relationships from different models to construct the necessary joins.
Steps To Reproduce
- Create a star schema with the following semantic models:
semantic_models:
- name: dim_customers
description: Customer dimension semantic model
model: ref('dim_customers')
entities:
- name: customer
type: primary
expr: customer_id
dimensions:
- name: customer_name
type: categorical
- name: customer_segment
type: categorical
- name: dim_campaigns
description: Campaign dimension semantic model
model: ref('dim_campaigns')
entities:
- name: campaign
type: primary
expr: campaign_id
dimensions:
- name: campaign_name
type: categorical
- name: campaign_type
type: categorical
- name: fact_job_assignments
description: Job assignments fact table semantic model
model: ref('fact_job_assignments')
defaults:
agg_time_dimension: assignment_date
entities:
- name: job
type: primary
expr: job_id
- name: customer
type: foreign
expr: customer_id
measures:
- name: assignment_count
agg: count
expr: 1
dimensions:
- name: assignment_date
type: time
type_params:
time_granularity: day
- name: fact_job_campaigns
description: Job campaigns fact table semantic model
model: ref('fact_job_campaigns')
defaults:
agg_time_dimension: participation_date
entities:
- name: job
type: primary
expr: job_id
- name: campaign
type: foreign
expr: campaign_id
measures:
- name: campaign_participation_count
agg: sum
expr: participation_count
dimensions:
- name: participation_date
type: time
type_params:
time_granularity: day
- Run dimension-only query requiring traversals from both fact tables:
mf query --explain --metrics '' --group-by job,customer__customer_name,campaign__campaign_name
- Observe the error
What DOES work with the same schema:
# Works: Query with a metric
mf query --metrics total_assignments --group-by job,customer__customer_name,campaign__campaign_name
# Works: Dimension-only within single fact
mf query --metrics '' --group-by job,customer__customer_name
# Works: Different dimension-only within single fact
mf query --metrics '' --group-by job,campaign__campaign_name
Only the dimension-only query requiring traversal across BOTH fact tables fails.
Relevant log output
2025-08-27 14:47:54,953 ERROR dataflow_plan_builder.py:1267 [MainThread] - No recipe could be constructed.
2025-08-27 14:47:54,954 ERROR utils.py:143 [MainThread] - Logging exception handled by the CLI exception handler
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/dbt_metricflow/cli/utils.py", line 140, in wrapper
func(*args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/metricflow/telemetry/reporter.py", line 150, in wrapped
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/dbt_metricflow/cli/main.py", line 225, in query
explain_result = cfg.mf.explain(mf_request=mf_request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/metricflow/telemetry/reporter.py", line 150, in wrapped
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/metricflow/engine/metricflow_engine.py", line 549, in explain
return self._create_execution_plan(mf_request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/metricflow/engine/metricflow_engine.py", line 522, in _create_execution_plan
dataflow_plan = self._dataflow_plan_builder.build_plan_for_distinct_values(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/metricflow/dataflow/builder/dataflow_plan_builder.py", line 814, in build_plan_for_distinct_values
return self._build_plan_for_distinct_values(query_spec, optimizations=optimizations)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/metricflow_semantics/mf_logging/runtime.py", line 37, in _inner
result = wrapped_function(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/metricflow/dataflow/builder/dataflow_plan_builder.py", line 849, in _build_plan_for_distinct_values
raise UnableToSatisfyQueryError(f"Unable to join all items in request: {required_linkable_specs}")
metricflow_semantics.errors.error_classes.UnableToSatisfyQueryError: Unable To Satisfy Query Error: Unable to join all items in request: LinkableSpecSet(dimension_specs=(DimensionSpec(element_name='customer_name', entity_links=(EntityReference(element_name='customer'),)), DimensionSpec(element_name='campaign_name', entity_links=(EntityReference(element_name='campaign'),))), time_dimension_specs=(), entity_specs=(EntitySpec(element_name='job', entity_links=()),), group_by_metric_specs=())
Environment
- OS: macOS 15.6
- Python: 3.11.9
- dbt: 1.10.5
- metricflow: Latest from main branch
Which database are you using?
other (mention it in "Additional Context")
Additional Context
This seems unrelated to the database being used; I observe it when using Snowflake and also when using duckdb for local testing.