Skip to content

[Bug] Query time explodes as number of dimensions and metrics increase #1827

@dstein-st

Description

@dstein-st

Is this a new bug in metricflow?

  • I believe this is a new bug in metricflow
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

When I use mf query with 70 metrics and 100 dimensions, it basically doesn't terminate (runs for more than 10 minutes). This happens whether or not I use --explain indicating it is the generation of the SQL that is taking forever, not the running of the SQL query on my DB.

Background: In my application, I have a group of around 100 dimensions (split over several entities) and about 70 metrics, which need to be usable together in any configuration. I've taken steps to make sure that any of the metrics can be queried with any of the other metrics in this group, and also with any of the dimensions. In order to validate that they are all interoperable, I construct a single mf query --explain with all of the metrics and all of the dimensions. The idea is: if this explain-query succeeds, it means that mf could also construct any smaller query with some of the metrics/dimensions removed. But this isn't working because mf is taking forever to try to generate the SQL query.

Interestingly: When I run mf query --explain with all 70 of the metrics and none of the dimensions (except for group-by one of the main entities), it succeeds quickly. And when I run it with all 100 of the dimensions and none of the metrics, that succeeds quickly too. But put them together and the running time explodes.

Expected Behavior

Query planning time should not explode as the number of metrics and dimensions increase.

Steps To Reproduce

Create a test environment with 7 entities split across 25 semantic models, with 100 dimensions, and 70 metrics (simple and derived). Query all of them at the same time via mf query --explain.

Relevant log output

Environment

- OS: Mac
- Python: 3.11
- dbt: 1.9.9
- metricflow: 0.8.2

Which database are you using?

snowflake

Additional Context

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingtriageTasks that need to be triaged

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions