Skip to content

[Bug] where clause doesn't work on derived metrics #1781

@soham-dasgupta

Description

@soham-dasgupta

Is this a new bug in metricflow?

  • I believe this is a new bug in metricflow
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

Specifying --where on derived metrics doesn't work as expected as the constructed query has the where filter applied before the metric is created.

Expected Behavior

In the constructed query where filter should be applied after the metric is created,
Notice how the constructed query has the where filter applied incorrectly.

Steps To Reproduce

semantic_model

semantic_models:
  - name: sem_perf
    defaults:
      agg_time_dimension: activity_date
    model: ref('perf')
    entities:
      - name: sp
        type: primary
        expr: '1'
      - name: region_mp_sku
        type: unique
        expr: region_mp_sku
    dimensions:
      - name: activity_date
        type: time
        expr: activity_date
        type_params:
          time_granularity: day
    measures:
      - name: pv_cost_lc
        expr: CASE WHEN channel = '3' THEN cost_lc ELSE 0 END
        agg: sum
        create_metric: true

metrics:
  - name: pv_cost_lc_category
    type: derived
    label: pv_cost_lc_category
    type_params:
      expr: |
              CASE 
                  WHEN pv_cost_lc = 0 THEN 'no_spend'
                  WHEN pv_cost_lc <= approx_percentile(CASE WHEN pv_cost_lc != 0 THEN pv_cost_lc END, 0.33) OVER () THEN 'low'
                  WHEN pv_cost_lc <= approx_percentile(CASE WHEN pv_cost_lc != 0 THEN pv_cost_lc END, 0.90) OVER () THEN 'medium'
                  ELSE 'high'
              END
      metrics:
        - name: pv_cost_lc

Query

mf query \
--metrics pv_cost_lc,pv_cost_lc_category \
--group-by region_mp_sku__activity_date \
--where "pv_cost_lc_category = 'medium'" \
--start-time '2025-07-01' \
--end-time '2025-07-14' \
--explain
SELECT
  COALESCE(subq_10.region_mp_sku__activity_date__day, subq_18.region_mp_sku__activity_date__day) AS region_mp_sku__activity_date__day
  , MAX(subq_10.pv_cost_lc) AS pv_cost_lc
  , MAX(subq_18.pv_cost_lc_category) AS pv_cost_lc_category
FROM (
  SELECT
    region_mp_sku__activity_date__day
    , SUM(pv_cost_lc) AS pv_cost_lc
  FROM (
    SELECT
      DATE_TRUNC('day', activity_date) AS region_mp_sku__activity_date__day
      , CASE WHEN channel = '3' THEN cost_lc ELSE 0 END AS pv_cost_lc
    FROM "trino"."dev"."perf"  sem_perf_src_10000
    WHERE DATE_TRUNC('day', activity_date) BETWEEN timestamp '2025-07-01' AND timestamp '2025-07-14'
  ) subq_7
  WHERE pv_cost_lc_category = 'medium'
  GROUP BY
    region_mp_sku__activity_date__day
) subq_10
FULL OUTER JOIN (
  SELECT
    region_mp_sku__activity_date__day
    , CASE 
        WHEN pv_cost_lc = 0 THEN 'no_spend'
        WHEN pv_cost_lc <= approx_percentile(CASE WHEN pv_cost_lc != 0 THEN pv_cost_lc END, 0.33) OVER () THEN 'low'
        WHEN pv_cost_lc <= approx_percentile(CASE WHEN pv_cost_lc != 0 THEN pv_cost_lc END, 0.90) OVER () THEN 'medium'
        ELSE 'high'
    END AS pv_cost_lc_category
  FROM (
    SELECT
      region_mp_sku__activity_date__day
      , SUM(pv_cost_lc) AS pv_cost_lc
    FROM (
      SELECT
        DATE_TRUNC('day', activity_date) AS region_mp_sku__activity_date__day
        , CASE WHEN channel = '3' THEN cost_lc ELSE 0 END AS pv_cost_lc
      FROM "trino"."dev"."perf" sem_perf_src_10000
      WHERE DATE_TRUNC('day', activity_date) BETWEEN timestamp '2025-07-01' AND timestamp '2025-07-14'
    ) subq_14
    WHERE pv_cost_lc_category = 'medium'
    GROUP BY
      region_mp_sku__activity_date__day
  ) subq_17
) subq_18
ON
  subq_10.region_mp_sku__activity_date__day = subq_18.region_mp_sku__activity_date__day
GROUP BY
  COALESCE(subq_10.region_mp_sku__activity_date__day, subq_18.region_mp_sku__activity_date__day)

Relevant log output

N/A

Environment

- OS: Mac
- Python:3.11.11
- dbt: 1.8.9 
- metricflow: 0.7.1

Which database are you using?

other (mention it in "Additional Context")

Additional Context

Trino

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingtriageTasks that need to be triaged

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions