Add option to ignore backfill of MV when redeploying a model #506

@dlouseiro

Is your feature request related to a problem? Please describe.
In this PR, a new feature was introduced to skip backfilling a materialized view (MV) with past data (the catchup option). While this feature is very handy, it only applies to the first deployment of a given MV model.

So, if one does the following:

  1. Create an MV model (foo) with catchup: False that does not yet exist in ClickHouse
  2. Deploy the foo model (with dbt run)
  3. One week later, edit model foo to add a new column (for example)
  4. Redeploy the model (using dbt run --select foo --full-refresh)

In the deployment at point 2, the catchup option is respected and the target table foo is not backfilled. In the deployment at point 4, however, the table is backfilled.

This behaviour makes sense: from point 2 onwards the model is available to end users, and one might not want them to lose access to one week of data just because a new column was added.

However, it would also be handy to be able to always skip data backfills, even when the model is being redeployed, especially when the insert command needed to backfill the table is heavy and may affect the health of the ClickHouse instance.

Describe the solution you'd like
Add a new option that skips backfills even when a model is being redeployed, while keeping the current behaviour of the catchup option unchanged.

So we'd have the catchup option, which skips the backfill on first deployment, and a new catchup_on_full_refresh option, which skips the backfill when a model is redeployed.
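
To make the proposed semantics concrete, here is a minimal sketch of the decision as a plain Python function. It is not the adapter's actual code: the function name `should_backfill` and its `first_deployment` argument are hypothetical, and it assumes each option governs only its own case (catchup for the first deployment, catchup_on_full_refresh for redeployments), with both defaulting to True so that existing projects keep today's behaviour.

```python
def should_backfill(
    first_deployment: bool,
    catchup: bool = True,
    catchup_on_full_refresh: bool = True,  # proposed option, does not exist yet
) -> bool:
    """Return True if the MV target table should be backfilled with past data.

    Hypothetical helper illustrating the proposal; not part of dbt-clickhouse.
    """
    if first_deployment:
        # Existing behaviour: catchup only matters when the model is first created.
        return catchup
    # Proposed behaviour: a separate flag controls redeployments (--full-refresh).
    return catchup_on_full_refresh


# Scenario from this issue: catchup is False and the new option is left unset.
print(should_backfill(first_deployment=True, catchup=False))   # point 2
print(should_backfill(first_deployment=False, catchup=False))  # point 4

# With the proposed option set to False, the redeployment would skip the backfill too.
print(should_backfill(first_deployment=False, catchup=False,
                      catchup_on_full_refresh=False))
```

Under these assumptions, models that only set catchup: False behave exactly as they do now, and the heavy insert on redeploy is avoided only when catchup_on_full_refresh is explicitly set to False.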

Labels: enhancement