Skip to content

[Bug] New onSchemaChange statements get rate limited after 5 new columns #2025

@HHammond

Description

@HHammond

When using the new onSchemaChange feature the generated SQL adds each new column with a separate ALTER TABLE statement, which hits BigQuery's 5 alterations per 10 seconds rate limit.

Feature Announcement

Error:

Exceeded rate limits: too many table update operations for this table. For more information, see https://cloud.google.com/bigquery/docs/troubleshoot-quotas

Image

Minimal case to reproduce

config {
  schema: 'scratch',
  type: "incremental",
  onSchemaChange: "EXTEND",
  uniqueKey: "a"
}

${when(incremental(), 
  `SELECT 1 as a, 2 as b, 3 as c, 4 as d, 5 as e, 6 as f, 7 as g`,
  `SELECT 1 as a`
)}

Current Behaviour

Dataform maps all new column additions to separate ALTER TABLE statements:

  IF ARRAY_LENGTH(columns_added) > 0 THEN
    FOR new_column IN (SELECT * FROM UNNEST(columns_added) AS column_info)
    DO
      EXECUTE IMMEDIATE FORMAT(
            'ALTER TABLE `project.schema.table` ADD COLUMN %s %s',
              new_column.column_name, new_column.data_type
        );
    END FOR;
  END IF;

Expected Behaviour

Dataform should use a single ALTER TABLE ADD COLUMN statement to add multiple columns at once:

A single statement has the following benefits:

  • Atomic changes — either all fields are added or none are. Currently a single incorrect column configuration can leave a table in a partially updated state.
  • Better resource use — avoid hitting the 5 updates/10s rate limit

Concerns

It's not clear to me if this is a dataform core issue -- there is no logic in dataform core for the actual SQL generation, if this is the case let me know.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions