Description
When using the new `onSchemaChange` feature, the generated SQL adds each new column with a separate `ALTER TABLE` statement, which hits BigQuery's rate limit of 5 table alterations per 10 seconds.
Error:

```
Exceeded rate limits: too many table update operations for this table. For more information, see https://cloud.google.com/bigquery/docs/troubleshoot-quotas
```

Minimal case to reproduce

```sql
config {
  schema: 'scratch',
  type: "incremental",
  onSchemaChange: "EXTEND",
  uniqueKey: "a"
}

${when(incremental(),
  `SELECT 1 as a, 2 as b, 3 as c, 4 as d, 5 as e, 6 as f, 7 as g`,
  `SELECT 1 as a`
)}
```
Current Behaviour
Dataform maps all new column additions to separate `ALTER TABLE` statements:

```sql
IF ARRAY_LENGTH(columns_added) > 0 THEN
  FOR new_column IN (SELECT * FROM UNNEST(columns_added) AS column_info)
  DO
    EXECUTE IMMEDIATE FORMAT(
      'ALTER TABLE `project.schema.table` ADD COLUMN %s %s',
      new_column.column_name, new_column.data_type
    );
  END FOR;
END IF;
```
Expected Behaviour
Dataform should instead emit a single `ALTER TABLE` statement that adds all new columns at once. A single statement has the following benefits:
- Atomic changes — either all fields are added or none are. Currently a single incorrect column configuration can leave a table in a partially updated state.
- Better resource use — avoid hitting the 5 updates/10s rate limit
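BigQuery's DDL supports multiple comma-separated `ADD COLUMN` clauses in one `ALTER TABLE` statement, so the per-column loop above could be collapsed into one `EXECUTE IMMEDIATE`. A sketch of the batched form (not Dataform's actual generated code; it assumes the same `columns_added` array shape used in the current script):

```sql
IF ARRAY_LENGTH(columns_added) > 0 THEN
  -- Build all ADD COLUMN clauses as one comma-separated list,
  -- then issue a single ALTER TABLE, i.e. one table update operation.
  EXECUTE IMMEDIATE FORMAT(
    'ALTER TABLE `project.schema.table` %s',
    (
      SELECT STRING_AGG(
        FORMAT('ADD COLUMN %s %s', c.column_name, c.data_type), ', '
      )
      FROM UNNEST(columns_added) AS c
    )
  );
END IF;
```

For the reproduction above, this would produce a statement of the shape ``ALTER TABLE `project.scratch.table` ADD COLUMN b INT64, ADD COLUMN c INT64, ...``, counting as one table update operation instead of six.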
Concerns
It's not clear to me whether this is a dataform core issue; I couldn't find any logic in dataform core for the actual SQL generation. If this belongs in another repository, please let me know.