
Releases: dataform-co/dataform

2.8.0: Packaging and compilation performance improvements

07 Dec 14:27
da51cfc

The highlights of this release are two significant performance improvements for GCP Dataform projects:

  • The @dataform/core NPM package no longer has any dependencies: all dependency packages are now bundled into the existing minified bundle. This significantly reduces the installation size and should improve package installation performance and reliability for Dataform on GCP. See #1552 for more info.
  • We've rolled out a change that reduces the work required to decode and encode compiled graphs. We expect it to improve compilation performance for Dataform on GCP by ~2x. See #1570 for more info.


Full Changelog: 2.7.0...2.8.0

2.7.0: Updates for Dataform GCP incremental SQL

23 Oct 10:29
9434085

From version 2.7.0 onwards, Dataform projects running on Google Cloud Platform will use updated SQL generation logic for incremental insert tables (tables of type incremental without a uniqueKey specified).

Explicit column names

Column names are now listed explicitly in the INSERT statement. This is in line with OSS Dataform behaviour and prevents schema mismatches during insert, for example:

  • if source_table has its columns in a different order than the target table, data can be corrupted, because column values can be swapped during the insert
  • if source_table has a different number of columns, the INSERT INTO fails because the column counts of the source and target tables do not match

With the new script, the incremental query must return every target table column, in any order. It may also return extra columns.

Example of new generated SQL:

INSERT INTO $target_table 
    ($target_columns_list) -- listing target columns
    SELECT $target_columns_list -- reordering columns so that subquery column order matches the target column order
    FROM (
        $incrementalQuery
    );
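
The shape of the generated statement can be sketched in JavaScript as follows. This is a minimal illustration, not the code @dataform/core actually runs: the function name buildIncrementalInsert and its parameters are hypothetical, and backtick-quoting of identifiers is an assumption made for the example.

```javascript
// Hypothetical helper, for illustration only; not part of @dataform/core.
// Builds an INSERT that names the target columns in both the column list
// and the outer SELECT, so the column order of the incremental query
// no longer matters.
function buildIncrementalInsert(targetTable, targetColumns, incrementalQuery) {
  if (targetColumns.length === 0) {
    throw new Error("targetColumns must not be empty");
  }
  // Assumption: identifiers are quoted with backticks (BigQuery style).
  const columnList = targetColumns.map(c => "`" + c + "`").join(", ");
  return [
    "INSERT INTO `" + targetTable + "`",
    "    (" + columnList + ")",
    "    SELECT " + columnList,
    "    FROM (",
    "        " + incrementalQuery,
    "    );"
  ].join("\n");
}

// The incremental query returns the columns in a different order, plus an extra one.
console.log(
  buildIncrementalInsert(
    "dataset.target",
    ["id", "name"],
    "SELECT name, id, extra FROM dataset.source"
  )
);
```

Because the outer SELECT names only the target columns, the extra column returned by the incremental query is simply not inserted.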

Execution within a procedure

In order to facilitate explicit columns, the new code is executed within a procedure, which will be created on the fly. For example:

EXECUTE IMMEDIATE
"""
CREATE OR REPLACE PROCEDURE $procedure_name()
BEGIN
    $preOperations
    $incrementalInsertStatement
    $postOperations
END;
""";
CALL $procedure_name();
DROP PROCEDURE IF EXISTS $procedure_name;
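
As a rough sketch of how such a script could be assembled, the JavaScript below wraps pre-operations, the insert statement, and post-operations in a temporary procedure. The helper wrapInProcedure and its parameters are hypothetical and are not taken from the Dataform codebase. The sketch assumes that each statement needs exactly one trailing semicolon, that the EXECUTE IMMEDIATE statement is itself terminated with a semicolon, and that no statement contains a triple-quote sequence.

```javascript
// Hypothetical helper, for illustration only; not part of @dataform/core.
// Wraps the operations in a procedure that is created, called and dropped.
function wrapInProcedure(procedureName, preOperations, insertStatement, postOperations) {
  const body = [...preOperations, insertStatement, ...postOperations]
    // Normalise every statement to end with exactly one semicolon.
    .map(s => "    " + s.trim().replace(/;+$/, "") + ";")
    .join("\n");
  return [
    "EXECUTE IMMEDIATE",
    '"""',
    "CREATE OR REPLACE PROCEDURE " + procedureName + "()",
    "BEGIN",
    body,
    "END;",
    '""";',
    "CALL " + procedureName + "();",
    "DROP PROCEDURE IF EXISTS " + procedureName + ";"
  ].join("\n");
}

console.log(
  wrapInProcedure(
    "dataset.tmp_insert_proc",
    ["DELETE FROM dataset.target WHERE id IS NULL"],
    "INSERT INTO dataset.target (id) SELECT id FROM (SELECT 1 AS id);",
    []
  )
);
```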

2.6.8

18 Oct 15:11
c60b3c2

Fixes a bug in the escaping of target names and canonical target names. See #1551.

2.6.7: Support `NO_COLOR` environment variable

17 Aug 08:53
5d23f14

@dataform/cli now supports turning off coloured output by setting the NO_COLOR environment variable, per the standard detailed at https://no-color.org/.
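
For readers unfamiliar with the convention, here is a minimal sketch of a NO_COLOR check in JavaScript. It follows the rule described at no-color.org (colour is disabled when the variable is present and not an empty string) and is not the actual @dataform/cli implementation; the function names colorEnabled and red are hypothetical.

```javascript
// Illustrative sketch of the NO_COLOR convention; not the @dataform/cli source.
// Colour stays enabled only when NO_COLOR is unset or set to an empty string.
function colorEnabled(env = process.env) {
  const noColor = env.NO_COLOR;
  return noColor === undefined || noColor === "";
}

// Wraps text in ANSI red escape codes unless colour is disabled.
function red(text, env = process.env) {
  return colorEnabled(env) ? "\u001b[31m" + text + "\u001b[0m" : text;
}

console.log(red("compilation failed"));
```

Passing the environment as a parameter keeps the check easy to test without mutating process.env.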

2.6.6: Enforce tighter requirements around "automatic" importing of `includes` files

16 Aug 16:12
e76f542

We have fixed two bugs which inadvertently loosened our (expected) requirements around usage of "automatic" includes imports.

For context: in general, to reference the contents of a file in includes (specifically a file's module.exports object), the callsite should call require() on that file, e.g. const foo = require("includes/subdirectory/foo.js");.

Dataform simplifies this for "top-level" includes files, i.e. direct children of the includes directory, by automatically making these files available globally. For example, in order to use includes/foo.js, a callsite does not need to require("includes/foo.js"); instead, a foo object is made available to all Dataform code in the project.

Two bugs have been found and fixed. With the fixes in place:

  1. includes files can no longer implicitly depend on other includes files
  2. only top-level includes files are now automatically available globally

Unfortunately, this is a potentially breaking change for some Dataform projects, but it only takes effect once @dataform/core is upgraded to >= 2.6.6.

In order to fix any breakages, the calling file must be changed to explicitly require() the relevant includes file.

For example, in SQLX:

js {
  const whatever = require("includes/subdirectory/whatever.js");
}

Or in JavaScript:

const whatever = require("includes/whatever.js");

For more context, see https://issuetracker.google.com/issues/296162656#comment3.
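
The "top-level only" rule described above can be sketched as a small path check. This is an illustration under stated assumptions, not Dataform's implementation: the function autoGlobalName is hypothetical, paths are assumed to be relative to the project root, and only .js files are considered.

```javascript
const path = require("path");

// Hypothetical helper, for illustration only; not part of @dataform/core.
// Returns the global name for a top-level includes file, or null when the
// file must be loaded with an explicit require().
function autoGlobalName(filePath) {
  // Accept both "/" and "\" separators so the check also works on Windows paths.
  const parts = filePath.split(/[\\/]/);
  // Only direct children of the includes directory qualify.
  if (parts.length !== 2 || parts[0] !== "includes") {
    return null;
  }
  const ext = path.extname(parts[1]);
  if (ext !== ".js") {
    return null;
  }
  return path.basename(parts[1], ext);
}

console.log(autoGlobalName("includes/foo.js"));
console.log(autoGlobalName("includes/subdirectory/whatever.js"));
```

Under these assumptions, includes/foo.js maps to a global named foo, while includes/subdirectory/whatever.js yields null and therefore needs an explicit require().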

2.6.5: Fix `@dataform/cli` for `@dataform/core` pre-`2.6.5`.

16 Aug 13:47
154b726

This version fixes behaviour in rare circumstances which was broken by @dataform/cli version 2.6.3. See https://issuetracker.google.com/issues/296162656#comment3 for details.

2.6.4: Fix `dataform compile` on Windows

15 Aug 13:51
a8279e1
Fix `@dataform/core` on Windows. (#1526)


2.6.3: Bump `glob` dependency version.

15 Aug 13:33
a5ee673
Upgrade `glob`. (#1525)

2.6.2: Fix: triple-quotes strings getting removed after compilation

09 Aug 11:44
f053c33

In 2.6.1 the lexer was updated to emit multiline string tokens, which made it possible to format triple-quoted strings (https://github.com/dataform-co/dataform/releases/tag/2.6.1). However, compilation did not handle these tokens, so triple-quoted strings were removed from the compiled output. This version fixes that issue.

2.6.1: Improve formatting for triply-quoted strings

28 Jul 08:38
9238001
  • improve SQLX formatting for triply-quoted strings
  • various dependency version upgrades

UPDATE: This version introduced a bug in which triple-quoted strings are removed during compilation. The fix is in 2.6.2 (https://github.com/dataform-co/dataform/releases/tag/2.6.2).