Releases: data-solution-automation-engine/TEAM
TEAM v1.6.5
Overview
TEAM 1.6.5 is a major revision. The key change is the removal of the (central) metadata repository, which meant having a dependency on SQL Server. In this new TEAM version, no database repository is required anymore.
Not requiring a central repository removes various issues around multiple project teams contributing to the design, potentially impacting and/or overwriting each other’s work.
Instead, TEAM now reads from and writes to the data warehouse automation compliant JSON files directly, the same files it would produce in earlier versions. The collection of JSON files effectively is the repository, and downstream code generation accesses the same files.
This means that the existing ‘activate’ process has been removed, because there is no longer a need to convert design into JSON files. Similarly, the ‘version’ concept is removed. This is because the JSON files would be located in a Git repository which handles versioning of the design metadata, along with the snapshot of the physical models and data logistics (code generation) patterns.
The combination of the JSON files (design metadata), the physical model snapshot and the code generation patterns allows for ‘time travelling’ for the solution itself, enabling deterministic reproduction of the model version at a point in time.
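As a rough illustration of "the collection of JSON files is the repository", the sketch below loads every JSON file from a metadata directory into memory. The function name `load_design_metadata` and the flat directory layout are assumptions made for this example; they are not part of TEAM itself.

```python
import json
from pathlib import Path


def load_design_metadata(metadata_path):
    """Return {file name: parsed JSON} for every .json file in the directory.

    Hypothetical helper: assumes all design metadata files sit directly in
    one directory (the 'metadata' path) and are UTF-8 encoded.
    """
    repository = {}
    for json_file in sorted(Path(metadata_path).glob("*.json")):
        with json_file.open(encoding="utf-8") as handle:
            repository[json_file.name] = json.load(handle)
    return repository
```

Because the files are plain text in a directory, checking out an earlier Git commit and running the same load against that working copy would yield the design as it was at that point in time.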
Other changes include:
- Introduction of a ‘metadata’ path that captures where the JSON files are located
- Stronger separation of environments, allowing configuration of paths per environment
- Updated sample data
- Support for Multi-Factor Authentication to access Azure databases for reverse engineering of the physical model
- Ability to preview the JSON model in the app
- Ability to override JSON segments in the app, for example to add custom extensions
- Support for the ‘data query’ concept at both object and item level, e.g. the ability to define logic as a source instead of a table or column. Using `` will generate a dataQuery segment
- Updated metadata importer to load legacy tabular JSON table mappings and generate these directly as data warehouse automation schema JSON files
- Added physical model query generator to assist automating snapshot creation for the physical model
- The distinction between ‘virtual mode’ and ‘physical mode’ has been removed. The only mode is what used to be ‘virtual mode’, which means all validations run against the snapshot of the physical model, and this tab is always visible
- Multi-Active Satellites are now generated as an extension for the target data item, so that these can be used in code generation patterns
- Addition of a ‘purge’ feature to (re)apply generic settings to all mappings / JSON files
Breaking changes
Since the paths are now stored against the environment, the app will report an error the first time it is opened with older configurations. The resolution is to either remove the files in the ‘core’ directory (they will be recreated) or update the paths in the app and save.
The way the JSON files are created for dimension- and fact target data objects has changed. Previously, each mapping would contain multiple source data objects. Now, each data object is in a separate mapping. This means presentation layer patterns need to be updated. Changes as part of VDW 1.6.7 will address this.
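For presentation layer patterns that still expect all sources of a dimension or fact together, one possible approach is to regroup the one-source-per-mapping files by target. This is a minimal sketch only; the field names `sourceDataObjects`, `targetDataObject` and `name` are assumptions about the JSON layout, and `sources_per_target` is a hypothetical helper.

```python
from collections import defaultdict


def sources_per_target(mappings):
    """Group source data object names by target data object name."""
    grouped = defaultdict(list)
    for mapping in mappings:
        # Assumed layout: each mapping has one target and a list of sources.
        target = mapping["targetDataObject"]["name"]
        for source in mapping["sourceDataObjects"]:
            grouped[target].append(source["name"])
    return dict(grouped)


# Hypothetical input in the new style: one source data object per mapping.
mappings = [
    {"sourceDataObjects": [{"name": "SAT_CUSTOMER"}],
     "targetDataObject": {"name": "DIM_CUSTOMER"}},
    {"sourceDataObjects": [{"name": "HUB_CUSTOMER"}],
     "targetDataObject": {"name": "DIM_CUSTOMER"}},
]
print(sources_per_target(mappings))
# {'DIM_CUSTOMER': ['SAT_CUSTOMER', 'HUB_CUSTOMER']}
```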
TEAM v1.6.4
Bug fix support release, no new functionality.
Main focus is addressing reported bugs, primarily around the way the JSON output conforms to the schema for data solution automation.
- Ensuring data types are correctly created, when enabled in export settings
- Metadata validation fixes
- Minor exception handling improvements
TEAM v1.6.3.1
Support update to assist ongoing projects, no new major functionality.
Changes:
- Updated links to data model
- Fixed reference data targets
- Added validation for incomplete link
- Various small bugs
TEAM v1.6.3
The new 1.6.3 version of the Taxonomy of ETL Automation Metadata (TEAM) contains further bug fixes, additional validation and ease-of-use improvements based on project feedback.
The result is hopefully a new step in the right direction for this mapping management component of the ecosystem for Data Warehouse Automation.
A notable change that may have impact on existing implementations is the (improved) handling of prefixes and suffixes. This means that underscores '_' are no longer added automatically. Existing keys and prefixes may need updating as a result. For example, the key prefix SK or HSH may now need to be _SK or _HSH if the underscore needs to be retained. This is related to issue #73.
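The effect of this change can be sketched as follows. The helper `build_key_name` and its `position` parameter are hypothetical and only illustrate the behaviour described above: the configured value is combined with the name verbatim, without an implied separator.

```python
def build_key_name(business_name, affix, position="suffix"):
    """Combine a name with a key affix exactly as configured.

    Hypothetical helper: no underscore is inserted, so any separator must
    be part of the configured affix itself.
    """
    if position == "prefix":
        return f"{affix}{business_name}"
    if position == "suffix":
        return f"{business_name}{affix}"
    raise ValueError(f"Unknown position: {position!r}")


print(build_key_name("CUSTOMER", "SK"))   # CUSTOMERSK
print(build_key_name("CUSTOMER", "_SK"))  # CUSTOMER_SK
```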
Main changes for this release:
- Added basic Data Vault validation
- Hiding of features not used when in physical model to clean up interface
- Added Json export features, including 'next-up' object in the lineage as relatedDataObject(s) and other related objects such as metadata connections.
- Removed the repository feature. This is meant to be ultimately deprecated and was only causing problems between versions for users. This is now always deployed as part of metadata activation.
Details on issues addressed are found in the (now closed) project for v1.6.3: https://github.com/RoelantVos/TEAM/projects/2.
TEAM v1.6.2
Bug fix release since v1.6.1, addressing various issues and suggestions from different projects. This is also the release where big strides have been made (in the background) to move away from requiring a database repository. Having said this, this version requires a repository upgrade (repository screen/deploy metadata repository). The repository does not contain information that needs to be retained, as all settings and metadata are stored in Json files. So, this can be safely done without losing work.
The main functional fixes are improvements around schema naming and the reuse of schema settings in the connections.
- Added connection test feature on connection tabs.
- Added initial Presentation Layer samples.
- Added various tooltips for clarification and explanation.
- Aligned DIRECT examples with VDW examples, so only one set is required (and removed options associated with this).
- Better event logging and reading/writing to central log.
- Schema validation and naming checks across various grids.
The full list of changes can be found here: https://github.com/RoelantVos/TEAM/projects/1.
TEAM v1.6.1
This new update on the Taxonomy of ETL Automation Metadata (TEAM) is meant to work with the Data Warehouse Automation schema v1.2, and is therefore fully compatible with the Virtual Data Warehouse (VDW) version 1.6.2 upwards.
The bottom line is that, at this stage, the most recent versions of TEAM and VDW use the same version of the Data Warehouse Automation schema. TEAM can prepare the metadata, which is then saved in a format that conforms to the Data Warehouse Automation schema. VDW can interpret these files and apply a variety of patterns against the metadata to generate code.
A very large number of underlying improvements in look & feel, exception handling and efficiency have been implemented in this version. It's a big release.
From a functional point of view these are the key changes:
- Json files can be exported for v1.2 of the Data Warehouse Automation schema.
- A new form has been created to configure the various Json segments one would like to add to the output.
- Separate connection management screens, with the ability to assign connections to source- and target metadata objects. This also means you can split up functional areas across connections, such as having a PSA defined across multiple technical environments.
- Context menus to create Json files and DDL statements from the various grid views.
- Event log to track what is happening.
- Enabled multiple environments to be defined, replacing the legacy dev/prod switch. You can now create any number of environments and associated connections dynamically, and switch between them.
- Support for versioning across files, and maintenance of versions across Json files. It is now easier to switch versions.
- Enable / disable flag at Data Object level to only process certain parts of metadata.
- Initial support for transformations, adding hard-coded values in attribute mappings (Data Item Mappings).
TEAM v1.6.0
The most recent release (downloadable in the corresponding Github releases section links) of TEAM is now version 1.6.0, and for Virtual Data Warehouse this is 1.6.1.
These updates have fully decoupled the management of the (source-to-target) mapping metadata from the code generation. This makes it easier to use other tools for some of the functions if desired and, by virtue of this, supports a bigger ecosystem for open source Data Warehouse Automation.
TEAM now saves the design metadata as Json files that conform to the generic schema for Data Warehouse automation. The Virtual Data Warehouse tool can now be configured to read all (Json) files from a designated directory and apply the templates (patterns) to these files using the provided templating engine.
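The idea of applying a pattern to the design metadata can be sketched as below. This is not VDW's templating engine: Python's standard `string.Template` is used as a stand-in, and the pattern text, the helper `apply_pattern` and the JSON field names are all assumptions made for illustration.

```python
from string import Template

# Hypothetical pattern; real patterns would be far richer than this.
PATTERN = Template("INSERT INTO $target SELECT * FROM $source;")


def apply_pattern(mapping, pattern=PATTERN):
    """Render one statement per source data object in a parsed mapping."""
    target = mapping["targetDataObject"]["name"]
    return [
        pattern.substitute(source=source["name"], target=target)
        for source in mapping["sourceDataObjects"]
    ]


# Hypothetical parsed Json file content.
mapping = {
    "sourceDataObjects": [{"name": "STG_CUSTOMER"}],
    "targetDataObject": {"name": "PSA_CUSTOMER"},
}
print(apply_pattern(mapping))
# ['INSERT INTO PSA_CUSTOMER SELECT * FROM STG_CUSTOMER;']
```

Keeping the pattern separate from the metadata is what allows the same files to drive different outputs without changes to the software.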
This means it is possible to use TEAM without VDW and vice-versa. It is also possible to incorporate the schema validation functionality available in the Data Warehouse Automation class library to make sure all files conform to the standard. Last but not least, it makes it easier to create your own patterns using the available metadata without software constraints.
The TEAM software is still geared towards Data Vault use-cases, but due to the ability to tweak the patterns without code changes it can now be better adapted for other applications as well. For example, this release was used to generate a Persistent Staging Area for Azure Data Factory!
Because of the pattern engine, VDW is completely agnostic of design approaches.
This approach also makes it easier to integrate the metadata into CI/CD pipelines and version control, because the key design artefacts are now all text based – making it easier to commit changes (differentials).
The TEAM metadata remain the key artefacts to version control because these contain the true design Intellectual Property explaining which data elements go where – mapping and lineage. However, the Json files that are created as part of the ‘activation’ routine can be consumed in a DevOps pipeline for code generation.
Alternatively, complex output from the pattern code generation can be written back into the Json structure as (source) transformations.
Since the Json files conform to the generic schema for Data Warehouse Automation they can also be manually tweaked or even created. This may be useful if there is no interest in using TEAM but still an intent to use the VDW code generation and patterns, or if custom complex transformations need to be added.
In addition to the above there is a long list of minor usability improvements including, but not limited to:
- Easier interaction with the data grids for data entry. Various context menus have been created to simplify adding, exporting and removing rows.
- Improved validation mechanisms that perform pre-checks before any generation work is started.
- Masking of passwords in the software (note that passwords are still visible in the text files).
- Better exception handling, logging and reporting.
- Usage of external files for many operations, making the tools easier to configure and upgrade. For example – many SQL statements, patterns and lists are now stored in script and Json files and are installed as part of the software.
- Better handling of schemas and tables, for example tables with the same name but different schemas do not cause errors anymore.
- Usability improvements in the GUI, for example updating values when paths change etc.
TEAM 1.5.5.2
Major revision and alignment to VEDW v1.6.
- Overall improvement of error handling and reporting. Meaningful error messages are now reported back if things go wrong, and the error log works at all times, including when creating the repository and sample data.
- Underlying metadata model upgrade, the full model is available here. This is the v.1.6 version of the repository model.
- Support for source-to-staging and PSA mappings.
- Updated colour scheme to use softer palette, and included STG, PSA and DIM coding.
- Fixed some bugs in supporting schemas in source or target names. I.e. you can now use [].[] everywhere.
- Removed unused functionality everywhere in the tool. All visible options work :-).
- Removed generate sample data. This is now a VEDW option.
- Changed the 'ignore version' checkbox to a radio button which enables the selection of either the 'Virtual Mode' or 'Physical Mode'. Virtual Mode basically uses the internal physical model metadata to generate the Virtual Data Warehouse, whereas the Physical Mode performs target database (catalog / data dictionary) lookups to collect database metadata.
- Enhanced activation mechanisms to drive more on JSON and the on-screen metadata than use database logic. The logic still needs a database, but it's significantly less. Future versions are envisaged to be completely database-less.
- Sample data now also includes physical model metadata to demonstrate 'Virtual Mode'.
- Removed Graph menu item, this has been moved to a dedicated TEAM_Graph Github. The reason is that this removes the limit of 20 people following the yWorks license, while still enabling utilising yWorks.
TEAM 1.5.5.1
Minor bug fixes related to production tests
- String trimming (for variables)
- Code cleanup
No new functionality added.
TEAM 1.5.5.0
Changes in this version:
- Updated repository to 1.5 (https://app.sqldbm.com/SQLServer/Share/xdwwuNcV1SLytr8SsGu4iUGFrngIE8md_DYjF4jNYw0).
- Across-the-board name changes for repository tables and attributes (e.g. ATTRIBUTE_FROM_NAME => SATELLITE_ATTRIBUTE_NAME).
- Addition of LOAD_VECTOR to capture the inferred direction of ETL (to support connection changes). I.e. Data Vault to Data Vault or source to Data Vault.
- Snapshot of physical model is created during activation (MD_PHYSICAL_MODEL) and exposed as an interface (INTERFACE_PHYSICAL_MODEL).
- Various cosmetic changes (auto-resize, removing versioning where not required).
- Added support for saving interface outputs to disk as JSON files.
- Refactor of the activation logic, to reduce code duplication.
- Added support for schema and database in the table names (fully qualified names). For instance adding bdv.SAT_CUSTOMER_DERIVED.
- Various underlying changes to enable true virtualisation based on the reverse-engineered physical model. If ignore version is unchecked then everything properly runs off the internal physical model (grid).
- Minor changes to sample data, to resolve the record source when generating sample code for DIRECT.
- Fixed some issues around case-sensitive databases. The metadata should be case-sensitive, but was accidentally enclosed with an UPPER statement.
- Added INTERFACE_SOURCE_LINK_ATTRIBUTE_XREF interface view to expose degenerate attributes for links.
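The support for fully qualified table names listed above (for instance bdv.SAT_CUSTOMER_DERIVED) can be pictured with a small parsing sketch. The helper `split_qualified_name` and the fallback schema `dbo` are assumptions for this example and do not reflect how TEAM implements it; names containing dots inside brackets are not handled.

```python
def split_qualified_name(name, default_schema="dbo"):
    """Split a table name into (database, schema, table).

    Hypothetical helper: accepts 'table', 'schema.table' or
    'database.schema.table', with or without square brackets.
    """
    parts = [part.strip("[]") for part in name.split(".")]
    if len(parts) == 1:
        return None, default_schema, parts[0]
    if len(parts) == 2:
        return None, parts[0], parts[1]
    if len(parts) == 3:
        return parts[0], parts[1], parts[2]
    raise ValueError(f"Cannot interpret table name: {name!r}")


print(split_qualified_name("bdv.SAT_CUSTOMER_DERIVED"))
# (None, 'bdv', 'SAT_CUSTOMER_DERIVED')
```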