Commit 2316276

Author: Alex Higgs
Merge branch 'master' into releases
2 parents: 2d4d566 + bf99075

25 files changed: +862 -567 lines changed

CONTRIBUTING.md

Lines changed: 2 additions & 2 deletions
@@ -32,5 +32,5 @@ please feel free to submit ideas and thoughts!
 Create a post with as much detail as possible; We'll be happy to reply and work with you.
 
 ## Pull requests
-If you've developed something which we can add via a pull request, we'd prefer that you submit an issue first
-so that we can discuss the changes.
+If you've developed something which we can add via a pull request, we're more than happy to consider it, but we'd
+like to discuss the changes first.

README.md

Lines changed: 4 additions & 4 deletions
@@ -1,12 +1,12 @@
-**CURRENTLY IN PRE-RELEASE, WE ARE CONTINUALLY ADDING FEATURES AND IMPROVING DOCUMENTATION**
-
 <p align="center">
 <img src="https://user-images.githubusercontent.com/25080503/65772647-89525700-e132-11e9-80ff-12ad30a25466.png">
 </p>
 
 latest [![Documentation Status](https://readthedocs.org/projects/dbtvault/badge/?version=latest)](https://dbtvault.readthedocs.io/en/latest/?badge=latest)
 
-stable [![Documentation Status](https://readthedocs.org/projects/dbtvault/badge/?version=v0.2.4-pre)](https://dbtvault.readthedocs.io/en/v0.2.4-pre/?badge=v0.2.4-pre)
+stable [![Documentation Status](https://readthedocs.org/projects/dbtvault/badge/?version=v0.3-pre)](https://dbtvault.readthedocs.io/en/v0.3-pre/?badge=v0.3-pre)
+
+[past docs versions](https://dbtvault.readthedocs.io/en/latest/changelog/)
 
 # dbtvault by [Datavault](https://www.data-vault.co.uk)
 
@@ -34,7 +34,7 @@ Add the following to your ```packages.yml```
 packages:
 
   - git: "https://github.com/Datavault-UK/dbtvault"
-    revision: v0.2.4-pre # Latest stable version
+    revision: v0.3-pre # Latest stable version
 ```
 
 And run

dbt_project.yml

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 name: 'dbtvault'
-version: '0.2.1'
+version: '0.3'
 
 profile: 'dbtvault'
 

docs/assets/images/staging.png

118 KB

docs/bestpractices.md

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
+We advise you follow these best practises when using dbtvault.
+
+## Staging
+
+Currently, we are only supporting one load date per load, as per the [prerequisites](gettingstarted.md#prerequisites).
+
+Until a future release solves this limitation, we suggest that if the raw staging layer has a mix of load dates,
+create a view on it and filter by the load date column to ensure only a single load date value is present.
+
+After you have done this, follow the below steps:
+
+- Add a reference to the view in your [sources](gettingstarted.md#setting-up-sources).
+- Provide the source reference to the view as the source parameter in the [from](macros.md#from)
+macro when building your [staging](staging.md) model .
+
+For the next load you then can re-create the view with a different load date and run dbt again, or alternatively
+manage a 'water-level' table which tracks the last load date for each source, and is incremented each load cycle.
+Do a join to the table to soft-select the next load date.
+
+## Source
+
+We suggest you use a code. This can be anything that makes sense for your particular context, though usually an
+integer or alpha-numeric value works well. The code is often used to look-up the full table name in a table.
+
+You may do this with dbtvault by providing the code as a constant in the [staging](staging.md) layer,
+using the [add_columns](macros.md#add_columns) macro. The [staging page](staging.md) presents this exact
+use-case in the code examples.
+
+If there is already a source in the raw staging layer, you may keep this or override it;
+[add_columns](macros.md#add_columns) can do either.
+
+## Hashing
+
+Best practises for hashing include:
+
+- Alpha sorting hashdiff columns. dbtvault does this for us, so no worries! Refer to the [multi-hash](macros.md#multi_hash) docs for how to do this
+
+- Ensure all **hub** columns used to calculate a primary key hash are presented in the same order across all
+staging tables
+
+!!! note
+Some tables may use different column names for primary key components, so we cannot sort the columns for
+you as we do with hashdiffs.
+
+- For **links**, columns must be sorted by the primary key of the hub and arranged alphabetically by the hub name.
+The order must also be the same as each hub.
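
Reviewer sketch (not part of the commit): the Staging advice in docs/bestpractices.md above, a view that exposes a single load date and is driven by a 'water-level' table, could look roughly like the following. This is a minimal illustration using Python's built-in sqlite3 module rather than Snowflake or dbt. The table, view, and column names (raw_stage_customer, water_level, stg_customer_single_load, last_load_datetime) are assumptions made up for the example and do not come from dbtvault.

```python
import sqlite3


def build_demo() -> sqlite3.Connection:
    """Create an in-memory raw staging table holding a mix of load dates."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE raw_stage_customer (customer_id INTEGER, load_datetime TEXT);
        INSERT INTO raw_stage_customer VALUES
            (1, '2019-10-22'), (2, '2019-10-22'),
            (3, '2019-10-23'), (4, '2019-10-24');

        -- Hypothetical water-level table: last load date processed per source.
        CREATE TABLE water_level (
            source_name TEXT PRIMARY KEY,
            last_load_datetime TEXT
        );
        INSERT INTO water_level VALUES ('raw_stage_customer', '2019-10-22');

        -- The view joins to the water-level table to pick the next
        -- unprocessed load date, so it only ever exposes one load date.
        CREATE VIEW stg_customer_single_load AS
        SELECT s.customer_id, s.load_datetime
        FROM raw_stage_customer AS s
        WHERE s.load_datetime = (
            SELECT MIN(r.load_datetime)
            FROM raw_stage_customer AS r
            JOIN water_level AS w
              ON w.source_name = 'raw_stage_customer'
            WHERE r.load_datetime > w.last_load_datetime
        );
        """
    )
    return conn


def rows_for_current_load(conn: sqlite3.Connection) -> list:
    """Return the rows the staging model would see for this load cycle."""
    return conn.execute(
        "SELECT customer_id, load_datetime "
        "FROM stg_customer_single_load ORDER BY customer_id"
    ).fetchall()


def advance_water_level(conn: sqlite3.Connection) -> None:
    """After a successful load, move the water level to the date just loaded."""
    loaded = conn.execute(
        "SELECT MIN(load_datetime) FROM stg_customer_single_load"
    ).fetchone()[0]
    if loaded is not None:  # nothing left to load: leave the level unchanged
        conn.execute(
            "UPDATE water_level SET last_load_datetime = ? "
            "WHERE source_name = 'raw_stage_customer'",
            (loaded,),
        )
        conn.commit()
```

In this sketch the view never needs to be re-created between loads: calling advance_water_level after each successful run is what moves it on to the next load date. ISO-formatted date strings are assumed so that text comparison orders them correctly.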

docs/changelog.md

Lines changed: 27 additions & 1 deletion
@@ -4,7 +4,26 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [v0.3-pre] - 2019-10-24
+[![Documentation Status](https://readthedocs.org/projects/dbtvault/badge/?version=v0.3-pre)](https://dbtvault.readthedocs.io/en/v0.3-pre/?badge=v0.3-pre)
+
+### Improvements
+
+- We've removed the need to specify full mappings in the ```tgt``` metadata when creating table models.
+Users may now provide a table reference instead, as a shorthand way to keep the column name
+and date type the same as the source, [read the docs](macros.md#using-a-source-reference-for-the-target-metadata) for more details.
+The option to provide a mapping is still available.
+
+- The check for whether a load is a union load or not is now more reliable.
+
+### Documentation
+
+- Updated code samples and explanations according to new functionality
+- Added a best practises page
+- Various clarifications added and errors fixed
+
 ## [v0.2.4-pre] - 2019-10-17
+[![Documentation Status](https://readthedocs.org/projects/dbtvault/badge/?version=v0.2.4-pre)](https://dbtvault.readthedocs.io/en/v0.2.4-pre/?badge=v0.2.4-pre)
 
 ### Bug Fixes
 
@@ -13,6 +32,7 @@ causing subsequent loads after the initial load, to fail.
 
 
 ## [v0.2.3-pre] - 2019-10-08
+[![Documentation Status](https://readthedocs.org/projects/dbtvault/badge/?version=v0.2.3-pre)](https://dbtvault.readthedocs.io/en/v0.2.3-pre/?badge=v0.2.3-pre)
 
 ### Macros
 
@@ -27,6 +47,7 @@ causing subsequent loads after the initial load, to fail.
 - Updated [hash](macros.md#hash) and [multi-hash](macros.md#multi_hash) according to new changes.
 
 ## [v0.2.2-pre] - 2019-10-08
+[![Documentation Status](https://readthedocs.org/projects/dbtvault/badge/?version=v0.2.2-pre)](https://dbtvault.readthedocs.io/en/v0.2.2-pre/?badge=v0.2.2-pre)
 
 ### Documentation
 
@@ -36,6 +57,7 @@ causing subsequent loads after the initial load, to fail.
 - Renamed ```stg_orders_hashed``` back to ```stg_customers_hashed```
 
 ## [v0.2.1-pre] - 2019-10-07
+[![Documentation Status](https://readthedocs.org/projects/dbtvault/badge/?version=v0.2.1-pre)](https://dbtvault.readthedocs.io/en/v0.2.1-pre/?badge=v0.2.1-pre)
 
 ### Documentation
 
@@ -45,6 +67,7 @@ causing subsequent loads after the initial load, to fail.
 - Corrected version in dbt_project.yml
 
 ## [v0.2-pre] - 2019-10-07
+[![Documentation Status](https://readthedocs.org/projects/dbtvault/badge/?version=v0.2-pre)](https://dbtvault.readthedocs.io/en/v0.2-pre/?badge=v0.2-pre)
 
 [Feedback is welcome!](https://github.com/Datavault-UK/dbtvault/issues)
 
@@ -76,8 +99,11 @@ the new and improved features.
 per best practises.
 
 ## [v0.1-pre] - 2019-09 / 2019-10
+[![Documentation Status](https://readthedocs.org/projects/dbtvault/badge/?version=v0.1-pre)](https://dbtvault.readthedocs.io/en/v0.1-pre/?badge=v0.1-pre)
+
 ### Added
 
+
 - Table Macros:
 - [Hub](macros.md#hub_template)
 - [Link](macros.md#link_template)
@@ -95,4 +121,4 @@ the new and improved features.
 
 ### Documentation
 
-- Numerous changes leading up to Version 1.0 release
+- Numerous changes for version 0.1 release
docs/contributing.md

Lines changed: 2 additions & 2 deletions
@@ -32,5 +32,5 @@ please feel free to submit ideas and thoughts!
 Create a post with as much detail as possible; We'll be happy to reply and work with you.
 
 ## Pull requests
-If you've developed something which we can add via a pull request, we'd prefer that you submit an issue first
-so that we can discuss the changes.
+If you've developed something which we can add via a pull request, we're more than happy to consider it, but we'd
+like to discuss the changes first.

docs/demonstration.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 ## Coming soon
 
-We will soon be making available a downloadable example project running dbtvault with the Snowflake TPCH dataset.
+Soon, we will be making available a downloadable example project running dbtvault with Snowflake's TPCH dataset.
 This will showcase dbtvault with pre-written models, giving you further understanding of how it all works.
 
 [Sign up](https://www.data-vault.co.uk/dbtvault/) and get notified when this is available!

docs/gettingstarted.md

Lines changed: 11 additions & 21 deletions
@@ -24,21 +24,22 @@ Happy Data Vaulting! :smile:
 
 5. We assume you already have a raw staging layer.
 
-6. Our macros assume that you are only loading from one set of load dates in a single load cycle (i.e. Your staging layer
+6. Our macros assume that you are only loading from one set of load dates in a single load cycle (i.e. your staging layer
 contains data for one ```load_datetime``` value only). **We will be removing this restriction in future releases**.
 
+7. You should read our [best practices](bestpractices.md) guidance.
+
 ## Setting up sources
 
 We will be using the ```source``` feature of dbt extensively throughout the documentation to make access to source
-data much easier, cleaner and more modular. The main advantage of this is that sources will be included in
-dbt dependency graphs
+data much easier, cleaner and more modular. The main advantage of this is that sources are then included in
+dbt dependency graphs.
 
 We have provided an example below which shows a configuration similar to that used for the examples in our documentation,
-however this feature is documented extensively in dbts own documentation,
-so please [read here](https://docs.getdbt.com/docs/using-sources).
+however this feature is documented extensively in [the documentation for dbt itself](https://docs.getdbt.com/docs/using-sources).
 
-After reading the above documentation, we recommend you place the ```schema.yml``` file you create for your sources,
-in the root of your ```models``` folder, however you can place it where needed for your specific project.
+After reading the above documentation, we recommend that you place the ```schema.yml``` file you create for your sources,
+in the root of your ```models``` folder, however you can place it where needed for your specific project and models.
 
 ```schema.yml```
 
@@ -50,8 +51,8 @@ sources:
     database: MYDATABASE
     schema: MYSCHEMA
     tables:
-      - name: stg_customer
-        identifier: table_1
+      - name: stg_customer # alias
+        identifier: stg_customer_hashed # table name
       - name: ...
 ```
 
@@ -69,15 +70,4 @@ packages:
 And run
 ```dbt deps```
 
-[Read more on package installation (from dbt)](https://docs.getdbt.com/docs/package-management)
-
-
-## Final note before we start
-
-The documentation is written in the context of a simple example, showing a step by step progression towards
-loading a Data Vault 2.0 Data Warehouse. We have documented everything you need to know, but as all use cases will vary,
-you will need to adapt this to your own needs and requirements.
-
-If you need any more detail or require specific guidance, do not hesitate to
-[submit an issue](https://github.com/Datavault-UK/dbtvault/issues).
-We may be able to improve the package based on your feedback, and this will benefit the whole community!
+[Read more on package installation (from dbt)](https://docs.getdbt.com/docs/package-management)
