From a7bfee103f767c50d3b47143c5533c74bdb99f14 Mon Sep 17 00:00:00 2001 From: Diorge Brognara Date: Mon, 5 Feb 2024 11:45:19 -0300 Subject: [PATCH 1/7] docs(elementary_tests): columns draft Added empty descriptions for existing models under the elementary_tests documentation. --- models/elementary_tests.yml | 232 ++++++++++++++++++++++++++++++++++++ 1 file changed, 232 insertions(+) diff --git a/models/elementary_tests.yml b/models/elementary_tests.yml index d94d7933b..50ffc1d62 100644 --- a/models/elementary_tests.yml +++ b/models/elementary_tests.yml @@ -7,21 +7,225 @@ models: This incremental table is used to store the metrics over time. On each anomaly detection test, the test queries this table for historical metrics, and compares to the latest values. The table is updated with new metrics on the on-run-end named handle_test_results that is executed at the end of dbt test invocations. + columns: + - name: bucket_duration_hours + data_type: number + description: "" + + - name: bucket_end + data_type: timestamp_ntz + description: "" + + - name: bucket_start + data_type: timestamp_ntz + description: "" + + - name: column_name + data_type: string + description: "" + + - name: dimension + data_type: string + description: "" + + - name: dimension_value + data_type: string + description: "" + + - name: full_table_name + data_type: string + description: "" + + - name: id + data_type: string + description: "" + + - name: metric_name + data_type: string + description: "" + + - name: metric_properties + data_type: string + description: "" + + - name: metric_value + data_type: float + description: "" + + - name: source_value + data_type: string + description: "" + + - name: updated_at + data_type: timestamp_ntz + description: "" + - name: metrics_anomaly_score description: > This is a view on `data_monitoring_metrics` that runs the same query the anomaly detection tests run to calculate anomaly scores. 
The purpose of this view is to provide visibility to the results of anomaly detection tests. + columns: + - name: anomaly_score + data_type: float + description: "" + + - name: bucket_end + data_type: timestamp_ntz + description: "" + + - name: bucket_start + data_type: timestamp_ntz + description: "" + + - name: column_name + data_type: string + description: "" + + - name: dimension + data_type: string + description: "" + + - name: dimension_value + data_type: string + description: "" + + - name: full_table_name + data_type: string + description: "" + + - name: id + data_type: string + description: "" + + - name: is_anomaly + data_type: boolean + description: "" + + - name: latest_metric_value + data_type: float + description: "" + + - name: metric_name + data_type: string + description: "" + + - name: training_avg + data_type: float + description: "" + + - name: training_end + data_type: timestamp_ntz + description: "" + + - name: training_set_size + data_type: number + description: "" + + - name: training_start + data_type: timestamp_ntz + description: "" + + - name: training_stddev + data_type: float + description: "" + + - name: updated_at + data_type: timestamp_ntz + description: "" + - name: anomaly_threshold_sensitivity description: > This is a view on `metrics_anomaly_score` that calculates if values of metrics from latest runs would have been considered anomalies in different anomaly scores. This can help you decide if there is a need to adjust the `anomaly_score_threshold`. 
+ columns: + - name: anomaly_score + data_type: float + description: "" + + - name: column_name + data_type: string + description: "" + + - name: full_table_name + data_type: string + description: "" + + - name: latest_metric_value + data_type: float + description: "" + + - name: metric_avg + data_type: float + description: "" + + - name: metric_name + data_type: string + description: "" + + - name: metric_stddev + data_type: float + description: "" + + - name: is_anomaly_1_5 + data_type: boolean + description: "" + + - name: is_anomaly_2 + data_type: boolean + description: "" + + - name: is_anomaly_2_5 + data_type: boolean + description: "" + + - name: is_anomaly_3 + data_type: boolean + description: "" + + - name: is_anomaly_3_5 + data_type: boolean + description: "" + + - name: is_anomaly_4 + data_type: boolean + description: "" + + - name: is_anomaly_4_5 + data_type: boolean + description: "" + - name: monitors_runs description: > This is a view on `data_monitoring_metrics` that is used to determine when a specific anomaly detection test was last executed. Each anomaly detection test queries this view to decide on a start time for collecting metrics. + columns: + - name: column_name + data_type: string + description: "" + + - name: first_bucket_end + data_type: timestamp_ntz + description: "" + + - name: full_table_name + data_type: string + description: "" + + - name: last_bucket_end + data_type: timestamp_ntz + description: "" + + - name: metric_name + data_type: string + description: "" + + - name: metric_properties + data_type: string + description: "" + - name: schema_columns_snapshot description: > @@ -29,3 +233,31 @@ models: In order to compare current schema to previous state, we must store the previous state. The data is from a view that queries the data warehouse information schema. This is an incremental table. 
+ columns: + - name: column_name + data_type: string + description: "" + + - name: column_state_id + data_type: string + description: "" + + - name: data_type + data_type: string + description: "" + + - name: detected_at + data_type: timestamp_ntz + description: "" + + - name: full_column_name + data_type: string + description: "" + + - name: full_table_name + data_type: string + description: "" + + - name: is_new + description: "" + data_type: boolean From 9ae03af51fb6bfc1d514d94c2e6855401462c773 Mon Sep 17 00:00:00 2001 From: Diorge Brognara Date: Mon, 5 Feb 2024 11:50:01 -0300 Subject: [PATCH 2/7] docs(alerts_views): columns draft Added empty descriptions for existing models under the alerts_views documentation. --- models/alerts_views.yml | 335 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 335 insertions(+) diff --git a/models/alerts_views.yml b/models/alerts_views.yml index 530bbd4c7..81ee42620 100644 --- a/models/alerts_views.yml +++ b/models/alerts_views.yml @@ -5,19 +5,354 @@ models: description: > A view that is used by the Elementary CLI to generate models alerts, including all the fields the alert will include such as owner, tags, error message, etc. It joins data about models and snapshots run results, and filters alerts according to configuration. 
+ columns: + - name: alert_id + data_type: string + description: "" + + - name: alias + data_type: string + description: "" + + - name: database_name + data_type: string + description: "" + + - name: detected_at + data_type: string + description: "" + + - name: full_refresh + data_type: boolean + description: "" + + - name: materialization + data_type: string + description: "" + + - name: message + data_type: string + description: "" + + - name: original_path + data_type: string + description: "" + + - name: owners + data_type: string + description: "" + + - name: path + data_type: string + description: "" + + - name: schema_name + data_type: string + description: "" + + - name: status + data_type: string + description: "" + + - name: tags + data_type: string + description: "" + + - name: unique_id + data_type: string + description: "" + - name: alerts_dbt_tests description: > A view that is used by the Elementary CLI to generate dbt tests alerts, including all the fields the alert will include such as owner, tags, error message, etc. This view includes data about all dbt tests except elementary tests. It filters alerts according to configuration. 
+ columns: + - name: alert_description + data_type: string + description: "" + + - name: alert_id + data_type: string + description: "" + + - name: alert_results_query + data_type: string + description: "" + + - name: alert_type + data_type: string + description: "" + + - name: column_name + data_type: string + description: "" + + - name: database_name + data_type: string + description: "" + + - name: data_issue_id + data_type: string + description: "" + + - name: detected_at + data_type: timestamp_ntz + description: "" + + - name: model_unique_id + data_type: string + description: "" + + - name: other + data_type: string + description: "" + + - name: owners + data_type: string + description: "" + + - name: result_rows + data_type: string + description: "" + + - name: schema_name + data_type: string + description: "" + + - name: severity + data_type: string + description: "" + + - name: status + data_type: string + description: "" + + - name: sub_type + data_type: string + description: "" + + - name: table_name + data_type: string + description: "" + + - name: tags + data_type: string + description: "" + + - name: test_execution_id + data_type: string + description: "" + + - name: test_name + data_type: string + description: "" + + - name: test_params + data_type: string + description: "" + + - name: test_short_name + data_type: string + description: "" + + - name: test_unique_id + data_type: string + description: "" + - name: alerts_anomaly_detection description: > A view that is used by the Elementary CLI to generate alerts on data anomalies detected using the elementary anomaly detection tests. The view filters alerts according to configuration. 
+ columns: + - name: alert_description + data_type: string + description: "" + + - name: alert_id + data_type: string + description: "" + + - name: alert_results_query + data_type: string + description: "" + + - name: alert_type + data_type: string + description: "" + + - name: column_name + data_type: string + description: "" + + - name: database_name + data_type: string + description: "" + + - name: data_issue_id + data_type: string + description: "" + + - name: detected_at + data_type: timestamp_ntz + description: "" + + - name: model_unique_id + data_type: string + description: "" + + - name: other + data_type: string + description: "" + + - name: owners + data_type: string + description: "" + + - name: result_rows + data_type: string + description: "" + + - name: schema_name + data_type: string + description: "" + + - name: severity + data_type: string + description: "" + + - name: status + data_type: string + description: "" + + - name: sub_type + data_type: string + description: "" + + - name: table_name + data_type: string + description: "" + + - name: tags + data_type: string + description: "" + + - name: test_execution_id + data_type: string + description: "" + + - name: test_name + data_type: string + description: "" + + - name: test_params + data_type: string + description: "" + + - name: test_short_name + data_type: string + description: "" + + - name: test_unique_id + data_type: string + description: "" + - name: alerts_schema_changes description: > A view that is used by the Elementary CLI to generate alerts on schema changes detected using elementary tests. The view filters alerts according to configuration. 
+ columns: + - name: alert_description + data_type: string + description: "" + + - name: alert_id + data_type: string + description: "" + + - name: alert_results_query + data_type: string + description: "" + + - name: alert_type + data_type: string + description: "" + + - name: column_name + data_type: string + description: "" + + - name: database_name + data_type: string + description: "" + + - name: data_issue_id + data_type: string + description: "" + + - name: detected_at + data_type: timestamp_ntz + description: "" + + - name: model_unique_id + data_type: string + description: "" + + - name: other + data_type: string + description: "" + + - name: owners + data_type: string + description: "" + + - name: result_rows + data_type: string + description: "" + + - name: schema_name + data_type: string + description: "" + + - name: severity + data_type: string + description: "" + + - name: status + data_type: string + description: "" + + - name: sub_type + data_type: string + description: "" + + - name: table_name + data_type: string + description: "" + + - name: tags + data_type: string + description: "" + + - name: test_execution_id + data_type: string + description: "" + + - name: test_name + data_type: string + description: "" + + - name: test_params + data_type: string + description: "" + + - name: test_short_name + data_type: string + description: "" + + - name: test_unique_id + description: "" + data_type: string From af2e23471a43b863c4180bbada0eaad0a6980efc Mon Sep 17 00:00:00 2001 From: Diorge Brognara Date: Mon, 5 Feb 2024 11:53:14 -0300 Subject: [PATCH 3/7] docs(run_results): columns draft Added empty descriptions for existing models under the run_results documentation. 
--- models/run_results.yml | 343 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 343 insertions(+) diff --git a/models/run_results.yml b/models/run_results.yml index 075c4bb12..d693b309a 100644 --- a/models/run_results.yml +++ b/models/run_results.yml @@ -228,21 +228,364 @@ models: Run results of all dbt tests, with fields and metadata needed to produce the Elementary report UI. Each row is the result of a single test, including native dbt tests, packages tests and elementary tests. New data is loaded to this model on an on-run-end hook named `elementary.handle_tests_results`. + columns: + - name: column_name + data_type: string + description: "" + + - name: database_name + data_type: string + description: "" + + - name: data_issue_id + data_type: string + description: "" + + - name: detected_at + data_type: timestamp_ntz + description: "" + + - name: failures + data_type: number + description: "" + + - name: id + data_type: string + description: "" + + - name: invocation_id + data_type: string + description: "" + + - name: model_unique_id + data_type: string + description: "" + + - name: other + data_type: string + description: "" + + - name: owners + data_type: string + description: "" + + - name: result_rows + data_type: string + description: "" + + - name: schema_name + data_type: string + description: "" + + - name: severity + data_type: string + description: "" + + - name: status + data_type: string + description: "" + + - name: table_name + data_type: string + description: "" + + - name: tags + data_type: string + description: "" + + - name: test_alias + data_type: string + description: "" + + - name: test_execution_id + data_type: string + description: "" + + - name: test_name + data_type: string + description: "" + + - name: test_params + data_type: string + description: "" + + - name: test_results_description + data_type: string + description: "" + + - name: test_results_query + data_type: string + description: "" + + - name: test_short_name + 
data_type: string + description: "" + + - name: test_sub_type + data_type: string + description: "" + + - name: test_type + data_type: string + description: "" + + - name: test_unique_id + data_type: string + description: "" + - name: model_run_results description: > Run results of dbt models, enriched with models metadata. Each row is the result of a single model. This is a view that joins data from `dbt_run_results` and `dbt_models`. + columns: + - name: alias + data_type: string + description: "" + + - name: compiled_code + data_type: string + description: "" + + - name: compile_completed_at + data_type: string + description: "" + + - name: compile_started_at + data_type: string + description: "" + + - name: database_name + data_type: string + description: "" + + - name: execute_completed_at + data_type: string + description: "" + + - name: execute_started_at + data_type: string + description: "" + + - name: execution_time + data_type: float + description: "" + + - name: full_refresh + data_type: boolean + description: "" + + - name: generated_at + data_type: string + description: "" + + - name: invocation_id + data_type: string + description: "" + + - name: is_the_first_invocation_of_the_day + data_type: boolean + description: "" + + - name: is_the_last_invocation_of_the_day + data_type: boolean + description: "" + + - name: materialization + data_type: string + description: "" + + - name: message + data_type: string + description: "" + + - name: model_execution_id + data_type: string + description: "" + + - name: model_invocation_reverse_index + data_type: number + description: "" + + - name: name + data_type: string + description: "" + + - name: original_path + data_type: string + description: "" + + - name: owner + data_type: string + description: "" + + - name: package_name + data_type: string + description: "" + + - name: path + data_type: string + description: "" + + - name: query_id + data_type: string + description: "" + + - name: schema_name + 
data_type: string + description: "" + + - name: status + data_type: string + description: "" + + - name: tags + data_type: string + description: "" + + - name: thread_id + data_type: string + description: "" + + - name: unique_id + data_type: string + description: "" + - name: snapshot_run_results description: > Run results of dbt snapshots, enriched with snapshots metadata. Each row is the result of a single snapshot. This is a view that joins data from `dbt_run_results` and `dbt_snapshots`. + columns: + - name: alias + data_type: string + description: "" + + - name: compiled_code + data_type: string + description: "" + + - name: compile_completed_at + data_type: string + description: "" + + - name: compile_started_at + data_type: string + description: "" + + - name: database_name + data_type: string + description: "" + + - name: execute_completed_at + data_type: string + description: "" + + - name: execute_started_at + data_type: string + description: "" + + - name: execution_time + data_type: float + description: "" + + - name: full_refresh + data_type: boolean + description: "" + + - name: generated_at + data_type: string + description: "" + + - name: invocation_id + data_type: string + description: "" + + - name: materialization + data_type: string + description: "" + + - name: message + data_type: string + description: "" + + - name: model_execution_id + data_type: string + description: "" + + - name: name + data_type: string + description: "" + + - name: original_path + data_type: string + description: "" + + - name: owner + data_type: string + description: "" + + - name: package_name + data_type: string + description: "" + + - name: path + data_type: string + description: "" + + - name: query_id + data_type: string + description: "" + + - name: schema_name + data_type: string + description: "" + + - name: status + data_type: string + description: "" + + - name: tags + data_type: string + description: "" + + - name: thread_id + data_type: string + 
 description: "" + + - name: unique_id + data_type: string + description: "" + - name: job_run_results description: > Run results of dbt invocations, enriched with jobs metadata. Each row is the result of a single job. This is a view on `dbt_invocations`. + columns: + - name: id + data_type: string + description: "" + + - name: name + data_type: string + description: "" + + - name: run_completed_at + data_type: timestamp_ntz + description: "" + + - name: run_execution_time + data_type: number + description: "" + + - name: run_id + data_type: string + description: "" + + - name: run_started_at + description: "" + data_type: timestamp_ntz From bc66eb2c3922d30b6548e3cdc9a31c83fbeff7d7 Mon Sep 17 00:00:00 2001 From: Diorge Brognara Date: Tue, 20 Feb 2024 14:00:57 -0300 Subject: [PATCH 4/7] docs(dbt_artifacts): add seeds docs --- models/dbt_artifacts.yml | 66 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 66 insertions(+) diff --git a/models/dbt_artifacts.yml b/models/dbt_artifacts.yml index 29d7e2e78..aa69e39e4 100644 --- a/models/dbt_artifacts.yml +++ b/models/dbt_artifacts.yml @@ -488,3 +488,69 @@ models: - name: generated_at data_type: string description: "" + + - name: dbt_seeds + description: > + Metadata about seeds in the project, with each record representing a declared seed. + Data is loaded every time this model is executed. + It is recommended to execute the model every time a change is merged to the project.
+ columns: + - name: alias + data_type: string + description: "" + + - name: checksum + data_type: string + description: "" + + - name: database_name + data_type: string + description: "" + + - name: description + data_type: string + description: "" + + - name: generated_at + data_type: string + description: "" + + - name: meta + data_type: string + description: "" + + - name: metadata_hash + data_type: string + description: "" + + - name: name + data_type: string + description: "" + + - name: original_path + data_type: string + description: "" + + - name: owner + data_type: string + description: "" + + - name: package_name + data_type: string + description: "" + + - name: path + data_type: string + description: "" + + - name: schema_name + data_type: string + description: "" + + - name: tags + data_type: string + description: "" + + - name: unique_id + data_type: string + description: "" From 0f0c36678a853dc1ace80163d840ce8f96d912a2 Mon Sep 17 00:00:00 2001 From: Diorge Brognara Date: Tue, 20 Feb 2024 14:05:57 -0300 Subject: [PATCH 5/7] docs(run_results): docs for freshness check runs --- models/run_results.yml | 60 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 60 insertions(+) diff --git a/models/run_results.yml b/models/run_results.yml index d693b309a..398262b60 100644 --- a/models/run_results.yml +++ b/models/run_results.yml @@ -589,3 +589,63 @@ models: - name: run_started_at description: "" data_type: timestamp_ntz + + - name: dbt_source_freshness_results + description: > + Run results of dbt freshness checks. 
+ columns: + - name: compile_completed_at + data_type: string + description: "" + + - name: compile_started_at + data_type: string + description: "" + + - name: created_at + data_type: timestamp_ntz + description: "" + + - name: error + data_type: string + description: "" + + - name: execute_completed_at + data_type: string + description: "" + + - name: execute_started_at + data_type: string + description: "" + + - name: generated_at + data_type: string + description: "" + + - name: invocation_id + data_type: string + description: "" + + - name: max_loaded_at + data_type: string + description: "" + + - name: max_loaded_at_time_ago_in_s + data_type: float + description: "" + + - name: snapshotted_at + data_type: string + description: "" + + - name: source_freshness_execution_id + data_type: string + description: "" + + - name: status + data_type: string + description: "" + + - name: unique_id + data_type: string + description: "" From eb6d1906ae7b0fe0710586baf8ea0f0004ece10d Mon Sep 17 00:00:00 2001 From: Diorge Brognara Date: Tue, 20 Feb 2024 14:14:46 -0300 Subject: [PATCH 6/7] docs(alerts): docs for source freshness alert --- models/alerts_views.yml | 82 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 82 insertions(+) diff --git a/models/alerts_views.yml b/models/alerts_views.yml index 81ee42620..805e7d876 100644 --- a/models/alerts_views.yml +++ b/models/alerts_views.yml @@ -356,3 +356,85 @@ models: - name: test_unique_id description: "" data_type: string + + - name: alerts_dbt_source_freshness + columns: + - name: alert_id + data_type: string + description: "" + + - name: max_loaded_at + data_type: string + description: "" + + - name: snapshotted_at + data_type: string + description: "" + + - name: detected_at + data_type: string + description: "" + + - name: max_loaded_at_time_ago_in_s + data_type: float + description: "" + + - name: status + data_type: string + description: "" + + - name: error + data_type: string + description: "" + + - name: unique_id +
data_type: string + description: "" + + - name: database_name + data_type: string + description: "" + + - name: schema_name + data_type: string + description: "" + + - name: source_name + data_type: string + description: "" + + - name: identifier + data_type: string + description: "" + + - name: freshness_error_after + data_type: string + description: "" + + - name: freshness_warn_after + data_type: string + description: "" + + - name: freshness_filter + data_type: string + description: "" + + - name: tags + data_type: string + description: "" + + - name: meta + data_type: string + description: "" + + - name: owner + data_type: string + description: "" + + - name: package_name + data_type: string + description: "" + + - name: path + data_type: string + description: "" From 21614b34f655177de1150821a2471bd5be161d91 Mon Sep 17 00:00:00 2001 From: Diorge Brognara Date: Tue, 20 Feb 2024 14:42:30 -0300 Subject: [PATCH 7/7] docs(dbt_artifacts): docs for dbt_columns --- models/dbt_artifacts.yml | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/models/dbt_artifacts.yml b/models/dbt_artifacts.yml index aa69e39e4..b9db449ea 100644 --- a/models/dbt_artifacts.yml +++ b/models/dbt_artifacts.yml @@ -554,3 +554,32 @@ models: - name: unique_id data_type: string description: "" + + - name: dbt_columns + description: > + View of all columns in dbt models. + Each row is the description of a single column. + columns: + - name: full_table_name + data_type: string + description: "" + + - name: database_name + data_type: string + description: "" + + - name: schema_name + data_type: string + description: "" + + - name: table_name + data_type: string + description: "" + + - name: column_name + data_type: string + description: "" + + - name: data_type + data_type: string + description: ""