Skip to content

Repos for learning and getting hands on experience with dbt. Following the dbt online courses: dbt Fundamentals and Refactoring SQL for Modularity.

Notifications You must be signed in to change notification settings

jacob-mennell/dbtLearn

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Learn dbt Project

Repository aims to serve as valuable learning resources to grasp the fundamentals of dbt.

Introduction to dbt (data build tool):

dbt, or data build tool, is an open-source command-line tool that empowers analysts and data engineers to transform data in their data warehouse more effectively. It takes a modular and version-controlled approach to data transformation, enabling teams to collaboratively build, maintain, and document data pipelines. dbt simplifies the process of writing SQL code, organizing it into structured models, and managing dependencies.

Definitions of dbt Concepts:

  1. Models: Models in dbt are SQL files that define the transformations and calculations applied to source data to create structured and refined tables. Models encapsulate business logic and calculations in SQL queries and enable data transformation to be defined in a modular and easily maintainable way.

  2. Sources: Sources in dbt are raw or lightly transformed tables or views from your data warehouse. They serve as the foundation for building models. By defining sources, dbt can abstract the complexity of accessing and loading raw data, allowing analysts to focus on creating meaningful transformations.

  3. Seeds: Seeds in dbt are the initial data sets that are loaded into your data warehouse. They are often used for reference tables or dimensions. Seeds enable you to start your analysis with predefined data, streamlining the initial stages of your data transformation process.

  4. Snapshots: Snapshots in dbt are historical representations of your data that capture the state of your models at a specific point in time. They are useful for creating time-series analysis or preserving snapshots of data for auditing purposes.

  5. Macros: Macros in dbt are reusable SQL code snippets that can be invoked across models and other dbt files. They help standardize and simplify complex transformations, allowing you to achieve consistency and reusability in your codebase.

  6. Tests: Tests in dbt are SQL expressions that are used to verify the quality and accuracy of your data transformations. They allow you to identify data anomalies, inconsistencies, and errors. Tests can include checks for data types, null values, duplicates, and more, ensuring that the transformed data meets expected standards.

  7. Docs: dbt generates documentation for your data transformations automatically. Documentation includes descriptions of models, columns, sources, snapshots, tests, seeds, and macros, making it easier for team members to understand the purpose and characteristics of each component in your data transformation process. Documentation helps maintain transparency and collaboration within the team.

Summary

By utilizing dbt's modeling, sourcing, testing, snapshotting, seeding, macro, and documentation capabilities, data teams can streamline their data transformation workflows, improve data quality, and enhance collaboration among team members.

Code Clean Up Tips:

  • First list import CTEs e.g. orders as, payments as (these should be relatively simple e.g. selects with where statemetns), then Logital CTEs, then Final CTEs, then simple SELECT statement.
  • Implement CTE with WITH to organise code
  • Rick click, command palette, make everything lowercase. Supports consistent code format.

Resources:

  • Learn more about dbt in the docs
  • Check out Discourse for commonly asked questions and answers
  • Join the dbt community to learn from other analytics engineers
  • Find dbt events near you
  • Check out the blog for the latest news on dbt's development and best practices

About

Repos for learning and getting hands on experience with dbt. Following the dbt online courses: dbt Fundamentals and Refactoring SQL for Modularity.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •