    Repositories list

    • cube

      Public
      📊 Cube — The Semantic Layer for Building Data Applications
      Rust
      1.9k000 Updated Jun 19, 2025
    • airbyte

      Public
      Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
      Python
      4.7k203 Updated May 8, 2025
    • airflow

      Public
      Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
      Python
      15k200 Updated May 6, 2025
    • trino

      Public
      Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
      Java
      3.3k000 Updated Apr 9, 2025
    • Run your dbt Core projects as Apache Airflow DAGs and Task Groups with a few lines of code
      Python
      238000 Updated Feb 19, 2025
    • Python
      5000 Updated Feb 11, 2025
    • The athena adapter plugin for dbt (https://getdbt.com)
      Python
      107001 Updated Apr 25, 2024
    • muttlib

      Public
      Python
      63020 Updated Dec 15, 2023
    • Docker Image for MLFlow Cluster
      Shell
      2000 Updated Dec 13, 2023
    • Big Data Workshop in Spanish
      HTML
      59501 Updated Aug 25, 2023
    • soam

      Public
      the forecasting framework you will love
      Jupyter Notebook
      11101 Updated Jun 21, 2023
    • .github

      Public
      A repo to put all the info about us.
      0000 Updated May 3, 2023
    • Data integration platform for ELT pipelines from APIs, databases & files to warehouses & lakes.
      Python
      4.7k002 Updated Mar 17, 2023
    • 0000 Updated Feb 22, 2023
    • Jupyter Notebook
      2000 Updated Jul 26, 2022
    • Always know what to expect from your data.
      Python
      1.6k000 Updated May 9, 2022
    • ⚙️ Kubernetes Helm Charts
      Smarty
      4101 Updated Apr 28, 2022
    • dbt-core

      Public
      dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
      Python
      1.8k000 Updated Apr 25, 2022
    • dbt-athena

      Public archive
      The athena adapter plugin for dbt (https://getdbt.com)
      Python
      76000 Updated Feb 16, 2022
    • tap-mambu

      Public
      Python
      25101 Updated Jan 28, 2022
    • A terraform module to productionalize MLflow on top of AWS (Fargate + Aurora Serverless + S3)
      HCL
      21000 Updated Sep 8, 2021
    • Python
      24000 Updated Aug 31, 2021
    • search hack 2020
      Python
      6000 Updated Aug 13, 2021
    • objinsync

      Public
      Continuously synchronize directories from remote object store to local filesystem
      Go
      21000 Updated Apr 22, 2021
    • Python
      7000 Updated Mar 1, 2021
    • Write Singer data to JSONL files
      Python
      17000 Updated Feb 18, 2021
    • Code for Mutt's MLFlow blog post
      Jupyter Notebook
      1600 Updated Feb 12, 2021
    • Shared software among connectors that target distributed filesystems and cloud storage.
      Java
      155000 Updated Feb 1, 2021
    • Kafka Connect suite of connectors for Cloud storage (Amazon S3)
      Java
      341000 Updated Feb 1, 2021
    • darts

      Public
      A python library for easy manipulation and forecasting of time series.
      Python
      949000 Updated Sep 7, 2020