caogiathinh/README.md

Cao Gia Thịnh

Data Engineer


Welcome to my GitHub profile!

I'm Cao Gia Thinh, a final-year Computer Science student with a deep focus on Data Engineering. I am passionate about designing and building scalable, high-performance data systems that transform raw data into valuable insights to support business decision-making.



🛠️ Tech Stack & Core Competencies

Python SQL Apache Spark dbt Google Cloud Docker PostgreSQL Git Kestra


🚀 Key Projects

These are my flagship projects that showcase my skills and experience.

Built a complete data platform on Google Cloud to collect, process, and analyze retail data from various sources.

  • Orchestration: Used Kestra (deployed on Cloud Composer) to schedule and orchestrate data ingestion pipelines from Parquet files.
  • Data Lake & Warehouse: Stored raw data in Google Cloud Storage (GCS), then cleaned, transformed, and loaded it into Google BigQuery using Apache Spark.
  • Data Modeling: Implemented a Star Schema within BigQuery to optimize for analytical queries.
  • Deployment: Containerized the entire application and its dependencies using Docker to ensure consistency across environments.

Technologies: GCP (BigQuery, GCS, Composer), Kestra, Apache Spark, Docker, Python, SQL, dbt, Google Data Studio.
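
The star-schema step in this project can be illustrated with a minimal, standard-library-only sketch. It is not the project's actual code: the record fields, the `build_star_schema` helper, and the sample values are all hypothetical, and plain Python lists stand in for BigQuery tables.

```python
# Hypothetical flat retail records; field names and values are illustrative only.
RAW_SALES = [
    {"order_id": 1, "product": "Laptop", "category": "Electronics", "quantity": 1, "amount": 1200.0},
    {"order_id": 2, "product": "Mouse", "category": "Electronics", "quantity": 2, "amount": 50.0},
    {"order_id": 3, "product": "Laptop", "category": "Electronics", "quantity": 1, "amount": 1150.0},
]


def build_star_schema(rows):
    """Split flat rows into a product dimension and a sales fact table."""
    product_keys = {}  # (product, category) -> surrogate key
    fact_sales = []
    for row in rows:
        natural_key = (row["product"], row["category"])
        # Assign a new surrogate key the first time a product is seen.
        product_key = product_keys.setdefault(natural_key, len(product_keys) + 1)
        # The fact row keeps only measures and the foreign key to the dimension.
        fact_sales.append({
            "order_id": row["order_id"],
            "product_key": product_key,
            "quantity": row["quantity"],
            "amount": row["amount"],
        })
    dim_product = [
        {"product_key": key, "product": product, "category": category}
        for (product, category), key in product_keys.items()
    ]
    return dim_product, fact_sales


def revenue_by_product(dim_product, fact_sales):
    """Join the fact table to the dimension and sum the amount per product."""
    names = {d["product_key"]: d["product"] for d in dim_product}
    totals = {}
    for fact in fact_sales:
        name = names[fact["product_key"]]
        totals[name] = totals.get(name, 0.0) + fact["amount"]
    return totals


dim_product, fact_sales = build_star_schema(RAW_SALES)
print(revenue_by_product(dim_product, fact_sales))
```

Keeping descriptive attributes in the dimension and only keys and measures in the fact table is what lets analytical queries aggregate over a narrow fact table and join to dimensions only when labels are needed.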


Designed and implemented a modern data warehouse to empower Sales and Marketing teams with advanced analytics.

  • ETL & Transformation: Used SQL to extract, transform, and load data from source systems into the destination data warehouse.
  • Data Warehouse Design: Architected a DWH schema on Microsoft SQL Server.

Technologies: T-SQL, Microsoft SQL Server.
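
A SQL-driven ETL step of the kind described for this project might look like the sketch below. It is only an illustration under stated assumptions: SQLite (via Python's built-in `sqlite3` module) stands in for Microsoft SQL Server, and the `src_orders` and `dwh_orders` tables, their columns, and the sample rows are hypothetical.

```python
import sqlite3


def run_etl(conn):
    """Load a cleaned copy of a hypothetical source table into a warehouse table."""
    cur = conn.cursor()
    cur.executescript(
        """
        -- Extract: a stand-in source table with messy values.
        CREATE TABLE src_orders (order_id INTEGER, customer TEXT, amount TEXT);
        INSERT INTO src_orders VALUES
            (1, ' alice ', '10.50'),
            (2, 'BOB', '20.00'),
            (2, 'BOB', '20.00'),   -- exact duplicate row
            (3, 'carol', NULL);    -- missing amount

        -- Destination table in the warehouse layer.
        CREATE TABLE dwh_orders (
            order_id INTEGER PRIMARY KEY,
            customer TEXT NOT NULL,
            amount   REAL NOT NULL
        );

        -- Transform and load: trim and lowercase names, cast amounts,
        -- drop duplicates and rows without an amount.
        INSERT INTO dwh_orders (order_id, customer, amount)
        SELECT DISTINCT order_id, LOWER(TRIM(customer)), CAST(amount AS REAL)
        FROM src_orders
        WHERE amount IS NOT NULL;
        """
    )
    conn.commit()
    return cur.execute(
        "SELECT order_id, customer, amount FROM dwh_orders ORDER BY order_id"
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    for row in run_etl(connection):
        print(row)
```

The same `INSERT ... SELECT` pattern carries over to T-SQL, although function names and type names differ between SQLite and SQL Server.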

📫 Let's Connect!

I'm always open to discussing new opportunities, interesting projects, or anything related to data and technology. Feel free to reach out!

LinkedIn Email


Pinned

  1. urban-mobility-elt-pipeline (Public template)

    Built a complete end-to-end data platform to ingest, process, and analyze complex, multi-source public datasets for business intelligence.

    Python 9

  2. modern-data-warehouse (Public)

    Building a modern data warehouse with SQL Server, including ETL processes, data modeling and analytics

    TSQL 8

  3. sql (Public)

    A repository for everything about SQL and Databases. Covers SQL queries from basic to advanced, database design, and normalization forms.

    3

  4. data-structures-algorithms (Public)

    A repository for Data Structures and Algorithms (DSA) implemented in C++. A place to practice and reinforce fundamental concepts such as graph data structures and sorting algorithms.

    7

  5. object-oriented-java (Public)

    Object-oriented programming with Java.

    Java 1

  6. database-craftsman (Public)

    Forked from doit-now/database-craftsman

    TSQL 5