In this project, I designed and implemented an ETL pipeline for processing well log data on Databricks, using Spark Structured Streaming and Delta Live Tables (DLT). The pipeline handled the ingestion, validation, cleaning, and transformation of large volumes of well data, facilitating the extraction of useful information for geophysical analysis.
Responsibilities:
- Implemented automated pipelines for converting LAS files to JSON format and loading them into Delta Lake (a conversion sketch follows this list).
- Transformed data using PySpark and SQL, including the creation of new metrics and data validation.
- Optimized pipelines through partitioning and Delta tables.
- Developed interactive dashboards for visualizing and monitoring processed data.
- Automated real-time data ingestion processes (see the ingestion sketch after the technology list below).
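To make the LAS-to-JSON conversion concrete, here is a minimal sketch assuming the `lasio` library and newline-delimited JSON output; the file paths, the `WELL` header mnemonic, and the output layout are illustrative assumptions rather than the project's exact implementation.

```python
# Minimal sketch (assumption): convert a LAS file to newline-delimited JSON
# so it can be picked up from a landing folder by Auto Loader.
import lasio

def las_to_json(las_path: str, json_path: str) -> None:
    """Parse a LAS file and write one JSON record per depth step."""
    las = lasio.read(las_path)
    df = las.df().reset_index()          # depth index becomes a regular column
    df["WELL"] = las.well.WELL.value     # keep the well name with every record
    df.to_json(json_path, orient="records", lines=True)

# Hypothetical usage:
# las_to_json("/dbfs/raw/las/20_8_7.las", "/dbfs/landing/json/20_8_7.json")
```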
Technologies: Delta Live Tables, Spark Structured Streaming, Auto Loader, SQL, PySpark.
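The Auto Loader and Structured Streaming ingestion, together with the partitioning mentioned in the list above, can be sketched roughly as follows. `spark` is the session provided by the Databricks notebook, and the paths, checkpoint location, target table, and `WELL` partition column are assumptions; the DLT variant of this step is shown further below.

```python
# Minimal sketch (assumptions: paths, table name, partition column):
# incremental ingestion of the JSON well records with Auto Loader into Delta.
raw_stream = (
    spark.readStream
         .format("cloudFiles")                                   # Databricks Auto Loader
         .option("cloudFiles.format", "json")
         .option("cloudFiles.schemaLocation", "/mnt/well_logs/_schemas/raw")
         .load("/mnt/well_logs/landing/json")
)

(
    raw_stream.writeStream
              .format("delta")
              .option("checkpointLocation", "/mnt/well_logs/_checkpoints/raw")
              .partitionBy("WELL")               # partition pruning keeps per-well queries cheap
              .trigger(availableNow=True)        # process all new files, then stop
              .toTable("well_logs_bronze")
)
```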
This diagram presents a detailed view of the complete ETL process implemented to accomplish the objectives of this project.
This plot shows a selection of logs (GR, ILD, ILM, and NPHI) from well 20_8_7.
For each available log, this view reports metrics such as the maximum, minimum, mean, standard deviation, and the range of valid (non-null) measurements, among others.
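A rough PySpark sketch of how such per-log statistics can be computed is shown below; the table name `well_logs_clean`, the `WELL` grouping column, and the curve list are assumptions used for illustration.

```python
# Minimal sketch (assumed table and column names): per-log summary statistics
# computed over non-null measurements only.
from pyspark.sql import functions as F

LOG_COLUMNS = ["GR", "ILD", "ILM", "NPHI"]          # curves shown in the plot above

logs = spark.table("well_logs_clean")               # assumed cleaned Delta table

aggs = []
for c in LOG_COLUMNS:
    aggs += [
        F.min(c).alias(f"{c}_min"),
        F.max(c).alias(f"{c}_max"),
        F.mean(c).alias(f"{c}_mean"),
        F.stddev(c).alias(f"{c}_stddev"),
        F.count(c).alias(f"{c}_valid_count"),       # count() skips nulls, i.e. valid values only
    ]

summary = logs.groupBy("WELL").agg(*aggs)
summary.show()
```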
This is the DLT pipeline that orchestrates the execution of the DLT tasks (decorated functions) defined in the notebooks; it shows the created tables and their lineage.
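For context, a DLT task of the kind orchestrated here is simply a decorated function that DLT materializes as a table and tracks in the lineage graph. The sketch below shows the general pattern, including an expectation used for validation; table names, the source path, and the quality rule are illustrative assumptions, not the project's actual definitions.

```python
# Minimal sketch (assumed names, path, and expectation rule) of DLT tasks:
# decorated functions that DLT materializes as tables and links in the lineage graph.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Raw well-log records ingested from JSON with Auto Loader.")
def well_logs_raw():
    return (
        spark.readStream
             .format("cloudFiles")
             .option("cloudFiles.format", "json")
             .load("/mnt/well_logs/landing/json")
    )

@dlt.table(comment="Validated and cleaned well-log records.")
@dlt.expect_or_drop("valid_gr", "GR IS NOT NULL")      # example data-quality rule
def well_logs_clean():
    return dlt.read_stream("well_logs_raw").withColumn("ingested_at", F.current_timestamp())
```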