10urbagci/Binance-Data-Pipeline

Pipeline Flowchart

How the Pipeline Works

A data pipeline was designed using the Binance API and Google Cloud Platform. In the extract_data.py script, hourly candlestick data for the BTCUSD pair between May 28, 2023 and June 8, 2023 was extracted via the Binance API and then converted to CSV format.
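A minimal sketch of what this extraction step might look like is shown below. It assumes the public Binance /api/v3/klines REST endpoint, the requests and pandas libraries, the BTCUSDT symbol, and a local output file name; the actual extract_data.py may differ.

```python
# Sketch of the extraction step (assumed implementation, not the repo's exact code).
# Pulls hourly klines from Binance's public REST endpoint and writes them to CSV.
import datetime as dt

import pandas as pd
import requests

KLINES_URL = "https://api.binance.com/api/v3/klines"
COLUMNS = [
    "open_time", "open", "high", "low", "close", "volume",
    "close_time", "quote_volume", "trades",
    "taker_buy_base", "taker_buy_quote", "ignore",
]

def to_ms(d: dt.datetime) -> int:
    # Binance expects start/end times as epoch milliseconds.
    return int(d.timestamp() * 1000)

def fetch_hourly_klines(symbol: str, start: dt.datetime, end: dt.datetime) -> pd.DataFrame:
    params = {
        "symbol": symbol,
        "interval": "1h",
        "startTime": to_ms(start),
        "endTime": to_ms(end),
        "limit": 1000,  # 12 days of hourly candles (~288 rows) fits in one page
    }
    resp = requests.get(KLINES_URL, params=params, timeout=30)
    resp.raise_for_status()
    return pd.DataFrame(resp.json(), columns=COLUMNS)

if __name__ == "__main__":
    df = fetch_hourly_klines(
        "BTCUSDT",  # assumed symbol; the README refers to BTCUSD
        dt.datetime(2023, 5, 28, tzinfo=dt.timezone.utc),
        dt.datetime(2023, 6, 8, tzinfo=dt.timezone.utc),
    )
    df.to_csv("btcusd_hourly.csv", index=False)
```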

A Cloud Storage bucket was created using the Google Cloud SDK Shell, and the necessary permissions for the bucket were set up through IAM. The load_data_to_bucket.py script was then used to upload the CSV file to the bucket.
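A possible shape for load_data_to_bucket.py is sketched below using the google-cloud-storage client library; the bucket name, region, and file paths are placeholders, not values taken from the repository.

```python
# Sketch of the upload step (assumed implementation; names are placeholders).
# The bucket itself was created beforehand, e.g. from the Cloud SDK Shell:
#   gsutil mb -l us-central1 gs://binance-pipeline-bucket
from google.cloud import storage

BUCKET_NAME = "binance-pipeline-bucket"   # placeholder bucket name
SOURCE_FILE = "btcusd_hourly.csv"         # local CSV produced by extract_data.py
DEST_BLOB = "raw/btcusd_hourly.csv"       # object path inside the bucket

def upload_csv() -> None:
    client = storage.Client()              # uses application default credentials
    bucket = client.bucket(BUCKET_NAME)
    blob = bucket.blob(DEST_BLOB)
    blob.upload_from_filename(SOURCE_FILE)  # upload the local CSV to the bucket
    print(f"Uploaded {SOURCE_FILE} to gs://{BUCKET_NAME}/{DEST_BLOB}")

if __name__ == "__main__":
    upload_csv()
```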

In the dataflow_pipeline.py script, a workflow was built with Apache Beam and configured to run on Dataflow. The purpose of this workflow is to read the CSV file from Cloud Storage, apply the necessary transformations, and write the data to BigQuery. A ParDo transform with a custom DoFn converts the open_time and close_time fields from epoch time to human-readable timestamps. The pipeline steps are: read the CSV file from the Cloud Storage bucket, parse the CSV and apply the transformations, and finally write the data to the BigQuery table.
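The sketch below illustrates this structure with Apache Beam. The project ID, region, bucket, table name, and the simplified seven-column schema are assumptions; the real dataflow_pipeline.py may be organized differently.

```python
# Simplified sketch of dataflow_pipeline.py (assumed structure; project,
# bucket, and table names are placeholders).
import datetime as dt

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Keep only the first seven kline fields from each CSV row.
COLUMNS = ["open_time", "open", "high", "low", "close", "volume", "close_time"]

class ConvertEpochToTimestamp(beam.DoFn):
    """Convert open_time/close_time from epoch milliseconds to readable timestamps."""
    def process(self, row):
        for field in ("open_time", "close_time"):
            row[field] = dt.datetime.fromtimestamp(
                int(row[field]) / 1000, tz=dt.timezone.utc
            ).strftime("%Y-%m-%d %H:%M:%S")
        yield row

def parse_csv_line(line):
    # zip() truncates to the first seven values of the twelve-column kline row.
    row = dict(zip(COLUMNS, line.split(",")))
    for field in ("open", "high", "low", "close", "volume"):
        row[field] = float(row[field])
    return row

def run():
    options = PipelineOptions(
        runner="DataflowRunner",
        project="my-gcp-project",          # placeholder project id
        region="us-central1",
        temp_location="gs://binance-pipeline-bucket/temp",
    )
    with beam.Pipeline(options=options) as p:
        (
            p
            | "ReadCSV" >> beam.io.ReadFromText(
                "gs://binance-pipeline-bucket/raw/btcusd_hourly.csv",
                skip_header_lines=1,
            )
            | "ParseCSV" >> beam.Map(parse_csv_line)
            | "ConvertTimes" >> beam.ParDo(ConvertEpochToTimestamp())
            | "WriteToBQ" >> beam.io.WriteToBigQuery(
                "my-gcp-project:binance.btcusd_hourly",   # placeholder table
                schema="open_time:STRING,open:FLOAT,high:FLOAT,low:FLOAT,"
                       "close:FLOAT,volume:FLOAT,close_time:STRING",
                write_disposition=beam.io.BigQueryDisposition.WRITE_TRUNCATE,
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
            )
        )

if __name__ == "__main__":
    run()
```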

The cloud components of the project were containerized with Docker. The dataflow_pipeline.py script was also run inside a container: an image was built from a Dockerfile, a container was started from that image, and the script was executed in it.
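An illustrative Dockerfile for this step might look like the following; the base image, requirements file, and credential handling are assumptions rather than the repository's actual contents.

```dockerfile
# Illustrative Dockerfile (assumed contents; the actual file may differ).
FROM python:3.10-slim

WORKDIR /app

# Install the pipeline's dependencies (assumed requirements.txt).
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the pipeline script; GCP credentials are assumed to be supplied at
# runtime via a mounted service-account key or environment variables.
COPY dataflow_pipeline.py .

CMD ["python", "dataflow_pipeline.py"]
```

The image could then be built with docker build -t binance-dataflow . and the pipeline launched with docker run binance-dataflow, with credentials provided at run time.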

Dataflow Diagram

Data written into BigQuery

Data Visualization with Looker

About

Binance Data Pipeline with Google Cloud Platform
