Skip to content

This repository provides a collection of automated processes designed to streamline financial data analysis, including Data Collecting & Cleaning from Yahoo Finance, EDA, Financial Features Calculation and Data Visualization.

License

Notifications You must be signed in to change notification settings

SasySpanish/Financial-Data-Analysis-Automation-with-Python

Repository files navigation

Financial Data Analysis Automation with Python

Automated workflows for end-to-end financial data analysis using Python.


Overview

This repository provides a collection of automated processes designed to streamline financial data analysis.
It is structured into modular phases, each contained in its own folder.

The goal is to create a standardized, repeatable, and efficient pipeline that transforms raw financial data into actionable insights — covering everything from data cleaning to visualization.


Repository Structure

Financial-Data-Analysis-Automation-with-Python/
│
├── 1) Data Cleaning/                 → Collecting (from Yahoo Finance), cleaning and preparing raw data
├── 2) Exploratory Data Analysis (EDA)/ → Automated data exploration and profiling
├── 3) Feature Engineering/           → Feature creation and transformation
├── 4) Data Visualization/            → Graphs, plots, and visual reports
└── README.md                        → Main repository documentation

Process Descriptions

Data Cleaning

  • Removes or imputes missing values.
  • Converts data types and formats (e.g. dates, currencies).
  • Detects and removes duplicates and outliers.
  • Produces a clean dataset ready for EDA and modeling.

Output: cleaned_data.csv (or equivalent)


Exploratory Data Analysis (EDA)

  • Automatically generates summary statistics (mean, median, variance, etc.).
  • Creates visual profiles — histograms, boxplots, correlations, and heatmaps.
  • Identifies patterns, anomalies, and data relationships.

Output: Automated HTML or image reports summarizing dataset characteristics.


Feature Engineering

  • Builds new derived features (e.g. moving averages, ratios, lags).
  • Performs feature selection and transformations (normalization, encoding).
  • Creates datasets ready for machine learning or statistical modeling.

Output: features_dataset.csv or model-ready DataFrame.


Data Visualization

  • Produces static plots and dashboards to visualize trends and results.
  • Focuses on financial insights: time series, volatility, returns, risk indicators.
  • Enables data-driven storytelling through visuals.

Output: .png, .html, or .ipynb visual summaries.


How to Use

  1. The main dataset is automatically downloaded in the src folder using Yahoo Finance.
  2. You can also place any custom dataset you want to analyze inside the data_raw folder — the cleaning scripts will handle it automatically.
  3. Run scripts sequentially in each folder:
    • 1_Data_Cleaning → clean and standardize data.
    • 2_Exploratory_Data_Analysis_(EDA) → explore and summarize.
    • 3_Feature_Engineering → create new features.
    • 4_Data_Visualization → visualize results.
  4. Review generated files and outputs.
  5. Customize scripts for your dataset or add new modules.

Requirements

  • Python 3.x
  • Main libraries:
    pandas
    numpy
    matplotlib
    seaborn
    scikit-learn
    yfinance
  • (Optional) Jupyter Notebook or Spyder for interactive execution.
  • Dependencies can be listed in requirements.txt if available.

Why Automation?

Benefit Description
Repeatability Consistent results across multiple datasets
Efficiency Saves time on routine tasks
Scalability Easy to extend with new modules
Transparency Each step is clearly separated and documented

Customization

You can easily extend the project by:

  • Adding new cleaning or preprocessing rules.
  • Including more EDA visualizations or statistical tests.
  • Developing advanced feature engineering functions (e.g., volatility metrics, rolling stats).
  • Integrating visualization dashboards using Plotly Dash, Streamlit, or Power BI.

Author

GitHub: @SasySpanish


About

This repository provides a collection of automated processes designed to streamline financial data analysis, including Data Collecting & Cleaning from Yahoo Finance, EDA, Financial Features Calculation and Data Visualization.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published