RyanGA09/DataAnalyst-ImpactAnalysisOfMonkeypoxCaseStudy

🧪 Data Analyst - Impact Analysis of Monkeypox Case Study

License: MIT License | Made with Python | Made with Jupyter Notebook

This project was developed as part of my data analyst portfolio to demonstrate my skills in data preprocessing, exploratory data analysis (EDA), and data visualization using a real-world public health dataset. The case study focuses on the impact and spread of Monkeypox across regions, aiming to extract actionable insights from time-series and geographical patterns.

🚀 Features

  • Data Preprocessing: Loading, cleaning, and transforming raw data for analysis.
  • Descriptive Statistics: Overview and statistical description of Monkeypox cases by country and region.
  • Data Visualization: Time-series plots, bar charts, line plots, annotated visualizations, and tables that show trends and distributions.

🛠️ Technologies Used

Python, Pandas, Matplotlib, Seaborn, Pathlib, OS

▶️ How to Use the Program

  1. Clone the Repository:

    git clone https://github.com/RyanGA09/DataAnalyst-ImpactAnalysisOfMonkeypoxCaseStudy.git
  2. Create a Virtual Environment:

    python -m venv venv
  3. Activate the Virtual Environment:

    • On Windows:
    venv\Scripts\activate
    • On macOS and Linux:
    source venv/bin/activate
  4. Install Required Packages:

    pip install -r requirements.txt
  5. Filter the Data:

    Before opening the notebooks, filter the raw datasets with the provided Python script:

    python filter_monkeypox_data.py

    The script processes the raw data in data/raw/original and writes the filtered datasets to data/raw/filtered.

  6. Open the Jupyter Notebook:

    • For business understanding, data gathering, and data cleaning, open the notebook in the analysis_processing directory, which covers the data processing tasks:

      jupyter notebook notebooks/analysis_processing/{start_year}_{start_month}_to_{end_year}_{end_month}/Notebook.ipynb
    • For exploratory data analysis (EDA) and data visualization, open the notebook in the visualization directory, which covers further analysis and visual representation of the data:

      jupyter notebook notebooks/visualization/Notebook_visualization_{start_year}_{start_month}_to_{end_year}_{end_month}.ipynb
  7. Run the Cells:

    • In the analysis_processing notebooks, execute each cell sequentially to perform data cleaning, data preparation, and business understanding steps.
    • In the visualization notebooks, run each cell in order to conduct exploratory data analysis (EDA) and create data visualizations from the processed data.
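The notebook paths in step 6 use a `{start_year}_{start_month}_to_{end_year}_{end_month}` placeholder. As a minimal sketch of how that pattern expands, the helper below builds both paths with `pathlib`. The function names are hypothetical and the example period is illustrative only; the README does not say which periods exist or how months are formatted, so check the notebooks directory for the real names.

```python
from pathlib import Path

NOTEBOOKS_DIR = Path("notebooks")


def period_label(start_year, start_month, end_year, end_month):
    """Fill in the {start_year}_{start_month}_to_{end_year}_{end_month} pattern."""
    return f"{start_year}_{start_month}_to_{end_year}_{end_month}"


def processing_notebook(label):
    """Path of the data processing notebook for one period."""
    return NOTEBOOKS_DIR / "analysis_processing" / label / "Notebook.ipynb"


def visualization_notebook(label):
    """Path of the EDA and visualization notebook for one period."""
    return NOTEBOOKS_DIR / "visualization" / f"Notebook_visualization_{label}.ipynb"


# Example values only; the month format is an assumption.
label = period_label(2022, "05", 2023, "05")
print(processing_notebook(label).as_posix())
print(visualization_notebook(label).as_posix())
```

Either printed path can be passed to `jupyter notebook` from the repository root.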

📊 Dataset Information

🔗 Data Source

The dataset used for this project is available in the data/raw/original directory or can be downloaded from the Monkeypox Data Source.

📥 Downloading and Placing the Data

  1. Download Data: If you choose to download the data, you can obtain it from the Monkeypox Data Source.
  2. Placing the Data: After downloading, ensure that the data file is placed in the data/raw/original/ directory. This is where the original, unprocessed data should be stored before any filtering or processing takes place.
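Before filtering, it can help to confirm that the download landed in the right place. The sketch below is one possible check and is not part of the repository; `list_raw_files` is a hypothetical helper, and only the data/raw/original path comes from this README.

```python
from pathlib import Path


def list_raw_files(raw_dir="data/raw/original"):
    """Return the files found in the original-data directory, sorted by name."""
    folder = Path(raw_dir)
    if not folder.is_dir():
        raise FileNotFoundError(f"Missing directory: {folder}")
    files = sorted(p for p in folder.iterdir() if p.is_file())
    if not files:
        raise FileNotFoundError(f"No data files found in {folder}")
    return files
```

Calling `list_raw_files()` from the repository root either returns the files that are present or raises an error naming the directory that needs attention.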

Data Filtering Process

Once the original data file is stored in the data/raw/original directory, the filter_monkeypox_data.py script filters it for the analysis and saves the result in the data/raw/filtered directory. The filtering process includes:

  • Removing irrelevant or incomplete data
  • Selecting relevant subsets of data for further analysis
  • Optimizing the data format and quality to meet the needs of the project
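The three filtering steps above could look roughly like this in Pandas. This is a sketch under stated assumptions, not the contents of filter_monkeypox_data.py: the column names `location`, `date`, and `new_cases`, the function name `filter_cases`, and the sample rows are all hypothetical.

```python
import pandas as pd

# Hypothetical column names; adjust them to the actual dataset.
REQUIRED = ["location", "date", "new_cases"]


def filter_cases(df, start, end):
    """Keep complete rows whose date falls within [start, end]."""
    # Select only the columns relevant to the analysis.
    out = df[REQUIRED].copy()
    # Parse dates; unparseable values become NaT and are dropped below.
    out["date"] = pd.to_datetime(out["date"], errors="coerce")
    # Remove incomplete rows.
    out = out.dropna(subset=REQUIRED)
    # Keep the requested period (both bounds inclusive).
    in_range = out["date"].between(pd.Timestamp(start), pd.Timestamp(end))
    return out.loc[in_range].sort_values(["location", "date"]).reset_index(drop=True)


# Made-up rows for illustration only.
raw = pd.DataFrame({
    "location": ["A", "A", "B", None],
    "date": ["2022-05-01", "2022-07-01", "2022-05-15", "2022-05-20"],
    "new_cases": [3, 5, None, 2],
    "notes": ["x", "y", "z", "w"],
})
filtered = filter_cases(raw, "2022-05-01", "2022-06-30")
print(filtered)
```

In a real run the result would be written to the data/raw/filtered directory, for example with `filtered.to_csv(path, index=False)`, where `path` is whatever file name the project uses.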

🔎 Data Processing and Analysis

After the filtered data is prepared in the data/raw/filtered directory, it undergoes further processing and analysis. The processed data, which is used for Exploratory Data Analysis (EDA) and visualization, is stored in the data/processed directory. This stage includes:

  • Conducting exploratory data analysis (EDA) to uncover patterns, trends, and insights
  • Cleaning the data further, if needed, for visualization and statistical analysis
  • Generating data visualizations to better understand the trends and relationships in the dataset
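As an illustration of the time-series side of this stage, the sketch below aggregates daily counts into a month-by-location table that can then be described or plotted. It is an assumed example rather than code from the notebooks; the column names, the function name `monthly_cases`, and the sample rows are hypothetical.

```python
import pandas as pd


def monthly_cases(df):
    """Sum new cases per calendar month, with one column per location."""
    data = df.copy()
    data["date"] = pd.to_datetime(data["date"])
    data["month"] = data["date"].dt.to_period("M")
    totals = data.groupby(["location", "month"])["new_cases"].sum()
    # Months with no rows for a location are filled with 0.
    return totals.unstack("location", fill_value=0)


# Made-up rows for illustration only.
sample = pd.DataFrame({
    "location": ["A", "A", "A", "B"],
    "date": ["2022-05-01", "2022-05-20", "2022-06-02", "2022-06-10"],
    "new_cases": [3, 4, 5, 7],
})
table = monthly_cases(sample)
print(table)
print(table.describe())
```

With Matplotlib installed, `table.plot(kind="line")` draws one line per location from the same table.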

🗂️ Project Structure

DataAnalyst-ImpactAnalysisOfMonkeypoxCaseStudy/
│
├── data/                                                      # Contains the datasets
│   ├── processed/                                             # Contains the processed data, used for EDA and visualization.
│   └── raw/                                                   # Contains the original and filtered data directory
│        ├── filtered/                                         # Contains the filtered data, ready for analysis.
│        └── original/                                         # Contains the original, unfiltered & unprocessed data.
├── notebooks/                                                 # Contains the Jupyter notebooks
├── filter_monkeypox_data.py                                   # Python code for filtering dataset
├── README.md                                                  # Project documentation and usage instructions
└── requirements.txt                                           # List of required Python libraries
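If any of the data directories are missing after cloning (for instance, because empty folders are not tracked by Git), they can be recreated to match the tree above. The helper below is a hypothetical convenience and not part of the repository; only the directory names come from this README.

```python
from pathlib import Path

# Data directories taken from the project structure above.
DATA_DIRS = ["data/raw/original", "data/raw/filtered", "data/processed"]


def ensure_data_dirs(root="."):
    """Create the data directories under root if they do not exist yet."""
    created = []
    for rel in DATA_DIRS:
        path = Path(root) / rel
        path.mkdir(parents=True, exist_ok=True)  # no error if already present
        created.append(path)
    return created
```

Running `ensure_data_dirs()` from the repository root is safe to repeat, since existing directories are left untouched.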

📖 Read More

Check out my article on Medium:

Medium

You can view the visualization results on my Tableau dashboard via the badge below:

Tableau

☕ Support Me

This is a non-commercial project. If you find it useful and would like to support the development of this project, you can donate via the links below. Your support helps improve the project, but it does not grant any commercial rights over the project itself.

Saweria

📜 License

This project is licensed under the MIT License. It is for personal, academic, and non-commercial use only. Any commercial use is prohibited without explicit written permission from the author.

See the LICENSE file for more details.

Copyright © 2024 Ryan Gading Abdullah. All rights reserved.

📧 Contact

For commercial inquiries, please contact:

Gmail

Or reach me on LinkedIn:

LinkedIn
