
Exploratory Data Analysis project using Python, pandas, and visualization techniques to examine 110,000+ medical appointments in Brazil, uncovering patterns and potential drivers behind patient no-shows. Built as part of Udacity's Data Analyst Nanodegree.

techwithhams/No-Show-Appointments-EDA

πŸ₯ No-show Appointments EDA Project

Python + pandas + matplotlib + seaborn | Udacity Nanodegree Project

This exploratory data analysis (EDA) project investigates over 110,000 medical appointment records from Brazil to uncover the factors that influence whether patients show up for their scheduled appointments.


🎯 Project Objective

To analyze the No-show Appointments dataset and identify behavioral and demographic patterns affecting patient attendance, using Python and core EDA techniques.


🧰 Tools & Skills

  • 🐍 Language: Python 3
  • 📦 Libraries: pandas, numpy, matplotlib, seaborn
  • 📊 Techniques: Data wrangling, visualization, exploratory data analysis, statistical pattern recognition

📂 Dataset

  • πŸ“ Source: Kaggle - No-show Appointments
  • 📈 Records: 110,527 patient appointments
  • 🔑 Key Columns:
    • PatientId, ScheduledDay, AppointmentDay, Age, Gender, Neighbourhood
    • Hypertension, Diabetes, Scholarship, SMS_received
    • No-show (target variable)
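
As a minimal sketch of how these columns might be loaded and prepared, the snippet below parses the two date columns and derives a boolean target plus a waiting-time feature. The helper name `load_appointments`, the derived columns `no_show` and `wait_days`, and the three sample rows are illustrative assumptions, not taken from the notebook. The raw Kaggle file may spell some columns differently (for example `Hipertension`), so the rename step is included as a precaution.

```python
import io

import pandas as pd


def load_appointments(source):
    """Read the appointments CSV (path or file-like) and add helper columns."""
    df = pd.read_csv(source)

    # Assumption: normalise a spelling that may appear in the raw Kaggle file.
    # rename() silently skips keys that are not present.
    df = df.rename(columns={"Hipertension": "Hypertension"})

    # Parse both timestamps as timezone-aware UTC datetimes.
    for col in ("ScheduledDay", "AppointmentDay"):
        df[col] = pd.to_datetime(df[col], utc=True)

    # Boolean target: True when the patient missed the appointment.
    df["no_show"] = df["No-show"].eq("Yes")

    # Whole days between booking and appointment (time of day ignored).
    df["wait_days"] = (
        df["AppointmentDay"].dt.normalize() - df["ScheduledDay"].dt.normalize()
    ).dt.days
    return df


# Made-up rows in the README's column layout, used only to exercise the helper.
SAMPLE_CSV = """PatientId,ScheduledDay,AppointmentDay,Age,Gender,Neighbourhood,Hypertension,Diabetes,Scholarship,SMS_received,No-show
1,2016-04-27T08:36:51Z,2016-04-29T00:00:00Z,8,F,A,0,0,0,0,Yes
2,2016-04-29T16:07:23Z,2016-04-29T00:00:00Z,56,M,B,1,0,0,0,No
3,2016-04-20T10:00:00Z,2016-04-29T00:00:00Z,62,F,A,1,1,0,1,No
"""

sample = load_appointments(io.StringIO(SAMPLE_CSV))
print(sample[["Age", "wait_days", "no_show"]])
```

With the real data, the same helper would be called as `load_appointments("no_show_appointments.csv")`.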

πŸ“ Project Structure

no-show-eda/
├── investigate_no_show.ipynb     # Jupyter notebook with analysis
├── no_show_appointments.csv      # Raw dataset
└── README.md                     # Project overview and insights

πŸ” Key Insights

  • ❌ 20% of all appointments were missed (No-show = Yes)
  • 👶 Younger patients (ages 0–20) had the highest no-show rates
  • ✉️ SMS reminders were slightly effective in improving attendance
  • 💊 Patients with hypertension or diabetes tended to show up more consistently
  • 🏘️ Some neighbourhoods showed significantly higher no-show rates than others
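
Comparisons like these usually come down to grouped no-show rates. The sketch below shows one possible way to compute them with pandas. The helper `no_show_rate_by`, the `age_band` column, the bin edges, and the toy rows are all assumptions made for illustration. They are not the notebook's code, and the toy numbers are not results from the dataset.

```python
import pandas as pd


def no_show_rate_by(df, column):
    """Share of missed appointments within each value of `column`."""
    missed = df["No-show"].eq("Yes")
    # The mean of a boolean Series is the fraction of True values.
    return missed.groupby(df[column], observed=True).mean()


# Toy rows (not real data), just enough to exercise the helper.
toy = pd.DataFrame(
    {
        "Age": [5, 15, 30, 45, 70, 80],
        "SMS_received": [0, 1, 0, 1, 0, 1],
        "No-show": ["Yes", "No", "No", "Yes", "No", "No"],
    }
)

# Overall no-show rate across all rows.
overall = toy["No-show"].eq("Yes").mean()

# Assumed age bands; right=False makes each bin include its lower edge.
toy["age_band"] = pd.cut(
    toy["Age"],
    bins=[0, 21, 41, 61, 200],
    right=False,
    labels=["0-20", "21-40", "41-60", "61+"],
)

by_age = no_show_rate_by(toy, "age_band")
by_sms = no_show_rate_by(toy, "SMS_received")
print(by_age)
print(by_sms)
```

The same helper can be pointed at `Neighbourhood`, `Hypertension`, or `Diabetes`. Small groups produce noisy rates, so it helps to check group sizes before comparing them.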

💡 Recommendations

  • 👥 Youth Outreach: Launch education or incentive programs targeting younger patients
  • 📲 Improve SMS Strategy: Test better timing, personalization, or alternative messaging formats
  • 🗺️ Location-Based Focus: Tailor local outreach campaigns in low-attendance areas
  • ⚕️ Patient Engagement: Design programs that keep healthier patients involved in preventive care

🚀 How to Use

  1. Clone this repository or download the notebook and dataset.
  2. Open investigate_no_show.ipynb in Jupyter or VS Code.
  3. Run each cell to walk through data cleaning, analysis, and visualization.
  4. Modify filters, columns, or visualizations to explore new insights.
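
For step 4, a new visualization can be as small as a bar chart of grouped rates. The following is a rough sketch using matplotlib through pandas. The function name `plot_no_show_rates` is hypothetical, and the values in `example_rates` are placeholders chosen only so the chart has something to draw. They are not findings from this project.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_no_show_rates(rates, title):
    """Draw a bar chart from a Series of no-show rates (fractions 0 to 1)."""
    fig, ax = plt.subplots(figsize=(6, 3))
    # Convert fractions to percentages for a more readable axis.
    (rates * 100).plot.bar(ax=ax, color="tab:red")
    ax.set_ylabel("No-show rate (%)")
    ax.set_title(title)
    ax.set_ylim(0, 100)
    fig.tight_layout()
    return ax


# Placeholder values, NOT results from the dataset.
example_rates = pd.Series({"0-20": 0.25, "21-40": 0.20, "41-60": 0.15})
ax = plot_no_show_rates(example_rates, "Illustrative no-show rate by age band")
```

In a Jupyter notebook the figure should render inline after the cell runs. In a plain script, `plt.show()` or `ax.figure.savefig(...)` would be needed.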

📬 Author

Hams Saeed Alhakim
📚 Udacity Data Analyst Nanodegree
🔗 GitHub
📅 Year: 2025
