Skip to content

Exploratory data analysis and hypothesis testing on Chicago taxi trip data to uncover patterns in demand and the effects of rainy weather on travel time.

Notifications You must be signed in to change notification settings

caesaredia/chicago-taxi-data-insights

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

5 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Chicago Taxi Trip Analysis and Hypothesis Testing

This project performs exploratory data analysis (EDA) and hypothesis testing using datasets related to taxi trips in Chicago. The main focus includes identifying demand hotspots, understanding taxi company performance, and evaluating the impact of rainy weather on trip durations from downtown Chicago (Loop) to O’Hare International Airport.


πŸ“Š Project Objectives

1. Exploratory Data Analysis

  • Analyze taxi companies based on trip volumes on November 15–16, 2017.
  • Identify the top 10 drop-off areas in Chicago by average number of trips during November 2017.
  • Visualize distribution of trips per taxi company and per drop-off area.

2. Hypothesis Testing

  • Null Hypothesis (Hβ‚€): The average trip duration from the Loop to O’Hare International Airport does not change on rainy Saturdays.
  • Alternative Hypothesis (H₁): The average trip duration from the Loop to O’Hare International Airport changes on rainy Saturdays.
  • Method: Independent t-test with a significance level (Ξ±) of 0.05.

πŸ“ Datasets

All datasets used in this project are stored in the data/ directory:

  • data/chicago_taxi_data_01.csv: Taxi company data for Nov 15–16, 2017.
  • data/chicago_taxi_data_02.csv: Average drop-off volume per area.
  • data/chicago_taxi_data_03.csv: Trip records from Loop to O’Hare with weather data.

πŸ“ˆ Key Findings

  • Flash Cab had the highest number of trips on Nov 15–16, indicating strong demand or superior service.
  • Loop, River North, and Streeterville were the top destinations, reflecting their high popularity.
  • Hypothesis testing revealed a statistically significant increase in trip duration on rainy Saturdays, supporting the alternative hypothesis.

πŸ› οΈ Technologies Used

  • Python
  • Pandas, NumPy
  • Matplotlib, Seaborn
  • SciPy (for statistical testing)
  • Jupyter Notebook

Author

Nabilla Hafsah Caesaredia


πŸ“Œ How to Run

  1. Clone this repository:
    git clone https://github.com/yourusername/chicago-taxi-analysis.git

About

Exploratory data analysis and hypothesis testing on Chicago taxi trip data to uncover patterns in demand and the effects of rainy weather on travel time.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published