- 📌 Project Overview
- 🎯 Objectives
- 🗂 Dataset
- 🔍 Key Findings
- 🛠️ Tools & Libraries
- 📈 Example Insights
- 📂 Repository Structure
- 🚀 Next Steps
- 👩‍💻 Author
Every year, millions of consumers in the U.S. file complaints about financial products and services.
These complaints provide valuable insights into systemic issues such as unfair practices, delays, incorrect data, or product failures.
This project explores the Consumer Financial Protection Bureau (CFPB) Consumer Complaints dataset, focusing on complaint trends, company responses, and regional patterns.
- Identify Trends & Patterns → Track complaint volumes and issue types over time.
- Understand Submission & Response → Explore complaint channels and response timeliness.
- Analyze Geographic & Demographic Impact → Compare across U.S. regions.
- Evaluate Company Complaint Rates → Identify companies with high complaint volumes and assess their response strategies.
- Assess Resolution Outcomes → Examine closure types and dispute rates.
- Source: CFPB Consumer Complaints Database
- Original Size: ~7 GB
- Processed Size: < 2 GB (filtered by year, selected relevant columns, removed redundancy).
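
The size reduction described above can be sketched with chunked reading in pandas. This is a minimal illustration, not the notebook's actual code: the function name `load_filtered`, the column selection, and the cutoff year are assumptions, and the column names are assumed to match the public CFPB CSV export. De-duplicating on `Complaint ID` is only one possible reading of "removed redundancy".

```python
import pandas as pd

# Assumed column names, following the public CFPB CSV export.
KEEP_COLUMNS = [
    "Date received", "Product", "Issue", "Company",
    "State", "Timely response?", "Complaint ID",
]

def load_filtered(source, start_year, chunksize=100_000):
    """Read the CSV in chunks, keep selected columns, drop rows before start_year."""
    kept = []
    reader = pd.read_csv(
        source,
        usecols=KEEP_COLUMNS,            # columns not listed here are never loaded
        parse_dates=["Date received"],
        chunksize=chunksize,             # keeps memory bounded for a multi-GB file
    )
    for chunk in reader:
        kept.append(chunk[chunk["Date received"].dt.year >= start_year])
    combined = pd.concat(kept, ignore_index=True)
    # Hypothetical redundancy step: one row per complaint ID.
    return combined.drop_duplicates(subset="Complaint ID").reset_index(drop=True)
```

Calling `load_filtered("complaints.csv", start_year=2020)` (file name hypothetical) returns a single DataFrame restricted to the selected columns and years.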
- Top Issues:
  - Credit Reporting & Debt Collection dominate.
  - Incorrect data and false debt claims are the main concerns.
- Regional Differences:
  - 🗽 New York → Debt collection.
  - 🌉 California → Credit reporting.
  - 🌴 Florida → Mix of credit reporting, mortgage, student loans.
  - 🤠 Texas → Credit card issues.
- Company Responses:
  - Most companies respond in a timely manner.
  - Heavy reliance on "Closed with explanation".
  - Missing dispute data prevents a full satisfaction analysis.
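
Findings of this kind can be reproduced with a few pandas aggregations. The helpers below are a hedged sketch, not the notebook's code: the function names are hypothetical, and the column names (`State`, `Product`, `Timely response?`, `Company response to consumer`, `Consumer disputed?`) are assumed to match the public CFPB CSV export.

```python
import pandas as pd

def top_product_by_state(df):
    """Most frequent product category within each state."""
    return df.groupby("State")["Product"].agg(lambda s: s.value_counts().idxmax())

def timely_rate(df):
    """Fraction of complaints flagged as answered on time (0.0 to 1.0)."""
    return (df["Timely response?"] == "Yes").mean()

def response_type_share(df):
    """Share of each company response type, largest first."""
    return df["Company response to consumer"].value_counts(normalize=True)

def missing_dispute_share(df):
    """Fraction of rows with no dispute information recorded."""
    return df["Consumer disputed?"].isna().mean()
```

A high value from `missing_dispute_share` is the situation described above: timeliness can be measured, but satisfaction cannot be fully assessed.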
- Python (Jupyter Notebook)
  - `pandas`, `numpy` → data wrangling
  - `matplotlib`, `seaborn` → visualization
  - `plotly` → interactive charts
- Complaints occur at a massive scale:
  - ⏱ 1 every 20 seconds
  - ⏳ 183 every hour
  - 📅 ~1.6 million per year
- Key takeaway: High timeliness ≠ guaranteed customer satisfaction.
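
The scale figures above are consistent with one another, as a quick conversion shows. This sketch starts from the hourly figure quoted above and assumes a 365-day year; the variable names are illustrative.

```python
SECONDS_PER_HOUR = 3600
HOURS_PER_YEAR = 24 * 365  # assumes a non-leap year

complaints_per_hour = 183  # figure quoted above

per_year = complaints_per_hour * HOURS_PER_YEAR
seconds_between = SECONDS_PER_HOUR / complaints_per_hour

print(per_year)                   # 1603080, i.e. roughly 1.6 million
print(round(seconds_between, 1))  # 19.7, i.e. roughly one every 20 seconds
```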
ConsumerComplaints_EDA/
│── media/ # All plots & visualizations
│ ├── barplot.png
│ ├── complaint_summary.png
│ ├── heatmap.png
│ ├── lineplot.png
│ ├── newplot.png
│ ├── output.png
│ ├── q3.png
│ ├── q4.png
│ ├── timely_response.png
│ └── timely_response_2.png
│
│── Consumer Complaints.ipynb # Jupyter notebook with full EDA
│── Consumer Complaints USA EDA.pdf # Project presentation/report
│── EDA_SQL_queries.sql # SQL queries used in analysis
│── LICENSE # Project license (MIT, etc.)
│── README.md # Project documentation
📌 Future improvements and extensions for this project:
- 🔹 Dispute Resolution Analysis → Incorporate and evaluate dispute data if available.
- 🔹 Sentiment Analysis → Apply NLP techniques to analyze consumer complaint narratives.
- 🔹 Predictive Modeling → Use machine learning to predict complaint outcomes.
- 🔹 Time-Series Forecasting → Forecast complaint volumes to anticipate future trends.
- 🔹 Interactive Dashboard → Build a web-based dashboard (Plotly/Dash or Streamlit) for dynamic exploration.
Katherine Torian
📊 Exploratory Data Analysis – Consumer Complaints USA
🌐 Find me online: