Data-driven waste management study using PySpark: explores policy, descriptions, and real-world datasets to analyze waste streams and inform sustainable interventions.

lamarojas/EcoSort_WasteManagement

EcoSort: Waste Management Summative

Overview

This project explores waste management data by combining two datasets, policy documents and waste category descriptions, and preprocessing them with PySpark so the analysis can scale to large inputs.

Highlights

PySpark-based data ingestion for scalable handling of diverse waste datasets.

Integration of descriptive taxonomies (waste_descriptions.csv) and regulatory context (waste_policy_documents.json) for richer analysis.

Analytical workflows in Jupyter Notebook (waste_management_summative.ipynb) to derive actionable insights into waste streams, policies, and sustainability themes.

Purpose & Value

Ideal for demonstrating:

Large-scale data processing proficiency (PySpark)

Real-world dataset integration and exploratory workflow design

Strong foundation in environmental informatics relevant to policy and urban planning roles

Technologies

PySpark – Scalable data handling

Pandas, NumPy – Data manipulation and analysis

Jupyter Notebook – Narrative-driven workflow and visualization
