This repository was created as an assignment for the course Advanced Database System Concepts (NTUA), 2024-25. The project analyzes Los Angeles crime data from 2010 to 2024, based primarily on:
- LA crime data 2010-2019
- LA crime data 2020-now
- Other datasets which are defined in the notebook.
The Introduction section of the notebook contains Execution Instructions:
- Store the datasets required for data analysis and update the links that refer to them
- Install PySpark and Sedona
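
Before running the notebook, it can help to verify both steps. The sketch below is not part of the notebook. It assumes the packages are importable as `pyspark` and `sedona` (on PyPI they are published as `pyspark` and `apache-sedona`) and that the datasets are stored as local files. The `DATASETS` paths and both helper functions are hypothetical placeholders; replace the paths with the locations you actually used.

```python
import importlib.util
from pathlib import Path

# Hypothetical dataset locations (assumption): replace these with the
# paths where you stored the files referenced by the notebook.
DATASETS = {
    "crime_2010_2019": "data/crime_2010_2019.csv",
    "crime_2020_now": "data/crime_2020_now.csv",
}


def missing_packages(names):
    """Return the top-level packages from `names` that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def missing_datasets(datasets):
    """Return the dataset keys whose local file does not exist."""
    return [key for key, path in datasets.items() if not Path(path).is_file()]


if __name__ == "__main__":
    print("Missing packages:", missing_packages(["pyspark", "sedona"]))
    print("Missing datasets:", missing_datasets(DATASETS))
```

If either list is non-empty, install the missing package or correct the corresponding path before executing the notebook. The file check only covers local paths; datasets kept in remote storage such as S3 or HDFS need a different check.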