This repository contains two data analysis projects: one focusing on NYPD data and the other on COVID-19 data. Both projects utilize R for data manipulation, analysis, and visualization.
This project analyzes NYPD shooting incident data to uncover trends and insights related to crime in New York City.
- Tidyverse
- Lubridate
The following CSV files are imported for analysis:
NYPD_Shooting_Incident_Data__Historic_.csv
- Removed unnecessary columns.
- Created new columns such as Day/Night.
- Added separate columns for day, month, and year.
- Fixed the types of variables such as occur_time and occur_date by parsing them with lubridate functions inside mutate().
- Filtered out irrelevant observations.
- Summarized the data to include only relevant observations.
- Visualized crime trends over time.
- Analyzed crime distribution by borough and precinct.
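The cleaning and analysis steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the toy rows, the lowercase column names (`occur_date`, `occur_time`, `boro`, `precinct`, `latitude`), the `mm/dd/yyyy` date format, and the 06:00 to 17:59 cutoff for "Day" are all assumptions made for the example.

```r
library(dplyr)
library(lubridate)
library(ggplot2)

# Toy rows standing in for the shooting-incident CSV; all values are made up.
raw <- tibble(
  occur_date = c("01/15/2021", "07/04/2021", "12/31/2022", "03/02/2022"),
  occur_time = c("02:30:00", "14:05:00", "21:45:00", "09:10:00"),
  boro       = c("BRONX", "BROOKLYN", "BROOKLYN", "QUEENS"),
  precinct   = c(40, 75, 75, 103),
  latitude   = c(40.8, 40.6, 40.6, 40.7)
)

shootings <- raw %>%
  select(-latitude) %>%                      # drop a column not needed here
  mutate(
    occur_date  = mdy(occur_date),           # character -> Date
    occur_time  = hms(occur_time),           # character -> lubridate Period
    occur_year  = year(occur_date),
    occur_month = month(occur_date),
    occur_day   = day(occur_date),
    # Assumed rule: hours 6 through 17 count as "Day", everything else "Night"
    day_night   = if_else(hour(occur_time) >= 6 & hour(occur_time) < 18,
                          "Day", "Night")
  )

# Incidents per year (trend over time) and per borough/precinct (distribution)
by_year <- shootings %>% count(occur_year, name = "incidents")
by_area <- shootings %>% count(boro, precinct, name = "incidents")

# Plot object for the yearly trend; print(trend_plot) renders it
trend_plot <- ggplot(by_year, aes(x = occur_year, y = incidents)) +
  geom_col() +
  labs(x = "Year", y = "Incidents")
```

With the real CSV, `raw` would come from `read_csv()` instead, and the column names and date format would need to match the file.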
This project analyzes COVID-19 data to understand the spread and impact of the pandemic globally and in the US.
- Tidyverse
- Lubridate
The following CSV files are imported for analysis:
time_series_covid19_confirmed_US.csv
time_series_covid19_confirmed_global.csv
time_series_covid19_deaths_US.csv
time_series_covid19_deaths_global.csv
UID_ISO_FIPS_LookUp_Table.csv
- Removed unnecessary columns.
- Transformed date columns into a single `date` column.
- Created new columns for analysis, such as `cases` and `deaths`.
- Filtered out observations with zero cases.
- Summarized the data to include only relevant observations.
- Visualized cases and deaths over time.
- Summarized total cases and deaths by state and country.
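The reshaping and summarizing steps above can be sketched as follows for the global files. This is an illustrative sketch under stated assumptions rather than the project's script: the toy tables mimic the wide layout of the time series files (one column per date, cumulative counts), and the helper `to_long()` and the renamed columns `province_state` and `country_region` are hypothetical names introduced here.

```r
library(dplyr)
library(tidyr)
library(lubridate)
library(ggplot2)

# Toy wide tables: one column per date, cumulative counts; values are made up.
wide_cases <- tibble(
  `Province/State` = c(NA_character_, NA_character_),
  `Country/Region` = c("A", "B"),
  Lat = c(0, 1), Long = c(0, 1),
  `1/22/20` = c(0, 1),
  `1/23/20` = c(2, 3)
)
wide_deaths <- tibble(
  `Province/State` = c(NA_character_, NA_character_),
  `Country/Region` = c("A", "B"),
  Lat = c(0, 1), Long = c(0, 1),
  `1/22/20` = c(0, 0),
  `1/23/20` = c(0, 1)
)

# Hypothetical helper: collapse the per-date columns into one `date` column
to_long <- function(wide, value_name) {
  wide %>%
    pivot_longer(
      cols = -c(`Province/State`, `Country/Region`, Lat, Long),
      names_to = "date",
      values_to = value_name
    ) %>%
    select(-c(Lat, Long))                    # remove unneeded columns
}

global <- to_long(wide_cases, "cases") %>%
  full_join(to_long(wide_deaths, "deaths"),
            by = c("Province/State", "Country/Region", "date")) %>%
  rename(province_state = `Province/State`,
         country_region = `Country/Region`) %>%
  mutate(date = mdy(date)) %>%               # "1/22/20" -> 2020-01-22
  filter(cases > 0)                          # drop observations with zero cases

# Counts are cumulative, so the maximum per country is its running total
totals <- global %>%
  group_by(country_region) %>%
  summarise(total_cases  = max(cases),
            total_deaths = max(deaths),
            last_date    = max(date),
            .groups = "drop")

# Plot object for cases over time; print(cases_plot) renders it
cases_plot <- ggplot(global, aes(x = date, y = cases, colour = country_region)) +
  geom_line() +
  geom_point() +
  scale_y_log10()
```

The US files carry extra identifier columns, so the `cols` selection in the pivot would need adjusting for them.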
- Clone the repository.
- Install the required libraries.
- Run the R scripts to reproduce the analysis and visualizations for both projects.
- Key insights and trends related to crime in NYC.
- Last date in the dataset: `2023-03-09`
- Maximum number of cases: `103,802,702`
- Maximum number of deaths: `1,123,836`