Spatial Data Merge

This repository contains community context data (2017-2022) for the Everyday Respect project and the code needed to merge these data at the reporting district level. Where applicable, artifacts produced by the analysis are included. A data dictionary is available on Airtable (ask a team member for access) and as a static copy in this repository: LAPD_Data_Dictionary.xlsx

Getting started

This repository is organized into two folders: data and output. The data folder contains subfolders.

Unzip every archive in the data directory and its subfolders before running anything. The unzipped files are too large to push to GitHub, but the code will not run without them.
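The unzip step can also be scripted. The following is a minimal sketch, assuming the archives are standard .zip files and that each one should be extracted next to itself; the helper name unzip_all is hypothetical and is not part of the repository.

```python
from pathlib import Path
import zipfile


def unzip_all(data_dir="data"):
    """Extract every .zip found under data_dir into the archive's own folder.

    Hypothetical helper: assumes the compressed inputs are .zip archives.
    Returns the list of archives that were extracted.
    """
    root = Path(data_dir)
    if not root.is_dir():
        return []  # nothing to do if the folder is missing

    extracted = []
    # rglob walks data_dir and all of its subfolders
    for archive in sorted(root.rglob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(archive.parent)
        extracted.append(archive)
    return extracted
```

Calling unzip_all("data") from the repository root would extract every archive in place and leave the original .zip files untouched.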

Running analysis

Running the code requires Python and Jupyter Notebook. All code lives in the notebook spatial_merge.ipynb, which is organized with headings and sub-headings. Before running it, install the packages listed in its Imports section using pip or your preferred package installer. You may need to restart the kernel before the newly installed packages can be imported.

No other modifications are required to run the notebook.

The output figures and data files will be added to the output directory.

Methods and use

We used an area-weighted average approach to aggregate demographic and income variables from the census tract to the reporting district level. The initial variables came from two ACS datasets reported at the census tract level:

American Community Survey (ACS) Table B03002 (5-year estimates for race and ethnicity)
American Community Survey (ACS) Table S1901 (5-year estimates of household income)

We used the 2017, 2018, 2019, 2020, 2021, and 2022 vintages of both ACS tables. See the data dictionary for more details.

We performed a spatial overlay using the GeoPandas library to identify the areas where census tracts and reporting districts intersect and to calculate the size of each intersection. A look-up table of census tracts to reporting districts is available for future merges: CT_to_RD_lookup.csv. A visual depiction of the intersection is also produced: CT_to_RD_merge.png.

For each year and each variable, we calculated a weighted average in which each census tract is weighted by the area it covers within the reporting district. The area-weighted variables for each reporting district include:

Median Household Income (Dollars)
Percentage of White Population
Percentage of Black/African American Population
Percentage of American Indian/Alaska Native Population
Percentage of Asian Population
Percentage of Native Hawaiian/Other Pacific Islander Population
Percentage of Some Other Race Population
Percentage of Two or More Races Population
Percentage of Hispanic/Latino Population

These variables are output into the file community_context_variables.csv.
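The area-weighted average described above can be sketched in plain Python. This is an illustration under stated assumptions, not the notebook's code: it assumes the intersection areas have already been computed by the overlay, that weights are normalized by the total intersecting area in each reporting district, and that tracts without an estimate are skipped. The function name, tuple layout, and sample IDs are hypothetical.

```python
from collections import defaultdict


def area_weighted_average(pieces, tract_values):
    """Aggregate a tract-level variable to reporting districts.

    pieces: iterable of (district_id, tract_id, intersect_area) tuples,
            one per tract/district intersection (assumed precomputed).
    tract_values: dict mapping tract_id to the variable's value.
    """
    weighted_sum = defaultdict(float)
    total_area = defaultdict(float)
    for district_id, tract_id, area in pieces:
        value = tract_values.get(tract_id)
        if value is None or area <= 0:
            continue  # assumption: ignore tracts with no estimate
        weighted_sum[district_id] += area * value
        total_area[district_id] += area
    # divide by the summed weights so they act as proportions
    return {d: weighted_sum[d] / total_area[d] for d in total_area}


# Made-up example: RD_A is covered 3 parts by T1 and 1 part by T2.
pieces = [("RD_A", "T1", 3.0), ("RD_A", "T2", 1.0), ("RD_B", "T2", 2.0)]
income = {"T1": 40000.0, "T2": 80000.0}
result = area_weighted_average(pieces, income)
```

In this made-up example, RD_A receives (3 × 40000 + 1 × 80000) / 4 = 50000, while RD_B, which is covered only by T2, keeps the T2 value of 80000.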

Arrest and calls-for-service data were already available at the reporting district level and did not need to go through the spatial overlay process. These data are stored in a separate file: lapd_demand_variables.csv.

They are kept separate for two reasons:

The file is very long.
Keeping it separate preserves the finest temporal resolution (dates); merging it with the community context variables would require rolling the data up to the year level.

The data can be matched on Reporting_District_ID since that column name is consistent across both files.
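As a rough sketch of such a match, the snippet below rolls dated records up to yearly counts per reporting district and attaches the context variables for the same district and year. Only Reporting_District_ID comes from the files described above; the Year, event_count, and median_household_income names, the function names, and the sample values are hypothetical.

```python
from collections import Counter
from datetime import date


def yearly_counts(events):
    """Count events per (Reporting_District_ID, year).

    events: iterable of (reporting_district_id, date) pairs.
    """
    return Counter((rd, day.year) for rd, day in events)


def attach_context(counts, context):
    """Join yearly counts to context variables keyed by (district, year).

    Districts or years with no context record are dropped (inner join).
    """
    rows = []
    for (rd, year), n in sorted(counts.items()):
        ctx = context.get((rd, year))
        if ctx is not None:
            rows.append({"Reporting_District_ID": rd, "Year": year,
                         "event_count": n, **ctx})
    return rows


# Made-up records for illustration only.
events = [("0101", date(2019, 3, 1)), ("0101", date(2019, 7, 4)),
          ("0101", date(2020, 1, 15)), ("0999", date(2019, 5, 5))]
context = {("0101", 2019): {"median_household_income": 52000.0},
           ("0101", 2020): {"median_household_income": 54000.0}}
merged = attach_context(yearly_counts(events), context)
```

The inner join here is a design choice for the sketch; keeping unmatched districts with empty context values would be equally reasonable, depending on the analysis.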
