California PUMS Econometric Study

This repository contains R code and resources for an econometric analysis of the California Public Use Microdata Sample (PUMS) dataset. The primary focus is to study the determinants of wage and salary income using demographic, educational, and occupational variables.

Requirements

R (version 4.0 or higher recommended)

Required R Packages

tidyverse
ggplot2 (also installed as part of tidyverse)
languageserver (optional; recommended for code completion and editor support)

Install all of the packages above in R with:

install.packages(c("tidyverse", "ggplot2", "languageserver"))
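To avoid reinstalling packages that are already present, a small check along these lines may help. This is only a sketch: the helper name missing_packages is hypothetical and not part of the repository, and the install call is left commented out so nothing is installed without your say-so.

```r
# Hypothetical helper (not part of data_analysis.R): return the packages
# from `pkgs` that cannot be loaded in the current R library.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("tidyverse", "ggplot2", "languageserver"))

if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install only what is missing
}
```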

Usage

The California State Data Set.csv file is included in this repository.
Open data_analysis.R in RStudio or your preferred R environment.
Run the script. It will:
- Load and inspect the data
- Summarize missing values
- Engineer features for analysis (e.g., citizenship, occupation, education)
- Filter and clean the data
- Fit multiple linear regression models to study wage determinants
- Output model summaries and correlation matrices
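The first two steps (loading and inspecting the data, then summarizing missing values) might look roughly like the sketch below. It uses base R and a small made-up data frame in place of the real file so that it runs on its own. The column names WAGP, AGEP, and SCHL follow common PUMS naming but are assumptions here, not names confirmed from data_analysis.R.

```r
# Toy stand-in for the PUMS extract. Column names are assumed, and the
# values are invented purely for illustration.
pums <- data.frame(
  WAGP = c(52000, NA, 31000, 0, 78000),  # wage and salary income
  AGEP = c(34, 51, NA, 19, 45),          # age
  SCHL = c(21, 16, 19, NA, 22)           # educational attainment code
)

# Inspect the structure of the data
str(pums)

# Count and percentage of missing values per column
missing_summary <- data.frame(
  variable    = names(pums),
  n_missing   = colSums(is.na(pums)),
  pct_missing = round(100 * colMeans(is.na(pums)), 1),
  row.names   = NULL
)
print(missing_summary)
```

With the real data, the toy data frame would presumably be replaced by something like read.csv("California State Data Set.csv"), assuming the file sits in the working directory.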

Analysis Overview

The script performs the following steps:

Data Cleaning: Handles missing values and filters relevant observations.
Feature Engineering: Creates binary/categorical variables for citizenship, occupation, English proficiency, education, etc.
Modeling: Fits several linear regression models to estimate the impact of demographic and socioeconomic factors on wage income, including interaction effects.
Interpretation: Provides summary statistics and model outputs for further interpretation.
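As a rough illustration of the feature engineering, filtering, and modeling steps above, the following base R sketch builds binary indicators, drops non-positive wages, and fits a regression with an interaction term. Everything in it is an assumption made for illustration: the data are simulated, the column names (WAGP, AGEP, SCHL, CIT, SEX) and code cut-offs follow common PUMS conventions rather than anything confirmed from data_analysis.R, and the log-wage specification is one possible choice, not necessarily the one the script uses.

```r
set.seed(42)
n <- 200

# Simulated stand-in for the PUMS extract (assumed column names)
pums <- data.frame(
  AGEP = sample(18:64, n, replace = TRUE),  # age
  SCHL = sample(12:24, n, replace = TRUE),  # educational attainment code
  CIT  = sample(1:5, n, replace = TRUE),    # citizenship status code
  SEX  = sample(1:2, n, replace = TRUE)     # 1 = male, 2 = female
)

# Feature engineering: binary indicators (cut-offs are assumptions)
pums$citizen <- as.integer(pums$CIT != 5)    # assumed: 5 = not a citizen
pums$college <- as.integer(pums$SCHL >= 21)  # assumed: 21+ = bachelor's or higher
pums$female  <- as.integer(pums$SEX == 2)

# Simulated wages, with a few zeros so the filter below has an effect
pums$WAGP <- round(exp(10 + 0.01 * pums$AGEP + 0.4 * pums$college -
                         0.1 * pums$female + rnorm(n, sd = 0.5)))
pums$WAGP[1:5] <- 0

# Filtering: keep observations with positive wage income
clean <- subset(pums, WAGP > 0)

# Modeling: log wage on age, citizenship, and a college-by-female interaction
fit <- lm(log(WAGP) ~ AGEP + citizen + college * female, data = clean)
print(summary(fit))

# Correlation matrix of the model variables
print(round(cor(clean[, c("WAGP", "AGEP", "citizen", "college", "female")]), 2))
```

The college * female term expands to both main effects plus their product, so the coefficient on college:female estimates how the college premium differs for women in this simulated sample.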

Notes

The dataset (California State Data Set.csv) is included for convenience. If you redistribute the repository, make sure you have the right to use and share this data.
The script is well-commented for clarity and reproducibility.
For questions or contributions, please open an issue or submit a pull request.

License

This project is provided for academic and research purposes. Please cite appropriately if used in publications.

