This repository contains the Nebraska subset of the 2023 Behavioral Risk Factor Surveillance System (BRFSS). The BRFSS is run annually by the Centers for Disease Control and Prevention (CDC).
You can load the data with `brfss <- readRDS("brfssNE2023.rds")`.
The dataset contains 12886 observations on 350 variables.
The codebook for the full data is available from
https://www.cdc.gov/brfss/annual_data/annual_2023.html. A copy of the
site is included in the repo in the file `USCODE23_LLCP_021924.HTML`.
Variable names starting with `X_` in the R data are listed in the
codebook without the leading `X`, i.e. the variable `X_AGEG5YR` can be
found as `_AGEG5YR` in the codebook.
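If you need to look up several variables, the renaming rule above can be sketched in a few lines of R. The two names are the ones mentioned in this README. The last line is only an observation that base R's `make.names()` produces the same `X` prefix; that this is how the prefix arose in the data is an assumption.

```r
# Names as they appear in the R data (both taken from this README).
r_names <- c("X_AGEG5YR", "X_LLCPWT")

# Drop the leading "X" to get the name used in the codebook.
codebook_names <- sub("^X_", "_", r_names)
codebook_names

# Going the other way: make.names() prepends "X" to names that are not
# syntactically valid in R, such as names starting with an underscore.
make.names(codebook_names)
```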
- Clone this repository to your local machine and open RStudio using the file `reproducibility-brfss.Rproj`.
- Create a Quarto document `index.qmd`. This is the file in which you should include all of your work.
- Use the `brfss` data described above to find the age distribution of all Nebraskans who participated in the BRFSS survey. Use the variable `X_AGEG5YR` and show the distribution in a bar chart and a table. Make sure to address in a paragraph how you deal with nonresponse.
- Is the age distribution of Nebraskans significantly different from the nationally reported age distribution? (You could run a Chi-square test of homogeneity using `chisq.test`.) Make sure to interpret the results.
- Re-do the analysis in questions 2 and 3 by considering the survey weights `X_LLCPWT`. Again, interpret the results. How do you explain the differences?
- Make sure that your file `index.qmd` contains all the details needed for me to re-run your analysis.
- Ensure that the file renders without an error.

Add your file `index.qmd` to the repository, commit, and push.
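
The following is a minimal sketch of the mechanics behind the analysis steps, not a solution. It runs on simulated data, so none of the numbers are BRFSS results. The data frame `demo`, the placeholder `national_counts`, and the choice to drop code 14 are all assumptions made for illustration; check the codebook for the actual meaning of each `X_AGEG5YR` code and justify your own treatment of nonresponse.

```r
set.seed(2023)

# Simulated stand-in for the real data (hypothetical values only).
# Assumption: codes 1-13 are five-year age groups and code 14 means
# "Don't know / Refused / Missing"; verify this in the codebook.
n <- 500
demo <- data.frame(
  X_AGEG5YR = sample(1:14, n, replace = TRUE),
  X_LLCPWT  = runif(n, min = 50, max = 500)
)

# One way to handle nonresponse: count it, report it, then drop it.
n_missing <- sum(demo$X_AGEG5YR == 14)
resp <- demo[demo$X_AGEG5YR != 14, ]
resp$age <- factor(resp$X_AGEG5YR, levels = 1:13)

# Unweighted distribution as a table and a bar chart.
counts <- table(resp$age)
props <- prop.table(counts)
barplot(props, xlab = "Age group code (X_AGEG5YR)", ylab = "Proportion")

# Test of homogeneity against a national sample.
# national_counts is a placeholder; replace it with real national counts.
national_counts <- rep(1000, 13)
two_way <- rbind(nebraska = as.vector(counts), national = national_counts)
test <- chisq.test(two_way)
test

# Weighted distribution: sum the survey weights within each age group.
w_totals <- xtabs(X_LLCPWT ~ age, data = resp)
w_props <- prop.table(w_totals)
round(cbind(unweighted = as.vector(props), weighted = as.vector(w_props)), 3)
```

Note that passing weighted totals straight into `chisq.test` would treat the estimated population size as the sample size and overstate the evidence. A design-based test, for example `svychisq` from the `survey` package, is one option for the weighted comparison; it is left out here to keep the sketch free of dependencies.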