There is substantial debate about the role that education spending plays in students' ability to succeed. Some argue that increased spending is an important way to improve educational outcomes; others argue that spending is not an effective tool and that educational institutions should pursue other strategies to help their students. Spending has become a significant component of the public debate over education policy and must be addressed. To understand whether increased spending can improve performance, we must assess school performance on the basis of spending and see whether increased spending leads to better student outcomes. This will allow us to identify whether increasing spending is a worthy public policy goal.
The dataset used for this research project is a Kaggle dataset compiled from data provided by the National Center for Education Statistics, the US Census Bureau, and the NAEP Nation's Report Card. The dataset provides information on spending, student enrollment, and student test scores in reading and math for all US states from 1992 to 2015. In total, the dataset contains 1,715 observations across 25 fields, for a total of 42,875 data points. First, I dropped observations with missing values, which removed all observations prior to 2003, as each of these observations had some missing data points. I then generated spending-per-student fields to allow for comparisons across states. This yielded the dataset that I used for the hypothesis testing and linear regression analysis. Summary statistics and data visualizations for this dataset are in the figures folder. For images showing the distribution of student performance and spending per student, see figures 1-6.
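The two cleaning steps described above (dropping incomplete observations, then deriving spending per student) can be sketched roughly as follows. This is a standalone Python illustration, not the project's script.r code; the field names `total_expenditure`, `enrollment`, and `math_4`, and the three toy rows, are assumptions made up for the example.

```python
# Toy rows standing in for the real data; field names and values are hypothetical.
rows = [
    {"state": "A", "year": 2003, "total_expenditure": 5_000_000, "enrollment": 500, "math_4": 230.0},
    {"state": "B", "year": 2003, "total_expenditure": 9_000_000, "enrollment": 600, "math_4": None},
    {"state": "C", "year": 2005, "total_expenditure": 8_400_000, "enrollment": 700, "math_4": 238.0},
]


def drop_incomplete(records):
    """Keep only records in which every field has a value."""
    return [r for r in records if all(v is not None for v in r.values())]


def add_spending_per_student(records):
    """Return copies of the records with a per-student spending field added."""
    return [
        {**r, "spending_per_student": r["total_expenditure"] / r["enrollment"]}
        for r in records
    ]


clean = add_spending_per_student(drop_incomplete(rows))
for r in clean:
    print(r["state"], r["year"], r["spending_per_student"])
```

Dividing by enrollment is what makes large and small states comparable: the row for state "B" is dropped because its score is missing, and the remaining rows are compared on dollars per student rather than on total budget.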
To evaluate whether increased spending improves student test performance, I began by asking a simpler question: do schools with higher spending per student outperform schools with lower spending per student? To answer this question, I split the dataset into two subgroups, observations with spending levels below the mean and observations with spending levels above the mean, then tested the hypothesis that those above the mean have better test outcomes across the following four categories: 4th Grade reading scores, 4th Grade math scores, 8th Grade reading scores, and 8th Grade math scores. In each of these tests, the higher-spending schools had better test scores than the lower-spending schools, with p-values well below the threshold of 0.05.
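The split-and-compare procedure described above might look something like the Python sketch below. The summary does not say which test was used, so Welch's two-sample t-test is an assumption here, and the p-value uses a normal approximation that is only reasonable for large groups. The numbers are made up to show the mechanics and are not results from the dataset.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def split_by_mean(spending, scores):
    """Split scores into (at-or-below-mean, above-mean) groups by spending."""
    cutoff = mean(spending)
    low = [s for x, s in zip(spending, scores) if x <= cutoff]
    high = [s for x, s in zip(spending, scores) if x > cutoff]
    return low, high


def welch_t(high, low):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    vh = variance(high) / len(high)
    vl = variance(low) / len(low)
    t = (mean(high) - mean(low)) / sqrt(vh + vl)
    df = (vh + vl) ** 2 / (vh ** 2 / (len(high) - 1) + vl ** 2 / (len(low) - 1))
    return t, df


# Hypothetical spending per student (thousands of dollars) and test scores.
spending = [8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
scores = [236.0, 238.0, 237.0, 239.0, 240.0, 242.0, 241.0, 243.0]

low, high = split_by_mean(spending, scores)
t, df = welch_t(high, low)

# One-sided p-value for H1 "the high-spending mean is larger".
# Normal approximation: a real analysis would use the t distribution with df.
p_approx = 1.0 - NormalDist().cdf(t)
print(f"t = {t:.2f}, df = {df:.1f}, approximate one-sided p = {p_approx:.6f}")
```

The test is one-sided because the hypothesis is directional (above-mean spending is expected to score higher, not merely differently). The same function would be called once per score category.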
To move on to the more complicated question of what role spending plays in determining student outcomes, I ran several linear regressions, using spending as the independent variable and test scores as the dependent variable. Each regression shows a very modest positive coefficient but a low R^2 (close to 0), indicating that the relationship between spending and student performance is weak. For more details regarding these regressions, see figures 7-10.
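To make the two quantities in this paragraph concrete, here is a minimal Python sketch of a one-variable least-squares fit that reports the slope and R^2. It is an illustration only, not the regression code from script.r, and the data points are invented so that the fit shows a small positive slope with an R^2 close to 0.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x, plus R^2."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical spending per student (thousands of dollars) and test scores.
x = [8.0, 10.0, 12.0, 14.0, 16.0]
y = [240.0, 236.0, 243.0, 235.0, 241.0]

slope, intercept, r_squared = fit_line(x, y)
print(f"slope = {slope:.3f}, intercept = {intercept:.1f}, R^2 = {r_squared:.4f}")
```

A positive slope together with a near-zero R^2 means the fitted line tilts upward, but knowing a school's spending barely reduces the uncertainty about its scores.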
Several conclusions can be derived from these analyses:
- High-spending schools outperform low-spending schools
- The relationship between spending and performance follows a concave parabola, meaning that moderately high-spending schools outperform both low-spending and very high-spending schools
- School spending accounts for very little of the variance in student outcomes
- Because student outcomes tend to have only a weak relationship with spending, increased spending is unlikely to be the optimal policy for improving student performance
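The summary does not describe how the concave-parabola pattern was identified (the regressions described are linear). One way such a pattern could be checked is by fitting a quadratic and looking at the sign of the squared term. The Python sketch below does this on invented, noise-free data; the function name and the values are assumptions for illustration only.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def fit_quadratic(x, y):
    """Least-squares fit of y = a + b*x + c*x**2 via the normal equations."""
    s = [sum(xi ** k for xi in x) for k in range(5)]  # sums of x^0 .. x^4
    rhs = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    m = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    d = det3(m)
    coeffs = []
    for col in range(3):
        # Cramer's rule: replace one column with the right-hand side.
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = rhs[r]
        coeffs.append(det3(mc) / d)
    return tuple(coeffs)


# Hypothetical spending per student (thousands of dollars) and test scores.
x = [8.0, 10.0, 12.0, 14.0, 16.0]
y = [232.0, 238.0, 240.0, 238.0, 232.0]

a, b, c = fit_quadratic(x, y)
peak = -b / (2 * c)  # spending level at which the fitted curve is highest
print(f"c = {c:.3f} (negative means concave), peak at x = {peak:.1f}")
```

A negative coefficient on the squared term indicates a curve that rises and then falls, and `-b / (2 * c)` gives the spending level at which the fitted scores peak.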
For all the details regarding the code behind this project, see the script.r file, and for the data visualizations referenced in this summary, see the figures folder.
- Fig1: Histogram of spending per student
- Fig2: Box and whisker plot of student test performance across the 4th and 8th Grade math and reading categories
- Fig3: 4th Grade reading histogram
- Fig4: 4th Grade math histogram
- Fig5: 8th Grade math histogram
- Fig6: 8th Grade reading histogram
- Fig7: Linear regression/scatterplot 4th Grade math
- Fig8: Linear regression/scatterplot 4th Grade reading
- Fig9: Linear regression/scatterplot 8th Grade math
- Fig10: Linear regression/scatterplot 8th Grade reading