A statistical data science case study using R to explore environmental trends.
This project investigates the relationship between temperature and groundwater level (GWL) in Riau Province, Indonesia, using Simple Linear Regression in R. It serves as a statistical case study and environmental analysis that intersects climate science and data analytics.
- Climate and hydrological data from Riau Province (2019), sourced from open-access academic datasets
- Variables: `Suhu` (temperature), `GWL` (groundwater level)
The objective is to evaluate whether variations in temperature significantly affect groundwater levels in Riau, using regression modeling, correlation analysis, and hypothesis testing.
analisis-gwl-riau/
├─ data.csv    # Dataset with temperature and GWL
├─ R.Rmd       # R Markdown for reproducible analysis
└─ README.md   # Project documentation
- R & RStudio
- `ggplot2`: Visualization
- `lmtest`: Regression diagnostics (e.g., Breusch-Pagan test)
- `stats`: Linear regression model
- Pearson correlation coefficient: r = 0.035 → very weak correlation
- Regression model: `GWL = 29.45 + 1.11 * Suhu`
- R-squared: very low (in simple linear regression R² = r², so roughly 0.001 for r = 0.035) → temperature is not a strong explanatory variable
- p-value < 0.05 → model statistically significant despite weak fit
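The results listed here can be read together with a short sketch. It first evaluates the reported equation at an arbitrary temperature, then uses simulated stand-in data (an assumption, not the project's `data.csv`) to illustrate two general properties of simple linear regression: R-squared equals the squared Pearson correlation, and the slope's t-test gives the same p-value as `cor.test()`. This is also why a weak correlation can still come out statistically significant when the sample is large. The names `predict_gwl`, `sim`, and `fit` are hypothetical.

```r
# Reported model from the results: GWL = 29.45 + 1.11 * Suhu
predict_gwl <- function(suhu) 29.45 + 1.11 * suhu
predict_gwl(30)  # 29.45 + 33.3 = 62.75

# Simulated stand-in data (an assumption; NOT the project's data.csv)
set.seed(42)
n <- 5000
sim <- data.frame(Suhu = rnorm(n, mean = 27, sd = 2))
sim$GWL <- 30 + 0.5 * sim$Suhu + rnorm(n, sd = 10)

fit <- lm(GWL ~ Suhu, data = sim)
r <- cor(sim$Suhu, sim$GWL)
r2 <- summary(fit)$r.squared

# With one predictor and an intercept, R-squared equals the squared Pearson r
isTRUE(all.equal(r^2, r2))

# The slope t-test and the Pearson correlation test share one p-value
p_slope <- summary(fit)$coefficients["Suhu", "Pr(>|t|)"]
p_cor <- cor.test(sim$Suhu, sim$GWL)$p.value
isTRUE(all.equal(p_slope, p_cor, tolerance = 1e-6))
```

Running the same checks on the real dataset would show whether the reported r, R-squared, and p-value are mutually consistent for its sample size.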
Figures: scatter plot with regression line; residual diagnostic plot.
# Load libraries
library(ggplot2)
library(lmtest)
# Import data
data <- read.csv("data.csv")
# Regression model
model <- lm(GWL ~ Suhu, data = data)
summary(model)
# Correlation
cor(data$Suhu, data$GWL)
# Plot
ggplot(data, aes(x = Suhu, y = GWL)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "red")
# R-squared
summary(model)$r.squared
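The listing above loads `lmtest` but does not call it, although the project lists Breusch-Pagan diagnostics. A minimal sketch of how that step might look follows. It fits the same formula on simulated stand-in data so that it runs without `data.csv`. The objects `sim`, `sim_model`, `bp`, `diag_df`, and `resid_plot` are hypothetical names, and this is not necessarily the code in `R.Rmd`.

```r
library(ggplot2)
library(lmtest)

# Simulated stand-in data (an assumption; replace with the real data frame)
set.seed(1)
sim <- data.frame(Suhu = runif(200, min = 24, max = 34))
sim$GWL <- 29 + 1.1 * sim$Suhu + rnorm(200, sd = 5)
sim_model <- lm(GWL ~ Suhu, data = sim)

# Breusch-Pagan test: the null hypothesis is constant residual variance,
# so a small p-value suggests heteroskedasticity
bp <- bptest(sim_model)
print(bp)

# Residuals vs fitted values, a common visual check of the same assumption
diag_df <- data.frame(fitted = fitted(sim_model), resid = resid(sim_model))
resid_plot <- ggplot(diag_df, aes(x = fitted, y = resid)) +
  geom_point() +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(x = "Fitted GWL", y = "Residual")
print(resid_plot)
```

With the real data, passing the existing `model` object to `bptest()` in place of `sim_model` should be enough.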
- Statistical Modeling (Simple Linear Regression)
- Correlation Analysis & R-squared Interpretation
- Exploratory Data Analysis (EDA)
- Residual Diagnostics (Breusch-Pagan)
- Reproducible Research using R Markdown
MIT License
© 2025 Julio-analyst