felix-reichel/BarianceVariance_Reproduction_Repo_Robustness_Supplements

Bariance and Variance Estimators: Reproduction Repository & Robustness Supplements // Speeding UP Your Sample Variance in Big-Data Environments!

Deterministic Reproduction of Results from https://doi.org/10.48550/arXiv.2503.22333

This repository contains a Python 3.11 script (run in a virtualized cloud environment) that deterministically simulates sample variance estimators with a general denominator a. It computes Bias², Variance, and Mean Squared Error (MSE), together with bootstrapped standard errors (SEs), as a robustness check for Ch. 7 (Simulation Study on MSE across Denominator Values) of v5.
It also contains empirical runtime plots, produced in a local Java SE 21 environment, that assess the performance of the Bariance estimator under varied data conditions, such as large-scale gamma-distributed samples. These plots support Ch. 9 (Empirical Runtime Analysis) and Appendix C (Java Runtime Robustness Checks) of v5.


Code

  • MSEVarEstimsSimulationsWithBootstrapedSEs.py
    Main simulation script. Evaluates multiple sample variance estimators on samples drawn from a standard normal distribution N(0,1) with a fixed seed. Calculates Bias², Variance, MSE, and bootstrapped confidence intervals. Requires Python 3.11.
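The kind of simulation described above can be sketched as follows. This is a minimal illustration and not the repository's script: the function name `simulate_mse`, the choice of denominators (n-1, n, n+1), the sample size, the number of trials, and the way the bootstrap resamples Monte Carlo replications are all assumptions made for this example.

```python
import numpy as np


def simulate_mse(n=10, trials=5000, sigma2=1.0, n_boot=200, seed=0):
    """Monte Carlo Bias^2, Variance and MSE of SS/a for several denominators a.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    rng = np.random.default_rng(seed)  # fixed seed -> deterministic results
    samples = rng.normal(0.0, np.sqrt(sigma2), size=(trials, n))

    # Sum of squared deviations from the sample mean, one value per trial.
    centered = samples - samples.mean(axis=1, keepdims=True)
    ss = np.sum(centered ** 2, axis=1)

    results = {}
    for a in (n - 1, n, n + 1):
        est = ss / a  # variance estimator with denominator a
        bias2 = (est.mean() - sigma2) ** 2
        variance = est.var()  # ddof=0, so that mse == bias2 + variance
        mse = np.mean((est - sigma2) ** 2)

        # Bootstrap SE of the MSE: resample the trials with replacement.
        idx = rng.integers(0, trials, size=(n_boot, trials))
        boot_mse = np.mean((est[idx] - sigma2) ** 2, axis=1)

        results[a] = {
            "bias2": bias2,
            "variance": variance,
            "mse": mse,
            "mse_se": boot_mse.std(ddof=1),
        }
    return results


if __name__ == "__main__":
    for a, r in simulate_mse().items():
        print(f"a={a}: bias2={r['bias2']:.5f}, var={r['variance']:.5f}, "
              f"mse={r['mse']:.5f} (SE {r['mse_se']:.5f})")
```

Using the population-style variance (`ddof=0`) across trials keeps the decomposition MSE = Bias² + Variance exact within the simulation, which makes the three reported quantities easy to cross-check.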

Figure: variance_estimators_real_data_ci_plot


Runtime Figures

Local Execution Environment:

  • Java Runtime: SE 21
  • OS: macOS 13.0
  • Architecture: aarch64
  • Cores: 10 (single-threaded benchmark)
  • Max JVM Memory: 4096 MB

These runtime plots replicate and extend the analysis in Ch. 9 and Appendix C (Robustness Checks) of v5:

  • ATT_Gamma_1k_trials
  • coef_plot_runtime_1k_trials_iqr_removed
  • coef_plot_runtime_100k_gamma_trials
  • coef_plot_runtime_100k_gamma_trialsv2
  • empRuntimeNormalUnseeded1_100trials
  • empRuntimeNormalUnseeded2_100trials
  • empRuntimeTestBigNormalUnseeded_100trials
  • empRuntimeTestSmallNormalUnseeded_100trials
  • FAST_5_GAMMA_DIST_GUCCI_Plot
  • FAST_5_GAMMA_Dist
  • gamma_1ktrials_n10k_runtime_density
  • Garbage_BigEightEstimatorsPlot
  • logLinearModelRegressions
  • LogLinearModelRuntimeRegressions
  • runtime_density_gamma_1k_grid_display__BEST_SO_FAR
  • runtime_density_gamma_100ktrials_grid
  • runtime_density_gamma_100ktrials_largest
  • runtime_density_gamma_highres_Seeded_with_mean_100trials
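The runtime comparison behind these plots can be illustrated with a small Python sketch (the benchmarks themselves were run in Java). This README does not define the Bariance, so the sketch assumes it is the average squared difference over all ordered pairs i ≠ j, and that the runtime-optimized form rewrites this with the scalar sums Σx and Σx²; check the paper for the exact definition and scaling. The function names and the gamma parameters are hypothetical.

```python
import time

import numpy as np


def bariance_pairwise(x):
    """Average squared difference over all ordered pairs i != j (O(n^2)).

    Assumed definition; not taken from this README.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    diffs = x[:, None] - x[None, :]  # n x n matrix, zeros on the diagonal
    return np.sum(diffs ** 2) / (n * (n - 1))


def bariance_scalar_sums(x):
    """Same quantity from two scalar sums (O(n)).

    Uses sum_{i != j} (x_i - x_j)^2 = 2 * (n * sum(x^2) - (sum x)^2).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    s1 = x.sum()
    s2 = np.dot(x, x)
    return 2.0 * (n * s2 - s1 ** 2) / (n * (n - 1))


def best_time(fn, x, repeats=5):
    """Smallest wall-clock time of `fn(x)` over a few repeats, in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(x)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.gamma(shape=2.0, scale=1.0, size=2000)  # illustrative parameters
    print("pairwise   :", bariance_pairwise(x), best_time(bariance_pairwise, x))
    print("scalar sums:", bariance_scalar_sums(x), best_time(bariance_scalar_sums, x))
```

Under this assumed definition both functions return twice the unbiased sample variance. The scalar-sum form avoids the n × n difference matrix, but it subtracts two large numbers and can lose precision when the mean is large relative to the spread.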


Requirements

🐍 Python 3.11+

Install dependencies via:

pip install numpy matplotlib seaborn scipy

♨️ Java 21+

Java SE 21 (the version used for the runtime benchmarks above)

About

Replication Package for On Bessel's Correction: Unbiased Sample Variance, the Bariance, and a Novel Runtime-Optimized Estimator, DOI: https://doi.org/10.48550/arXiv.2503.22333
