Bariance and Variance Estimators: Reproduction Repository & Robustness Supplements // Speeding UP Your Sample Variance in Big-Data Environments!
Deterministic Reproduction of Results from https://doi.org/10.48550/arXiv.2503.22333
This repository contains a Python 3.11 script (executed in a virtualized cloud environment) for deterministic simulation of sample variance estimators with a general denominator a. It computes Bias², Variance, and Mean Squared Error (MSE), together with bootstrapped standard errors (SEs), as a robustness check supporting Ch. 7 (Simulation Study on MSE across Denominator Values) of v5.
Additionally, empirical runtime plots assess the performance of the Bariance estimator under varied data conditions, such as large-scale gamma-distributed samples. The benchmarks were run in a local Java SE 21 environment and support Ch. 9 (Empirical Runtime Analysis) and Appendix C (Java Runtime Robustness Checks) of v5.
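To give a feel for what such a runtime comparison measures, here is a rough Python sketch. It is not the Java benchmark code from the paper. It assumes that Bariance is the average of squared differences over all ordered pairs i ≠ j; the paper's exact definition and normalization may differ. Under that assumption the quantity reduces algebraically to two scalar sums and equals twice the unbiased sample variance. The function names `bariance_naive` and `bariance_scalar_sums` and the gamma parameters are hypothetical.

```python
import time

import numpy as np


def bariance_naive(x):
    """O(n^2) version: average of (x_i - x_j)^2 over all ordered pairs i != j.

    Assumed definition, for illustration only.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    diffs = x[:, None] - x[None, :]  # n x n matrix of pairwise differences
    return (diffs ** 2).sum() / (n * (n - 1))  # diagonal terms are zero


def bariance_scalar_sums(x):
    """O(n) version of the same quantity, using only sum(x) and sum(x^2).

    Uses sum_{i != j} (x_i - x_j)^2 = 2 * n * sum(x^2) - 2 * (sum(x))^2.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    s1 = x.sum()
    s2 = (x ** 2).sum()
    return 2.0 * (n * s2 - s1 ** 2) / (n * (n - 1))


if __name__ == "__main__":
    rng = np.random.default_rng(42)  # fixed seed for reproducibility
    sample = rng.gamma(shape=2.0, scale=1.0, size=2_000)  # illustrative parameters

    for fn in (bariance_naive, bariance_scalar_sums):
        start = time.perf_counter()
        value = fn(sample)
        elapsed = time.perf_counter() - start
        print(f"{fn.__name__}: value={value:.6f}, seconds={elapsed:.6f}")
```

The pairwise version builds an n × n matrix, so its memory and time grow quadratically with the sample size, while the scalar-sum version makes a single pass over the data. The scalar-sum form subtracts two large numbers and can lose precision when the mean is large relative to the spread.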
Code
MSEVarEstimsSimulationsWithBootstrapedSEs.py
Main simulation script. Evaluates multiple sample variance estimators on fixed-seed draws from the standard normal distribution N(0,1). Calculates Bias², Variance, MSE, and bootstrapped confidence intervals. Written for Python 3.11.
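The kind of computation described above can be sketched as follows. This is a minimal illustration, not the repository script. It assumes the estimators have the form (1/a) Σ (x_i − x̄)² for a denominator a, and that the bootstrap resamples the Monte Carlo replicates. The function name `simulate_mse` and the parameters `n`, `reps`, `n_boot`, and `seed` are hypothetical.

```python
import numpy as np


def simulate_mse(n=10, denominators=None, reps=20_000, n_boot=200, seed=12345):
    """Monte Carlo Bias^2, Variance and MSE of SS/a under N(0,1).

    Hypothetical helper; SS is the sum of squared deviations from the sample mean.
    """
    rng = np.random.default_rng(seed)  # fixed seed gives deterministic output
    if denominators is None:
        denominators = (n - 1, n, n + 1)

    true_var = 1.0  # variance of N(0,1)
    samples = rng.standard_normal((reps, n))
    centered = samples - samples.mean(axis=1, keepdims=True)
    ss = (centered ** 2).sum(axis=1)  # one sum of squares per replicate

    results = {}
    for a in denominators:
        est = ss / a
        sq_err = (est - true_var) ** 2
        # Bootstrap SE of the MSE: resample replicates with replacement
        # (an assumed scheme, chosen for illustration).
        idx = rng.integers(0, reps, size=(n_boot, reps))
        boot_mse = sq_err[idx].mean(axis=1)
        results[a] = {
            "bias_sq": float((est.mean() - true_var) ** 2),
            "variance": float(est.var()),  # ddof=0, so MSE = bias^2 + variance
            "mse": float(sq_err.mean()),
            "mse_se": float(boot_mse.std(ddof=1)),
        }
    return results


if __name__ == "__main__":
    for a, row in simulate_mse().items():
        print(a, row)
```

For normally distributed data the denominator a = n + 1 minimizes the MSE in theory, so with enough replicates the simulated MSE should be smallest for that choice, while a = n − 1 gives a Bias² close to zero.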
Runtime Figures
- Java Runtime: SE 21
- OS: macOS 13.0
- Architecture: aarch64
- Cores: 10 (single-threaded benchmark)
- Max JVM Memory: 4096 MB
These runtime plots replicate and extend the analysis in Ch. 9 and Appendix C (Robustness Checks) of v5.
Requirements
Install dependencies via:
pip install numpy matplotlib seaborn scipy
Java SE 21 (for the runtime benchmarks)