Bayesian Stochastic Frontier Models (SN-HN) with PyMC

This repository provides an implementation of Bayesian stochastic frontier models under the Skew-Normal – Half-Normal (SN-HN) error structure, as described in:

Wei et al. (2025)
"Bayesian stochastic frontier models under the skew-normal half-normal settings"
Journal of Productivity Analysis
DOI: 10.1007/s11123-025-00757-3

📌 Overview

Our contributions are threefold:

Replicate the SN-HN stochastic frontier model in PyMC, providing an accessible and transparent implementation for Python users.
Apply the model to simulated production data generated from an SN-HN distribution and evaluate its performance through posterior analysis.
Compare the model proposed by Wei et al. (2025), which employs a skew-normal likelihood, with a conventional model based on a normal likelihood, to assess the impact of distributional asymmetry on posterior inference.

🧠 Model Structure

The stochastic frontier model is specified as:

$$ Y = \alpha + \beta X + V - U $$

$Y$ represents the logarithm of the output variable
$X$ denotes the logarithm of a single explanatory variable
$\alpha$ is the intercept and $\beta$ is the regression coefficient
$U$ represents the inefficiency error term
$V$ is the measurement error term
$\xi$ is the location parameter of the skew-normal distribution of $Y$

$$ X \sim N(1,1)\\ U \sim HN(0,\sigma_u^2)\\ Y \sim SN(\xi,\sigma_v^2,\lambda)\\ \xi = \alpha + \beta X - U $$
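
As a rough illustration of the likelihood implied by $Y \sim SN(\xi,\sigma_v^2,\lambda)$, the sketch below evaluates the standard (Azzalini) skew-normal log-density, $\log 2 - \log\sigma_v + \log\phi(z) + \log\Phi(\lambda z)$ with $z = (y-\xi)/\sigma_v$. The function name `skew_normal_logpdf` is hypothetical and is not taken from the repository's notebooks.

```python
import math


def skew_normal_logpdf(y, xi, sigma_v, lam):
    """Log-density of SN(xi, sigma_v^2, lam) at y (illustrative helper)."""
    z = (y - xi) / sigma_v
    # log of the standard normal pdf at z
    log_phi = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
    # standard normal cdf at lam * z, written with erfc
    cdf = 0.5 * math.erfc(-lam * z / math.sqrt(2.0))
    # Note: cdf can underflow to 0 for extreme lam * z; a production
    # implementation would use a numerically stable log-cdf instead.
    return math.log(2.0) - math.log(sigma_v) + log_phi + math.log(cdf)
```

With `lam = 0` the expression reduces to the ordinary normal log-density; a negative `lam` shifts mass to the left of `xi`.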

The true parameter values used in simulation are: $\alpha = 5, \beta = 2, \sigma_u = 1, \sigma_v = 1.5$
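
The data-generating process above can be sketched in plain Python as follows. This is a minimal illustration, not the notebooks' code: the function name `simulate_sn_hn` and the `seed` argument are assumptions, and the skew-normal draw uses the standard representation $Z = \delta|Z_0| + \sqrt{1-\delta^2}\,Z_1$ with $\delta = \lambda/\sqrt{1+\lambda^2}$.

```python
import math
import random


def simulate_sn_hn(n, lam, alpha=5.0, beta=2.0, sigma_u=1.0, sigma_v=1.5, seed=0):
    """Draw n tuples (x, u, y) from the SN-HN frontier model (illustrative)."""
    rng = random.Random(seed)
    delta = lam / math.sqrt(1.0 + lam * lam)
    rows = []
    for _ in range(n):
        x = rng.gauss(1.0, 1.0)              # X ~ N(1, 1)
        u = abs(rng.gauss(0.0, sigma_u))     # U ~ HN(0, sigma_u^2)
        # Standard skew-normal draw with shape parameter lam
        z = delta * abs(rng.gauss(0.0, 1.0)) + math.sqrt(1.0 - delta * delta) * rng.gauss(0.0, 1.0)
        xi = alpha + beta * x - u            # location parameter
        y = xi + sigma_v * z                 # Y ~ SN(xi, sigma_v^2, lam)
        rows.append((x, u, y))
    return rows


data = simulate_sn_hn(n=200, lam=-1.5, seed=1)
```

Keeping the true `u` alongside `x` and `y` makes it possible to compare estimated inefficiencies against their simulated values later.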

Prior Distributions

For the Bayesian analysis, we adopt the following prior distributions:

$$ \alpha \sim N(\mu_\alpha, \sigma_\alpha^2)\\ \beta \sim N(\mu_\beta, \sigma_\beta^2)\\ \sigma_v^2 \sim IG(\alpha_v,\beta_v)\\ \sigma_u^2 \sim IG(\alpha_u,\beta_u)\\ \lambda \sim TN(\mu_\lambda,\sigma_\lambda^2;a,b)$$

To assess the robustness of posterior inference for λ, we simulate data across the following scenarios:

Skewness values: λ∈{−0.5,−1,−1.5,−2,−5}
Sample sizes: n∈{50,100,200,500}

This yields 20 simulation scenarios in total. For each scenario, n synthetic observations are generated and used as input for Bayesian inference.
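
The scenario grid can be enumerated with a few lines of Python. This is only a sketch of the bookkeeping; the names `LAMBDAS`, `SAMPLE_SIZES`, and `scenarios` are illustrative and do not come from the notebooks.

```python
from itertools import product

LAMBDAS = (-0.5, -1.0, -1.5, -2.0, -5.0)   # skewness values
SAMPLE_SIZES = (50, 100, 200, 500)         # sample sizes n

# One dictionary per (lambda, n) combination: 5 x 4 = 20 scenarios
scenarios = [{"lam": lam, "n": n} for lam, n in product(LAMBDAS, SAMPLE_SIZES)]
```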

📁 Repository Structure

.
├── Notebooks/                    # Per-lambda simulation runs (0.5 to 5)
├── Figures/                      # Posterior TE plots (uploaded separately)
├── Output/                       # All result outputs from the experiments, including:
│   ├── 1.posterior summaries     # Posterior parameter summaries (mean, sd, etc.)
│   ├── 2.traceplots              # Traceplots and posterior distributions
│   ├── 3.loo                     # Leave-One-Out cross-validation results
│   ├── 4.rmse_param              # RMSE for estimated model parameters
│   ├── 5.y_pred_rmse             # RMSE for predicted y values
│   └── 6.inefficiency_rmse       # RMSE for inefficiency term (u)
├── Docs/                         # Project report (PDF) summarizing the Bayesian stochastic frontier modeling approach, results, and key findings
├── LICENSE                       # MIT License
├── requirements.txt              # Python environment
└── README.md                     # This file

🔬 Experiments

We evaluate model behavior under five levels of skewness:

| λ | Notebook Filename | Description |
|---|---|---|
| −0.5 | `Final-lam-0.5_simplified.ipynb` | Mild skew |
| −1.0 | `Final-lam-1_simplified.ipynb` | Moderate skew |
| −1.5 | `Final-lam-1.5_simplified.ipynb` | Strong skew |
| −2.0 | `Final-lam-2_simplified.ipynb` | Heavy skew |
| −5.0 | `Final-lam-5_simplified.ipynb` | Extreme skew (hard case) |

Each notebook includes:

Posterior traceplots
Inference results for $\alpha, \beta, \sigma_v, \sigma_u, \lambda$
RMSE comparisons for Skew-Normal vs Normal
Posterior summaries of all parameters
MCMC convergence diagnostics

📈 Trace plot results

Trace plot of parameter posteriors across different skewness levels (λ):

[Posterior traceplot figures for λ = −0.5, −1.0, −1.5, −2.0, and −5.0]

These plots show how skewness affects the posterior of technical efficiency (TE). As λ becomes more negative, the posterior widens, indicating greater uncertainty and underscoring the importance of using a skew-normal likelihood.

📊 RMSE Table for Inefficiency Term (u)

✅ Key Findings

Wei et al.'s Bayesian linear model with a skew-normal half-normal likelihood performs comparably to the standard normal model on LOO and RMSE metrics.
For λ far from zero (e.g., −1.5, −2, −5), the Wei et al. model estimates the parameters and the inefficiency term more accurately when the posterior mean is used as the point estimate.
This is reflected in lower RMSE for the posterior mean inefficiency and parameter estimates than under the model with the normal likelihood.
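
For reference, the RMSE of posterior mean inefficiency mentioned above can be computed along the following lines. This is a sketch under the assumption that posterior draws of `u` are available as a list of draws, each holding one value per observation; the helper names `posterior_mean` and `rmse` are hypothetical and no results from the experiments are reproduced here.

```python
import math


def posterior_mean(draws):
    """Average posterior draws per observation (draws: list of equal-length lists)."""
    n_draws = len(draws)
    return [sum(column) / n_draws for column in zip(*draws)]


def rmse(estimates, truths):
    """Root mean squared error between point estimates and true values."""
    if len(estimates) != len(truths) or not truths:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((e - t) ** 2 for e, t in zip(estimates, truths))
    return math.sqrt(squared / len(truths))
```

Applying `rmse(posterior_mean(u_draws), u_true)` to both the skew-normal and the normal-likelihood fits gives the kind of side-by-side comparison summarized in the findings.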

💻 Setup

To install dependencies:

pip install -r requirements.txt

Run simulations:

# Example: run simulation with λ = -1.5
jupyter notebook Notebooks/Final-lam-1.5_simplified.ipynb

📄 Citation

If you use this code, please cite:

@article{wei2025bayesian,
  title={Bayesian stochastic frontier models under the skew-normal half-normal settings},
  author={Wei, Zheng and Choy, S.T. Boris and Wang, Tonghui and Zhu, Xiaonan},
  journal={Journal of Productivity Analysis},
  year={2025},
  doi={10.1007/s11123-025-00757-3}
}

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.

📬 Contact

For academic questions:
📧 zheng.wei@tamucc.edu 📧 zzapata2@tamucc.edu 📧 cliu7@tamucc.edu 📧 yhwang@tamucc.edu

Yejin-Hwang/Bayesian-stochastic-frontier-models-under-the-skew-normal-half-normal-settings-with-PyMC
