updated CmdStan and Ref-manual for rank-normalized split R-hat and ESS

mitzimorris · mitzimorris · commit c127edc94e87 · 2024-11-21T21:44:33.000-05:00
diff --git a/src/bibtex/all.bib b/src/bibtex/all.bib
@@ -1867,6 +1867,7 @@ @article{Magnusson+etal:2024:posteriordb
   author={Magnusson, M{\aa}ns and Torgander, Jakob and B{\"u}rkner, Paul-Christian and Zhang, Lu and Carpenter, Bob and Vehtari, Aki},
   journal={arXiv preprint arXiv:2407.04967},
   year={2024}
+}
 
 @article{egozcue+etal:2003,
   title={Isometric logratio transformations for compositional data analysis},
diff --git a/src/cmdstan-guide/stansummary.qmd b/src/cmdstan-guide/stansummary.qmd
@@ -12,16 +12,17 @@ diagnostic statistics on the sampler chains, reported in the following order:
 
 - Mean - sample mean
 - MCSE - Monte Carlo Standard Error, a measure of the amount of noise in the sample
-- StdDev - sample standard deviation
+- StdDev - sample standard deviation - the spread around the sample mean.
+- MAD - Median Absolute Deviation - the spread around the sample median.
 - Quantiles - default 5%, 50%, 95%
-- N_eff - effective sample size - the number of independent draws in the sample
-- N_eff/S - the number of independent draws per second
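+- ESS_bulk - effective sample size for the bulk of the distribution
+- ESS_tail - effective sample size for the tails of the distribution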
 - R_hat - $\hat{R}$ statistic, a measure of chain equilibrium, must be within $0.05$ of  $1.0$.
 
 When reviewing the `stansummary` output, it is important to check the final three
 output columns first - these are the diagnostic statistics on chain convergence and
 number of independent draws in the sample.
-A $\hat{R}$ statistic of greater than $1.05$ indicates that the chain has not converged and
+A $\hat{R}$ statistic of greater than $1.01$ indicates that the chain has not converged and
 therefore the sample is not drawn from the posterior, thus the estimates of the mean and
 all other summary statistics are invalid.
 
@@ -34,12 +35,16 @@ For more information, see the
 [Posterior Analysis](https://mc-stan.org/docs/reference-manual/analysis.html)
 chapter of the Stan Reference Manual which describes both the theory and practice of MCMC
 estimation techniques.
+
+The summary statistics - Mean, StdDev, MAD, and Quantiles - are computed directly from all draws across all chains.
+The diagnostic statistics - MCSE, ESS_bulk, ESS_tail, and R_hat - are computed from the rank-normalized,
+folded chains according to the definitions in @Vehtari+etal:2021:Rhat.
 The summary statistics and the algorithms used to compute them are described in sections
 [Notation for samples](https://mc-stan.org/docs/reference-manual/analysis.html#notation-for-samples-chains-and-draws)
 and
 [Effective Sample Size](https://mc-stan.org/docs/reference-manual/analysis.html#effective-sample-size.section).
 
-## Building the stansummary command
+## Building the `stansummary` command
 
 The CmdStan makefile task `build` compiles the `stansummary` utility
 into the `bin` directory.
diff --git a/src/reference-manual/analysis.qmd b/src/reference-manual/analysis.qmd
@@ -190,8 +190,27 @@ because the first half of each chain has not mixed with the second
 half.
 
 
-### Convergence is global {-}
+### Rank-normalization helps when there are heavy tails {-}
+
+Split R-hat and the effective sample size (ESS) are well defined only if
+the marginal posteriors have finite mean and variance.
+Therefore, following @Vehtari+etal:2021:Rhat, we compute the rank-normalized
+parameter values and then feed them into the formulas for split R-hat and ESS.
+
+Rank normalization proceeds as follows:
+
+* First, replace each value $\theta^{(nm)}$ by its rank $r^{(nm)}$ within the pooled
+draws from all chains. Average ranks are used for ties to conserve
+the number of unique values of discrete quantities.
+
+* Second, transform ranks to normal scores using the inverse normal transformation
+and a fractional offset:
 
+$$
+z^{(nm)} = \Phi^{-1} \left( \frac{r^{(nm)} - 3/8}{S - 1/4} \right)
+$$
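
A minimal sketch of the two rank-normalization steps above, written in plain Python for illustration only; it is not the CmdStan or Stan implementation. It assumes that $S$ is the total number of pooled draws across all chains, and the helper names `average_ranks` and `rank_normalize` are hypothetical. The fractional offset is taken exactly as written in the formula above.

```python
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with the value at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_normalize(chains):
    """Rank-normalize draws given as a list of chains (each a list of floats).

    Assumption: S is the total number of draws pooled over all chains.
    """
    pooled = [x for chain in chains for x in chain]
    S = len(pooled)
    ranks = average_ranks(pooled)
    inv_cdf = NormalDist().inv_cdf  # inverse standard normal CDF
    z = [inv_cdf((r - 3 / 8) / (S - 1 / 4)) for r in ranks]
    # Restore the original chain structure.
    out, start = [], 0
    for chain in chains:
        out.append(z[start:start + len(chain)])
        start += len(chain)
    return out


# Two short chains with one tie and one extreme draw.
chains = [[0.1, 2.5, 2.5], [-1.0, 1e6, 0.3]]
z = rank_normalize(chains)
```

In this sketch the extreme draw `1e6` is mapped to a moderate normal score because only its rank matters, which is why the rank-normalized values have finite mean and variance even when the original draws are heavy-tailed.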
+
+### Convergence is global {-}
 A question that often arises is whether it is acceptable to monitor
 convergence of only a subset of the parameters or generated
 quantities.  The short answer is "no," but this is elaborated