Commit 17d4097

Merge pull request #795 from stan-dev/doc-fixes
Doc fixes
2 parents 70cf673 + 285133d commit 17d4097

6 files changed: +57 -36 lines changed

src/bibtex/all.bib

Lines changed: 20 additions & 0 deletions
@@ -1825,3 +1825,23 @@ @article{Riutort-Mayol:2023:HSGP
   pages={17},
   year={2023}
 }
+
+@article{Vehtari+etal:2021:Rhat,
+  title={Rank-normalization, folding, and localization: An improved $\widehat{R}$ for assessing convergence of {MCMC}},
+  author={Vehtari, Aki and Gelman, Andrew and Simpson, Daniel and Carpenter, Bob and B{\"u}rkner, Paul-Christian},
+  journal={Bayesian Analysis},
+  year=2021,
+  volume=16,
+  pages={667--718}
+}
+
+@article{Timonen+etal:2023:ODE-PSIS,
+  title={An importance sampling approach for reliable and efficient inference in {Bayesian} ordinary differential equation models},
+  author={Timonen, Juho and Siccha, Nikolas and Bales, Ben and L{\"a}hdesm{\"a}ki, Harri and Vehtari, Aki},
+  journal={Stat},
+  year={2023},
+  volume = 12,
+  number = 1,
+  pages = {e614}
+}
+

src/reference-manual/analysis.qmd

Lines changed: 13 additions & 14 deletions
@@ -298,7 +298,7 @@ and can apply the standard tests.

 The second technical difficulty posed by MCMC methods is that the
 samples will typically be autocorrelated (or anticorrelated) within a
-chain. This increases the uncertainty of the estimation of posterior
+chain. This increases (or reduces) the uncertainty of the estimation of posterior
 quantities of interest, such as means, variances, or quantiles; see
 @Geyer:2011.

@@ -309,19 +309,19 @@ central limit theorem (CLT).

 Unlike most packages, the particular calculations used by Stan follow
 those for split-$\hat{R}$, which involve both cross-chain (mean) and
-within-chain calculations (autocorrelation); see @GelmanEtAl:2013.
+within-chain calculations (autocorrelation); see @GelmanEtAl:2013 and
+@Vehtari+etal:2021:Rhat.


 ### Definition of effective sample size {-}

 The amount by which autocorrelation within the chains increases
 uncertainty in estimates can be measured by effective sample size (ESS).
-Given independent samples, the central limit theorem
-bounds uncertainty in estimates based on the number of samples $N$.
-Given dependent samples, the number of independent samples is replaced
-with the effective sample size $N_{\mathrm{eff}}$, which is
-the number of independent samples with the same estimation power as
-the $N$ autocorrelated samples. For example, estimation error is
+Given independent sample (with finite variance), the central limit theorem
+bounds uncertainty in estimates based on the sample size $N$.
+Given dependent sample, the sample size is replaced
+with the effective sample size $N_{\mathrm{eff}}$.
+For example, Monte Carlo standard error (MCSE) is
 proportional to $1 / \sqrt{N_{\mathrm{eff}}}$ rather than
 $1/\sqrt{N}$.

@@ -364,16 +364,15 @@ $$


 For independent draws, the effective sample size is just the number of
-iterations. For correlated draws, the effective sample size will be
-lower than the number of iterations. For anticorrelated draws, the
+iterations. For correlated draws, the effective sample size is usually
+lower than the number of iterations, but in case of anticorrelated draws, the
 effective sample size can be larger than the number of iterations. In
 this latter case, MCMC can work better than independent sampling for
 some estimation problems. Hamiltonian Monte Carlo, including the
 no-U-turn sampler used by default in Stan, can produce anticorrelated
 draws if the posterior is close to Gaussian with little posterior
 correlation.

-
 ### Estimation of effective sample size {-}

 In practice, the probability function in question cannot be tractably
@@ -493,8 +492,8 @@ second approach with thinning can produce a higher effective sample
 size when the draws are positively correlated. That's because the
 autocorrelation $\rho_t$ for the thinned sequence is equivalent to
 $\rho_{10t}$ in the unthinned sequence, so the sum of the
-autocorrelations will be lower and thus the effective sample size
-higher.
+autocorrelations usually will be lower and thus the effective sample size
+higher.

 Now contrast the second approach above with the unthinned alternative,

@@ -506,4 +505,4 @@ large. To summarize, *the only reason to thin a sample is to reduce
 memory requirements*.

 If draws are anticorrelated, then thinning will increase correlation
-and reduce the overall effective sample size.
+and further reduce the overall effective sample size.
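
The effective sample size statements in the `analysis.qmd` hunks above can be checked numerically with a small sketch. It assumes the standard definition $N_{\mathrm{eff}} = N / (1 + 2\sum_{t \ge 1} \rho_t)$ and, purely for illustration, a stationary AR(1) chain with $\rho_t = \phi^t$. The function names and the AR(1) assumption are hypothetical and do not come from the commit or from Stan.

```python
def ess_ar1(n_draws: int, phi: float) -> float:
    """Effective sample size of a stationary AR(1) chain (illustrative assumption).

    With rho_t = phi**t, the sum over t >= 1 is phi / (1 - phi), so
    N_eff = N / (1 + 2 * phi / (1 - phi)) = N * (1 - phi) / (1 + phi).
    """
    if not -1.0 < phi < 1.0:
        raise ValueError("phi must lie strictly between -1 and 1")
    return n_draws * (1.0 - phi) / (1.0 + phi)


def ess_ar1_thinned(n_draws: int, phi: float, k: int) -> float:
    """ESS after keeping every k-th draw.

    Thinning leaves n_draws // k draws whose lag-t autocorrelation is
    rho_{k t} = (phi**k)**t, i.e. another AR(1) chain with coefficient phi**k.
    """
    return ess_ar1(n_draws // k, phi ** k)


n = 1000
independent = ess_ar1(n, 0.0)      # equals the number of draws
correlated = ess_ar1(n, 0.5)       # smaller than the number of draws
anticorrelated = ess_ar1(n, -0.5)  # larger than the number of draws
thinned_anticorrelated = ess_ar1_thinned(n, -0.5, 10)  # thinning discards the gain
```

Under these assumptions with $\phi = 0.9$, thinning 10,000 draws down to 1,000 gives an effective sample size of about 483, compared with about 53 for 1,000 unthinned draws and about 526 for all 10,000 draws. That ordering is consistent with the text's point that thinning only saves memory.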

src/reference-manual/types.qmd

Lines changed: 4 additions & 4 deletions
@@ -837,10 +837,10 @@ definite. Like correlation matrices, covariance matrices only need a
 single dimension in their declaration. For instance,

 ```stan
-cov_matrix[K] Omega;
+cov_matrix[K] Sigma;
 ```

-declares `Omega` to be a $K \times K$ covariance matrix, where
+declares `Sigma` to be a $K \times K$ covariance matrix, where
 $K$ is the value of the data variable `K`.


@@ -853,10 +853,10 @@ Because correlation matrices are square, only one dimension needs
 to be declared. For example,

 ```stan
-corr_matrix[3] Sigma;
+corr_matrix[3] Omega;
 ```

-declares `Sigma` to be a $3 \times 3$ correlation matrix.
+declares `Omega` to be a $3 \times 3$ correlation matrix.

 Correlation matrices may be assigned to other matrices, including
 unconstrained matrices, if their dimensions match, and vice-versa.

src/stan-users-guide/algebraic-equations.qmd

Lines changed: 17 additions & 15 deletions
@@ -84,7 +84,7 @@ vector for parameters if the system does not involve data or parameters.
 Let's suppose $\theta = (3, 6)$. To call the algebraic solver, we need to
 provide an initial guess. This varies on a case-by-case basis, but in general
 a good guess will speed up the solver and, in pathological cases, even determine
-whether the solver converges or not. If the solver does not converge, the metropolis
+whether the solver converges or not. If the solver does not converge, the Metropolis
 proposal gets rejected and a warning message, stating no acceptable solution was
 found, is issued.

@@ -107,7 +107,7 @@ transformed parameters {
   vector[2] theta = [3, 6]';
   vector[2] y;

-  y = algebra_solver_newton(system, y_guess, theta, x_r, x_i);
+  y = solve_newton(system, y_guess, theta, x_r, x_i);
 }
 ```

@@ -137,24 +137,26 @@ For instance, it might make "physical sense" for a solution to be positive or ne

 On the other hand, a system may not have a solution (for a given point in the parameter
 space). In that case, the solver will not converge to a solution. When the solver fails to
-do so, the current metropolis proposal gets rejected.
+do so, the current Metropolis proposal gets rejected.

 ## Control parameters for the algebraic solver {#algebra-control.section}

-The call to the algebraic solver shown previously uses the default control settings. The solver
-allows three additional parameters, all of which must be supplied if any of them is
-supplied.
+The call to the algebraic solver shown previously uses the default control settings. The `_tol` variant of the solver function
+allows three additional parameters, all of which must be supplied.

 ```stan
-y = algebra_solver_newton(system, y_guess, theta, x_r, x_i,
-                          rel_tol, f_tol, max_steps);
+y = solve_newton_tol(system, y_guess, theta, x_r, x_i,
+                     scaling_step, f_tol, max_steps);
 ```

-The three control arguments are relative tolerance, function tolerance, and maximum
-number of steps. Both tolerances need to be satisfied. If one of them is not met, the
-metropolis proposal gets rejected with a warning message explaining which criterion
-was not satisfied. The default values for the control arguments are respectively
-`rel_tol = 1e-10` ($10^{-10}$), `f_tol = 1e-6` ($10^{-6}$), and `max_steps = 1e3` ($10^3$).
+For the Newton solver the three control arguments are scaling step, function tolerance, and maximum number of steps. For the Powell's hybrid method the three control arguments are relative tolerance, function tolerance, and maximum number of steps. If a Newton step is smaller than the scaling step tolerance, the code breaks, assuming the solver is no longer making significant progress. If set to 0, this constraint is ignored. For Powell's hybrid method the relative tolerance is the estimated relative error of the solver and serves to test if a satisfactory solution has been found. After convergence of the either solver, the proposed solution
+is plugged into the algebraic system and its norm is compared to the function tolerance. If the norm is below the function tolerance, the solution is deemed acceptable. If the solver solver reaches the maximum number of steps, it stops and returns an error message. If one of the criteria is not met, the
+Metropolis proposal gets rejected with a warning message explaining which criterion
+was not satisfied.
+
+
+The default values for the control arguments are respectively
+`scaling_step = 1e-3` ($10^{-3}$), `rel_tol = 1e-10` ($10^{-10}$), `f_tol = 1e-6` ($10^{-6}$), and `max_steps = 200` ($200$).

 ### Tolerance {-}

@@ -172,12 +174,12 @@ Smaller relative tolerances produce more accurate solutions but require more com
 #### Sensitivity analysis {-}

 The tolerances should be set low enough that setting them lower does not change the
-statistical properties of posterior samples generated by the Stan program.
+statistical properties of posterior samples generated by the Stan program. The sensitivity can be analysed using importance sampling without need to re-run MCMC with different tolerances as shown by @Timonen+etal:2023:ODE-PSIS.

 ### Maximum number of steps {-}

 The maximum number of steps can be used to stop a runaway simulation. This can arise in
 MCMC when a bad jump is taken, particularly during warmup. If the limit is hit, the
-current metropolis proposal gets rejected. Users will see a warning message stating the
+current Metropolis proposal gets rejected. Users will see a warning message stating the
 maximum number of steps has been exceeded.
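
The three Newton control arguments described in the `algebraic-equations.qmd` hunks above (scaling step, function tolerance, maximum number of steps) can be illustrated with a toy scalar iteration. This is only a sketch under stated assumptions: it is not Stan's solver, the function `newton_solve` and its error messages are hypothetical, and the early exit when the residual is already below `f_tol` is an assumption added so that `scaling_step = 0` still terminates. The default values are the ones quoted in the diff.

```python
import math


def newton_solve(f, df, y_guess, scaling_step=1e-3, f_tol=1e-6, max_steps=200):
    """Toy scalar Newton iteration with three stopping rules (not Stan's code)."""
    y = y_guess
    for _ in range(max_steps):
        residual = f(y)
        if abs(residual) <= f_tol:
            return y  # assumption: stop early once the residual is small enough
        step = residual / df(y)
        y -= step
        # A step below the scaling-step tolerance means no significant progress;
        # setting scaling_step to 0 disables this check.
        if scaling_step > 0 and abs(step) < scaling_step:
            break
    else:
        # The loop used every allowed step without stopping.
        raise RuntimeError("maximum number of steps exceeded")
    # Plug the proposed solution back into the system and compare to f_tol.
    if abs(f(y)) > f_tol:
        raise RuntimeError("function tolerance not met")
    return y


# Solve y**2 - 2 = 0 starting from y = 1.
root = newton_solve(lambda y: y * y - 2.0, lambda y: 2.0 * y, 1.0)
```

In this sketch a failed solve raises an exception. In Stan, per the text above, the corresponding outcome is a rejected Metropolis proposal with a warning message.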

src/stan-users-guide/decision-analysis.qmd

Lines changed: 2 additions & 2 deletions
@@ -186,8 +186,8 @@ model {
 generated quantities {
   array[4] real util;
   for (k in 1:4) {
-    util[k] = U(lognormal_rng(mu[k], sigma[k]),
-                lognormal_rng(nu[k], tau[k]));
+    util[k] = U(lognormal_rng(nu[k], tau[k]),
+                lognormal_rng(mu[k], sigma[k]));
   }
 }
 ```

src/stan-users-guide/gaussian-processes.qmd

Lines changed: 1 addition & 1 deletion
@@ -555,7 +555,7 @@ covariance matrix called `L_cov_exp_quad_ARD`.

 ```stan
 functions {
-  matrix L_cov_exp_quad_ARD(vector[] x,
+  matrix L_cov_exp_quad_ARD(array[] vector x,
                             real alpha,
                             vector rho,
                             real delta) {