
Commit 90255c6

Merge branch 'master' of https://github.com/stan-dev/docs
2 parents e8a4c0f + 2467ab9

File tree

3 files changed: +104 −5 lines changed

src/functions-reference/functions_index.qmd

Lines changed: 5 additions & 0 deletions

@@ -1038,6 +1038,11 @@ pagetitle: Alphabetical Index
 - [`(T1 x, T2 y) : R`](real-valued_basic_functions.qmd#index-entry-d534a0845142902e1168c6b016843f301f8826d6)
 
 
+**fatal_error**:
+
+- [`(T1 x1,..., TN xN) : void`](void_functions.qmd#index-entry-36f600d7cd1daea1b94b57949bed5e7914fd9442)
+
+
 **fdim**:
 
 - [`(real x, real y) : real`](real-valued_basic_functions.qmd#index-entry-8e4c91cd9725a9e73f23c330bf345dbc813f4a44)

src/functions-reference/higher-order_functions.qmd

Lines changed: 2 additions & 2 deletions

@@ -299,7 +299,7 @@ ODE solve function call.
 
 The arguments to the ODE solvers in both the stiff and non-stiff solvers are the
 same. The arguments to the adjoint ODE solver are different; see
-[Arguments to the adjoint ODE solvers](#adjoint-sensitivity-solver).
+[Arguments to the adjoint ODE solver](#adjoint-sensitivity-solver).
 
 * *`ode`*: ODE system function,

@@ -333,7 +333,7 @@ or functions of parameters or transformed parameters.
 
 The arguments to the adjoint ODE solver are different from those for
 the other functions (for those see
-[Arguments to the adjoint ODE solvers](#forward-sensitivity-solver)).
+[Arguments to the ODE solvers](#forward-sensitivity-solver)).
 
 * *`ode`*: ODE system function,

src/stan-users-guide/efficiency-tuning.qmd

Lines changed: 97 additions & 3 deletions

@@ -1199,8 +1199,14 @@ should be faster.
 
 In some cases, models can be recoded to exploit sufficient statistics
 in estimation. This can lead to large efficiency gains compared to an
-expanded model. For example, consider the following Bernoulli
-sampling model.
+expanded model. This section provides examples for Bernoulli
+and normal distributions, but the same approach can be applied to
+other members of the exponential family.
+
+
+### Bernoulli sufficient statistics {-}
+
+Consider the following Bernoulli sampling model.
 
 ```stan
 data {
@@ -1257,6 +1263,94 @@ the PMF and simply amount to an alternative, more efficient coding of
12571263
the same likelihood. For efficiency, the frequencies `f[k]`
12581264
should be counted once in the transformed data block and stored.
12591265

1266+
The same trick works for combining multiple binomial observations.
1267+
1268+
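(Editor's aside, not part of the commit: the Bernoulli recoding the section describes can be sanity-checked numerically. The sketch below, in Python rather than Stan, shows that the success count is sufficient: summing per-observation Bernoulli log-PMFs gives exactly the aggregated form.)

```python
import math

def bernoulli_loglik(y, theta):
    # Sum of per-observation Bernoulli log-PMFs (the "expanded" model).
    return sum(math.log(theta if yi == 1 else 1 - theta) for yi in y)

def sufficient_loglik(y, theta):
    # Same likelihood via the sufficient statistic: the success count.
    s = sum(y)
    return s * math.log(theta) + (len(y) - s) * math.log(1 - theta)

y = [1, 0, 0, 1, 1, 0, 1]
for theta in (0.2, 0.5, 0.9):
    assert abs(bernoulli_loglik(y, theta) - sufficient_loglik(y, theta)) < 1e-12
```

Here the two forms agree exactly (no dropped constant), because the product of Bernoulli PMFs is already `theta^s * (1 - theta)^(N - s)`.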
+### Normal sufficient statistics {-}
+
+Consider the following Stan model for fitting a normal distribution to data.
+
+```stan
+data {
+  int N;
+  vector[N] y;
+}
+parameters {
+  real mu;
+  real<lower=0> sigma;
+}
+model {
+  y ~ normal(mu, sigma);
+}
+```
+
+With the vectorized form used for `y`, Stan is clever enough to only
+evaluate `log(sigma)` once, but it still has to evaluate the normal
+for all of `y[1]` to `y[N]`, which involves adding up all the squared
+differences from the mean and then dividing by `sigma` squared.
+
+An equivalent density to the one above (up to normalizing constants
+that do not depend on parameters) is given in the following Stan
+program.
+
+```stan
+data {
+  int N;
+  vector[N] y;
+}
+transformed data {
+  real mean_y = mean(y);
+  real<lower=0> var_y = variance(y);
+  real nm1_over2 = 0.5 * (N - 1);
+  real sqrt_N = sqrt(N);
+}
+parameters {
+  real mu;
+  real<lower=0> sigma;
+}
+model {
+  mean_y ~ normal(mu, sigma / sqrt_N);
+  var_y ~ gamma(nm1_over2, nm1_over2 / sigma^2);
+}
+```
+
+The data and parameters are the same in this program as in the first.
+The second version adds a transformed data block to compute the mean
+and variance of the data, which are the sufficient statistics here.
+These are stored along with two other useful constants. Then the
+program can define distributions over the mean and variance, both of
+which are scalars here.
+
+The original Stan program and this one define the same model in the
+sense that they define the same log density up to a constant additive
+term that does not depend on the parameters. The priors on `mu` and
+`sigma` are both improper, but proper priors could be added as
+additional statements in the model block without affecting the
+sufficiency.
+
+This transform explicitly relies on aggregating the data. Using this
+trick on parameters leads to more computation than just computing the
+normal log density, even before accounting for the non-linear change
+of variables in the variance.
+
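(Editor's aside, not part of the commit: the claimed equivalence — normal on the sample mean plus gamma on the sample variance — can be checked numerically. The Python sketch below uses hand-rolled normal and gamma(shape, rate) log densities as stand-ins for Stan's and verifies that the difference between the full log likelihood and the aggregated form is constant in `mu` and `sigma`.)

```python
import math

def normal_lpdf(x, mu, sigma):
    # Log density of normal(mu, sigma) at x.
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * ((x - mu) / sigma) ** 2

def gamma_lpdf(x, alpha, beta):
    # Log density of gamma with shape alpha and rate beta (Stan's convention).
    return alpha * math.log(beta) - math.lgamma(alpha) + (alpha - 1) * math.log(x) - beta * x

y = [1.2, -0.7, 3.1, 0.4, 2.2, -1.5]
N = len(y)
mean_y = sum(y) / N
var_y = sum((yi - mean_y) ** 2 for yi in y) / (N - 1)  # sample variance, as Stan's variance()

def full(mu, sigma):
    # Log likelihood of the expanded model: one normal term per observation.
    return sum(normal_lpdf(yi, mu, sigma) for yi in y)

def suff(mu, sigma):
    # Log likelihood via sufficient statistics, mirroring the second Stan program.
    a = 0.5 * (N - 1)
    return normal_lpdf(mean_y, mu, sigma / math.sqrt(N)) + gamma_lpdf(var_y, a, a / sigma ** 2)

# The gap between the two must not depend on the parameters.
d0 = full(0.5, 1.0) - suff(0.5, 1.0)
for mu, sigma in [(0.0, 0.5), (2.0, 3.0), (-1.0, 1.7)]:
    assert abs((full(mu, sigma) - suff(mu, sigma)) - d0) < 1e-9
```

The constant gap `d0` is exactly the normalizing terms that the section says may be dropped.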
+### Poisson sufficient statistics {-}
+
+The Poisson distribution is the easiest case, because the sum of
+observations is sufficient. Specifically, we can replace
+
+```stan
+y ~ poisson(lambda);
+```
+
+with
+
+```stan
+sum(y) ~ poisson(size(y) * lambda);
+```
+
+This will work even if `y` is a parameter vector because no Jacobian
+adjustment is required for summation.
+
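(Editor's aside, not part of the commit: the Poisson replacement can be checked the same way. In Python, the log PMF of `sum(y)` under `poisson(N * lambda)` matches the sum of the individual Poisson log PMFs up to a term that does not involve `lambda`.)

```python
import math

def poisson_lpmf(k, lam):
    # Log PMF of Poisson(lam) at integer k.
    return k * math.log(lam) - lam - math.lgamma(k + 1)

y = [3, 0, 2, 5, 1]
N, S = len(y), sum(y)

def full(lam):
    # One Poisson term per observation.
    return sum(poisson_lpmf(yi, lam) for yi in y)

def suff(lam):
    # Aggregated form from the section: sum(y) ~ poisson(size(y) * lambda).
    return poisson_lpmf(S, N * lam)

# The difference must be constant in lambda.
d0 = full(1.0) - suff(1.0)
for lam in (0.3, 2.0, 7.5):
    assert abs((full(lam) - suff(lam)) - d0) < 1e-9
```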
 
 ## Aggregating common subexpressions
 
@@ -1306,7 +1400,7 @@ unit sample variance has the following potential benefits:
 * It aids in the interpretation and comparison of the importance of coefficients across different predictors.
 
 When there are large differences between the units and scales of the predictors,
-standardizating the predictors is especially useful.
+standardizing the predictors is especially useful.
 This section illustrates the principle for a simple linear regression.
 
 Suppose that $y = (y_1,\dotsc,y_N)$ is a vector of $N$ outcomes and
