Commit c7d6757

edits based on Bob's feedback
1 parent 6c35141 commit c7d6757

1 file changed

+72
-32
lines changed

src/reference-manual/statements.qmd

Lines changed: 72 additions & 32 deletions
@@ -301,7 +301,7 @@ depend on the parameters. This is convenient because often the
301301
normalizing constant $Z$ is either time-consuming to compute or
302302
intractable to evaluate.
303303

304-
#### Built in distributions {-}
304+
#### Built in distributions {#built-in-distributions}
305305

306306
The built in distribution functions in Stan are all available in normalized
307307
and unnormalized form. The normalized forms include all of the terms in the log
@@ -318,11 +318,12 @@ $$
318318
-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2
319319
$$
320320

321-
The `normal_lupdf` function returns the log density of an unnormalized distribution.
322-
With the unnormalized version of the function, Stan does not define what the
323-
normalization constant will be, though usually as many terms as possible are dropped
324-
to make the calculation fast. Dropping a constant `sigma` term, `normal_lupdf` would
325-
be equivalent to:
321+
The `normal_lupdf` function returns the log density of an unnormalized
322+
distribution. With the unnormalized version of the function, Stan
323+
does not define what the normalization constant will be, though
324+
usually as many terms as possible are dropped to make the calculation
325+
fast. Dropping a constant `sigma` term, `normal_lupdf` would be
326+
equivalent to:
326327

327328
$$
328329
\textsf{normal\_lupdf}(x | \mu, \sigma) =
@@ -376,25 +377,29 @@ y ~ normal(mu, sigma);
376377
mu ~ normal(0, 10);
377378
sigma ~ normal(0, 1);
378379
```
379-
The symbol $\sim$ is called tilde. Due to historical reasons, the distribution statements used to be called "sampling statements" in Stan, but that term is not recommended anymore as it is less accurate description.
380+
The symbol $\sim$ is called tilde. For historical reasons,
381+
distribution statements used to be called "sampling statements" in
382+
Stan, but that term is no longer recommended because it is a less accurate
383+
description.
380384

381-
In general, we can read $\sim$ as "is distributed as," and overall this notation is used as a shorthand for defining distributions as
385+
In general, we can read $\sim$ as "is distributed as," and overall
386+
this notation is used as a shorthand for defining distributions, so
387+
that the above example can also be written as
382388
$$
383389
\begin{aligned}
384390
p(y| \mu, \sigma) & = \mathrm{normal}(y | \mu, \sigma)\\
385391
p(\mu) & = \mathrm{normal}(\mu | 0, 10)\\
386392
p(\sigma) & = \mathrm{normal}^+(\sigma | 0, 1).
387393
\end{aligned}
388394
$$
389-
A collection of distribution statements define an unnormalized joint distribution as the product of component distributions
395+
A collection of distribution statements defines a joint
396+
distribution as the product of component distributions
390397
$$
391-
p(y,\mu,\sigma) \propto p(y| \mu, \sigma )p(\mu) p(\sigma).
398+
p(y,\mu,\sigma) = p(y| \mu, \sigma )p(\mu) p(\sigma).
392399
$$
393-
In general, the product of arbitrary probability density functions is not a normalized probability density function---that is, it will be positive but will not in general integrate to 1---but the proportionality is sufficient for the Stan algorithms.
394400

395-
Stan always constructs the target function---in Bayesian terms, the log posterior density function of the parameter vector---by adding terms in the model block. Equivalently, each $\sim$ statement corresponds to a multiplicative factor in the unnormalized posterior density.
396-
397-
This works even if the model is not constructed generatively. For example, suppose you include the following code in a Stan model:
401+
This works even if the model is not constructed generatively. For
402+
example, suppose you include the following code in a Stan model:
398403
```stan
399404
a ~ normal(0, 1);
400405
a ~ normal(0, 1);
@@ -403,23 +408,53 @@ This is translated to
403408
$$
404409
p(a) = \mathrm{normal}(a | 0, 1)\mathrm{normal}(a | 0, 1),
405410
$$
406-
which in this case is $\mathrm{normal}(a|0,1/\sqrt{2})$. One might expect that the above two lines of code would represent a redundant expression of a $\mathrm{normal}(a|0,1)$ prior, but, no, each line of code corresponds to an additional term in the target, or log posterior, density. You can think of each line as representing an additional piece of information.
407-
408-
Distribution statement `... ~ ...` accepts only distributions on the right side. These distributions can be built in or user defined distributions. The left side of a distribution statement may be data, parameter, or a complex expression, but the evaluated type needs to match one of the allowed type of the right hand side distribution (see more below).
409-
410-
In Stan, a distribution statement is merely a notational convenience following the typical
411-
notation used to present models in the literature. The above
412-
model defined with distribution statements could be expressed as a direct increment on the
413-
total log probability density as
411+
which in this case is $\mathrm{normal}(a|0,1/\sqrt{2})$. One might
412+
expect that the above two lines of code would represent a redundant
413+
expression of a $\mathrm{normal}(a|0,1)$ prior, but, no, each line of
414+
code corresponds to an additional term in the target, or log posterior
415+
density. You can think of each line as representing an additional
416+
piece of information.
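
As a brief check of the claim above, multiplying the two unit normal
densities and collecting the exponent gives the following (a sketch;
note that the product matches $\mathrm{normal}(a|0,1/\sqrt{2})$ only
up to a constant factor, which is all that the target requires):

$$
\begin{aligned}
\mathrm{normal}(a | 0, 1)\,\mathrm{normal}(a | 0, 1)
& = \frac{1}{2\pi} \exp\left(-a^2\right) \\
& = \frac{1}{2\pi} \exp\left(-\frac{1}{2}
    \left(\frac{a}{1/\sqrt{2}}\right)^2\right) \\
& \propto \mathrm{normal}(a | 0, 1/\sqrt{2}).
\end{aligned}
$$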
417+
418+
When the joint distribution is considered as a function of parameters
419+
(e.g. $\mu$, $\sigma$) given fixed data, it is proportional to
420+
the posterior distribution. In general, this function of the parameters is not
421+
a normalized probability density function---that is, it will be
422+
positive but will not in general integrate to 1---but the
423+
proportionality is sufficient for the Stan algorithms.
424+
425+
Stan always constructs the target function---in Bayesian terms, the
426+
log posterior density function of the parameter vector---by adding
427+
terms in the model block. Equivalently, each $\sim$ statement
428+
corresponds to a multiplicative factor in the unnormalized posterior
429+
density.
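
For the example model above, the correspondence between additive
terms in the target and multiplicative factors in the posterior can
be sketched on the log scale as follows, where $\mathrm{const}$
stands for any terms that do not depend on $\mu$ or $\sigma$:

$$
\log p(\mu, \sigma | y) =
\log \mathrm{normal}(y | \mu, \sigma)
+ \log \mathrm{normal}(\mu | 0, 10)
+ \log \mathrm{normal}(\sigma | 0, 1)
+ \mathrm{const}.
$$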
430+
431+
A distribution statement `... ~ ...` accepts only distributions on the
432+
right side. These distributions can be built in or user defined
433+
distributions. The left side of a distribution statement may be data,
434+
parameter, or a complex expression, but the evaluated type needs to
435+
match one of the allowed types of the right-hand side distribution (see
436+
more below).
437+
438+
In Stan, a distribution statement is merely a notational convenience
439+
following the typical notation used to present models in the
440+
literature. The above model defined with distribution statements
441+
could be expressed as a direct increment on the total log probability
442+
density as
414443

415444
```stan
416445
target += normal_lpdf(y | mu, sigma);
417446
target += normal_lpdf(mu | 0, 10);
418447
target += normal_lpdf(sigma | 0, 1);
419448
```
420449

421-
Stan model can mix distribution statements and log probability increment
422-
statements. Although we often prefer to present models as joint distributions, there are several cases due to computational efficiency (e.g. censored data model) or Stan language limitations (e.g. mixture models), that we may want to define the log likelihood or parts of it directly, which is possible with log probability increment statements. See also below discussion about Jacobians.
450+
Stan models can mix distribution statements and log probability
451+
increment statements. Although in the literature statistical models
452+
are usually defined with distributions, there are cases where, for
453+
computational efficiency (e.g. censored data models) or because of coding
454+
language limitations (e.g. mixture models in Stan), we may want to code
455+
the log likelihood or parts of it directly, which is possible with log
456+
probability increment statements. See the discussion below about
457+
Jacobians.
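
As an illustration of mixing the two forms, the following is a
minimal sketch of a two-component normal mixture. The variable names
(`N`, `y`, `theta`, `mu`, `sigma`) and the priors are assumptions
chosen for the example, not taken from the text above. The priors use
distribution statements, while the mixture likelihood is added to the
target directly.

```stan
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real<lower=0, upper=1> theta;  // mixing proportion (hypothetical name)
  ordered[2] mu;                 // ordered component locations
  real<lower=0> sigma;           // shared scale
}
model {
  // priors written as distribution statements
  mu ~ normal(0, 10);
  sigma ~ normal(0, 1);
  // mixture likelihood coded as a direct increment of the target
  for (n in 1:N) {
    target += log_mix(theta,
                      normal_lpdf(y[n] | mu[1], sigma),
                      normal_lpdf(y[n] | mu[2], sigma));
  }
}
```

Here `log_mix` computes the log of the two-component mixture density
from the component log densities, so the likelihood does not have to
be expressed as a single distribution statement.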
423458

424459
In general, a distribution statement of the form
425460

@@ -474,13 +509,16 @@ terms. Therefore, the explicit increment form can be used to recreate
474509
the exact log probability values for the model. Otherwise, the
475510
distribution statement form will be faster if any of the input expressions,
476511
`y`, `mu`, or `sigma`, involve only constants, data
477-
variables, and transformed data variables.
512+
variables, and transformed data variables. See the section
513+
[Built in distributions](#built-in-distributions) above discussing the
514+
`_lupdf` and `_lupmf` functions, which also drop all the constant terms.
478515

479516

480517
### User-transformed variables {-}
481518

482-
The left-hand side of a distribution statement may be a complex
483-
expression. For instance, it is legal syntactically to write
519+
The left-hand side of a distribution statement may be an arbitrary
520+
expression (of compatible type). For instance, it is legal
521+
syntactically to write
484522

485523
```stan
486524
parameters {
@@ -661,7 +699,7 @@ $$
661699

662700
Stan allows probability functions to be truncated. For example, a
663701
truncated unit normal distribution restricted to $[-0.5, 2.1]$
664-
can be presented with the following distribution statement.
702+
can be coded with the following distribution statement.
665703

666704
```stan
667705
y ~ normal(0, 1) T[-0.5, 2.1];
@@ -839,8 +877,8 @@ The equivalent code for a vectorized truncation depends on which of the
839877
variables are non-scalars (arrays, vectors, etc.):
840878

841879
1. If the variate `y` is the only non-scalar, the result is the same as
842-
described in the above sections, but the `lcdf`/`lccdf` calculation is multiplied
843-
by `size(y)`.
880+
described in the above sections, but the `lcdf`/`lccdf` calculation is
881+
multiplied by `size(y)`.
844882

845883
2. If the other arguments to the distribution are non-scalars, then the
846884
vectorized version of the `lcdf`/`lccdf` is used. These functions return the
@@ -973,7 +1011,8 @@ for (y in ys) {
9731011
}
9741012
```
9751013

976-
The order in which elements of `ys` are visited is defined for container types as follows.
1014+
The order in which elements of `ys` are visited is defined for
1015+
container types as follows.
9771016

9781017
* `vector`, `row_vector`: elements visited in order, `y` is of type `double`
9791018

@@ -1442,7 +1481,8 @@ program. They are particularly useful for spotting problematic
14421481
not-a-number or infinite values, both of which will be printed.
14431482

14441483
It is particularly useful to print the value of the target log
1445-
density accumulator (through the `target()` function), as in the following example.
1484+
density accumulator (through the `target()` function), as in the
1485+
following example.
14461486

14471487
```stan
14481488
vector[2] y;
