-The `normal_lupdf` function returns the log density of an unnormalized distribution.
-With the unnormalized version of the function, Stan does not define what the
-normalization constant will be, though usually as many terms as possible are dropped
-to make the calculation fast. Dropping a constant `sigma` term, `normal_lupdf` would
-be equivalent to:
+The `normal_lupdf` function returns the log density of an unnormalized
+distribution. With the unnormalized version of the function, Stan
+does not define what the normalization constant will be, though
+usually as many terms as possible are dropped to make the calculation
+fast. Dropping a constant `sigma` term, `normal_lupdf` would be
+equivalent to:
 
 $$
 \textsf{normal\_lupdf}(x | \mu, \sigma) =
@@ -376,25 +377,29 @@ y ~ normal(mu, sigma);
 mu ~ normal(0, 10);
 sigma ~ normal(0, 1);
 ```
-The symbol $\sim$ is called tilde. Due to historical reasons, the distribution statements used to be called "sampling statements" in Stan, but that term is not recommended anymore as it is less accurate description.
+The symbol $\sim$ is called tilde. For historical reasons, the
+distribution statements used to be called "sampling statements" in
+Stan, but that term is no longer recommended because it is a less
+accurate description.
 
-In general, we can read $\sim$ as "is distributed as," and overall this notation is used as a shorthand for defining distributions as
+In general, we can read $\sim$ as "is distributed as," and overall
+this notation is used as a shorthand for defining distributions, so
-In general, the product of arbitrary probability density functions is not a normalized probability density function---that is, it will be positive but will not in general integrate to 1---but the proportionality is sufficient for the Stan algorithms.
 
-Stan always constructs the target function---in Bayesian terms, the log posterior density function of the parameter vector---by adding terms in the model block. Equivalently, each $\sim$ statement corresponds to a multiplicative factor in the unnormalized posterior density.
-
-This works even if the model is not constructed generatively. For example, suppose you include the following code in a Stan model:
+This works even if the model is not constructed generatively. For
+example, suppose you include the following code in a Stan model:
-which in this case is $\mathrm{normal}(a|0,1/\sqrt{2})$. One might expect that the above two lines of code would represent a redundant expression of a $\mathrm{normal}(a|0,1)$ prior, but, no, each line of code corresponds to an additional term in the target, or log posterior, density. You can think of each line as representing an additional piece of information.
-
-Distribution statement `... ~ ...` accepts only distributions on the right side. These distributions can be built in or user defined distributions. The left side of a distribution statement may be data, parameter, or a complex expression, but the evaluated type needs to match one of the allowed type of the right hand side distribution (see more below).
-
-In Stan, a distribution statement is merely a notational convenience following the typical
-notation used to present models in the literature. The above
-model defined with distribution statements could be expressed as a direct increment on the
-total log probability density as
+which in this case is $\mathrm{normal}(a|0,1/\sqrt{2})$. One might
+expect that the above two lines of code would represent a redundant
+expression of a $\mathrm{normal}(a|0,1)$ prior, but, no, each line of
+code corresponds to an additional term in the target, or log posterior
+density. You can think of each line as representing an additional
+piece of information.
+
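As a quick check of the $1/\sqrt{2}$ scale quoted in the paragraph that ends here, the following is a short derivation rather than part of the change itself. It assumes the two statements elided from this excerpt are both `a ~ normal(0, 1)`, which is what the surrounding text implies; the symbol $p(a)$ for the resulting target density is introduced only for this sketch.

$$
p(a) \propto \mathrm{normal}(a | 0, 1) \cdot \mathrm{normal}(a | 0, 1)
= \frac{1}{2\pi} \exp\left(-a^2\right)
\propto \exp\left(-\frac{a^2}{2 \, (1/\sqrt{2})^2}\right)
\propto \mathrm{normal}(a | 0, 1/\sqrt{2}).
$$

On the log scale, each statement contributes a term $-a^2/2$ plus a constant, and the two terms sum to $-a^2$, which is the log kernel of a normal density with scale $1/\sqrt{2}$.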
+When the joint distribution is considered as a function of parameters
+(e.g. $\mu$, $\sigma$) given fixed data, it is proportional to the
+posterior distribution. In general, this function of the parameters is
+not a normalized probability density function---that is, it will be
+positive but will not in general integrate to 1---but the
+proportionality is sufficient for the Stan algorithms.
+
+Stan always constructs the target function---in Bayesian terms, the
+log posterior density function of the parameter vector---by adding
+terms in the model block. Equivalently, each $\sim$ statement
+corresponds to a multiplicative factor in the unnormalized posterior
+density.
+
+A distribution statement `... ~ ...` accepts only distributions on the
+right side. These distributions can be built-in or user-defined
+distributions. The left side of a distribution statement may be data,
+a parameter, or a complex expression, but the evaluated type needs to
+match one of the allowed types of the right-hand side distribution (see
+more below).
+
+In Stan, a distribution statement is merely a notational convenience
+following the typical notation used to present models in the
+literature. The above model defined with distribution statements
+could be expressed as a direct increment on the total log probability
+density as
 
 ```stan
 target += normal_lpdf(y | mu, sigma);
 target += normal_lpdf(mu | 0, 10);
 target += normal_lpdf(sigma | 0, 1);
 ```
 
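Written out, the three increments in the block just shown accumulate the following log density. This is only a sketch in the document's notation: $\textrm{target}$ simply names the accumulated value, $y$ is treated as a single scalar for simplicity, and any Jacobian adjustments Stan adds for constrained parameters are left out.

$$
\textrm{target} = \log \mathrm{normal}(y | \mu, \sigma)
+ \log \mathrm{normal}(\mu | 0, 10)
+ \log \mathrm{normal}(\sigma | 0, 1).
$$

Exponentiating gives the product of the three densities, which is the multiplicative reading of the corresponding $\sim$ statements.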
-Stan model can mix distribution statements and log probability increment
-statements. Although we often prefer to present models as joint distributions, there are several cases due to computational efficiency (e.g. censored data model) or Stan language limitations (e.g. mixture models), that we may want to define the log likelihood or parts of it directly, which is possible with log probability increment statements. See also below discussion about Jacobians.
+Stan models can mix distribution statements and log probability
+increment statements. Although in the literature statistical models
+are usually defined with distributions, there are several cases, due to
+computational efficiency (e.g. censored data models) or coding language
+limitations (e.g. mixture models in Stan), in which we may want to code
+the log likelihood or parts of it directly, which is possible with log
+probability increment statements. See the discussion below about
+Jacobians.
 
 In general, a distribution statement of the form
 
@@ -474,13 +509,16 @@ terms. Therefore, the explicit increment form can be used to recreate
 the exact log probability values for the model. Otherwise, the
 distribution statement form will be faster if any of the input expressions,
 `y`, `mu`, or `sigma`, involve only constants, data
-variables, and transformed data variables.
+variables, and transformed data variables. See the section
+[Built in distributions](#built-in-distributions) above discussing
+`_lupdf` and `_lupmf` functions, which also drop all the constant terms.
 
 
 ### User-transformed variables {-}
 
-The left-hand side of a distribution statement may be a complex
-expression. For instance, it is legal syntactically to write
+The left-hand side of a distribution statement may be an arbitrary
+expression (of compatible type). For instance, it is legal
+syntactically to write
 
 ```stan
 parameters {
@@ -661,7 +699,7 @@ $$
 
 Stan allows probability functions to be truncated. For example, a
 truncated unit normal distribution restricted to $[-0.5, 2.1]$
-can be presented with the following distribution statement.
+can be coded with the following distribution statement.
 
 ```stan
 y ~ normal(0, 1) T[-0.5, 2.1];
@@ -839,8 +877,8 @@ The equivalent code for a vectorized truncation depends on which of the
 variables are non-scalars (arrays, vectors, etc.):
 
 1. If the variate `y` is the only non-scalar, the result is the same as
-described in the above sections, but the `lcdf`/`lccdf` calculation is multiplied
-by `size(y)`.
+described in the above sections, but the `lcdf`/`lccdf` calculation is
+multiplied by `size(y)`.
 
 2. If the other arguments to the distribution are non-scalars, then the
 vectorized version of the `lcdf`/`lccdf` is used. These functions return the
@@ -973,7 +1011,8 @@ for (y in ys) {
 }
 ```
 
-The order in which elements of `ys` are visited is defined for container types as follows.
+The order in which elements of `ys` are visited is defined for
+container types as follows.
 
 * `vector`, `row_vector`: elements visited in order, `y` is of type `double`
 
@@ -1442,7 +1481,8 @@ program. They are particularly useful for spotting problematic
 not-a-number or infinite values, both of which will be printed.
 
 It is particularly useful to print the value of the target log
-density accumulator (through the `target()` function), as in the following example.
+density accumulator (through the `target()` function), as in the