Skip to content

Commit a67ed63

Browse files
committed
start row/col stochastic matrix documentation
1 parent 5ae71eb commit a67ed63

File tree

2 files changed

+181
-0
lines changed

2 files changed

+181
-0
lines changed

src/reference-manual/transforms.qmd

Lines changed: 101 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -672,6 +672,107 @@ z_k
672672
.
673673
$$
674674

675+
## Column Stochastic Matrix {#column-stochastic-matrix-transform.section}
676+
677+
The `column_stochastic_matrix[N, M]` type in Stan represents an \(N \times M\)
678+
matrix where each column is a unit simplex of dimension \(N\). In other words,
679+
each column of the matrix is a vector constrained to have non-negative entries
680+
that sum to one.
681+
682+
### Definition of a Column Stochastic Matrix {-}
683+
684+
A column stochastic matrix \(X \in \mathbb{R}^{N \times M}\) is defined such
685+
that for each column \(j\) (where \(1 \leq j \leq M\)):
686+
687+
$$
688+
X_{ij} \geq 0 \quad \text{for } 1 \leq i \leq N,
689+
$$
690+
691+
and
692+
693+
$$
694+
\sum_{i=1}^N X_{ij} = 1.
695+
$$
696+
697+
This definition ensures that each column of the matrix \(X\) lies on the
698+
\(N-1\) dimensional unit simplex, similar to the `simplex[N]` type, but
699+
extended across multiple columns.
700+
701+
### Inverse Transform for Column Stochastic Matrix {-}
702+
703+
The inverse transform for the `column_stochastic_matrix` type is an extension
704+
of the unit simplex inverse transform to multiple columns. The process can be
705+
understood by applying the stick-breaking metaphor independently to each column.
706+
707+
For each column \(j\) of the matrix \(X\), an unconstrained vector \(y_j \in
708+
\mathbb{R}^{N-1}\) is mapped to the column \(X_{\cdot j}\) on the unit simplex
709+
using the following steps:
710+
711+
1. Begin with a stick of unit length.
712+
2. Break off a piece corresponding to each element \(X_{ij}\) of the column
713+
vector, where the size of each piece is determined by an intermediate value
714+
\(z_{ij}\), which is itself derived from the unconstrained
715+
parameter \(y_{ij}\).
716+
3. The intermediate vector \(z_j \in \mathbb{R}^{N-1}\) is
717+
defined elementwise for each \(j\) and for \(1 \leq i < N\) by
718+
719+
$$
720+
z_{ij} = \mathrm{logit}^{-1}\left( y_{ij} + \log\left(\frac{1}{N - i}\right) \right).
721+
$$
722+
723+
4. The stick sizes \(X_{ij}\) for \(1 \leq i < N\) are then calculated recursively by
724+
725+
$$
726+
X_{ij} = \left( 1 - \sum_{i'=1}^{i-1} X_{i'j} \right) z_{ij}.
727+
$$
728+
729+
5. The last element of each column \(X_{Nj}\) is set to the length of the remaining piece of the stick:
730+
731+
$$
732+
X_{Nj} = 1 - \sum_{i=1}^{N-1} X_{ij}.
733+
$$
734+
735+
### Absolute Jacobian Determinant for the Inverse Transform {-}
736+
737+
The Jacobian determinant of the inverse transform for each column \(j\) in
738+
the matrix is given by the product of the diagonal entries \(J_{i,i,j}\) of
739+
the lower-triangular Jacobian matrix. This determinant is calculated as:
740+
741+
$$
742+
\left| \det J_j \right| = \prod_{i=1}^{N-1} \left( z_{ij} (1 - z_{ij}) \left( 1 - \sum_{i'=1}^{i-1} X_{i'j} \right) \right).
743+
$$
744+
745+
Thus, the overall Jacobian determinant for the entire `column_stochastic_matrix`
746+
is the product of the determinants for each column:
747+
748+
$$
749+
\left| \det J \right| = \prod_{j=1}^{M} \left| \det J_j \right|.
750+
$$
751+
752+
### Transform for Column Stochastic Matrix {-}
753+
754+
To transform from a column stochastic matrix \(X\) back to the unconstrained space \(y_j\)
755+
for each column \(j\):
756+
757+
1. The break proportions \(z_{ij}\) are first determined by
758+
759+
$$
760+
z_{ij} = \frac{X_{ij}}{1 - \sum_{i'=1}^{i-1} X_{i'j}}.
761+
$$
762+
763+
2. The corresponding unconstrained parameters \(y_{ij}\) are then computed as
764+
765+
$$
766+
y_{ij} = \mathrm{logit}(z_{ij}) - \log\left(\frac{1}{N - i}\right).
767+
$$
768+
769+
By applying this process to each column \(j\) independently, the entire column
770+
stochastic matrix \(X\) can be transformed to or from the unconstrained space.
771+
772+
This formulation allows the `column_stochastic_matrix[N, M]` type to be used
773+
effectively in models requiring columns of a matrix to be unit simplexes,
774+
such as in multi-category probability models or compositional data analysis.
775+
675776

676777
## Unit vector {#unit-vector.section}
677778

src/reference-manual/types.qmd

Lines changed: 80 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -673,6 +673,86 @@ iterations, and in either case, with less dispersed parameter
673673
initialization or custom initialization if there are informative
674674
priors for some parameters.
675675

676+
### Stochastic Matrices {-}
677+
678+
A stochastic matrix is a matrix where each column, row, or both is a
679+
unit simplex, meaning that each column(row) vector has non-negative
680+
values that sum to 1. For example, a \(3 \times 4\)
681+
column stochastic matrix will look like:
682+
683+
$$
684+
\begin{bmatrix}
685+
0.2 & 0.5 & 0.1 & 0.3 \\
686+
0.3 & 0.3 & 0.6 & 0.4 \\
687+
0.5 & 0.2 & 0.3 & 0.3
688+
\end{bmatrix}
689+
$$
690+
691+
While a row stochastic matrix will look like:
692+
693+
$$
694+
\begin{bmatrix}
695+
0.2 & 0.5 & 0.1 & 0.2 \\
696+
0.2 & 0.1 & 0.6 & 0.1 \\
697+
0.5 & 0.2 & 0.2 & 0.1
698+
\end{bmatrix}
699+
$$
700+
701+
702+
In this example, each column(row) sums to 1, making the matrix a
703+
valid `column_stochastic_matrix` and `row_stochastic_matrix`.
704+
705+
Column stochastic matrices are often used in models where
706+
each column represents a probability distribution across a
707+
set of categories, such as in multiple multinomial distributions,
708+
transition matrices in Markov models, or compositional data analysis.
709+
They can also be used in situations where multiple Dirichlet-distributed v
710+
ariables are required across different dimensions.
711+
712+
The `column_stochastic_matrix` and `row_stochastic_matrix` types are declared
713+
with full dimensionality. For instance, a matrix `theta` with
714+
3 rows and 4 columns, where each
715+
column is a 3-simplex, is declared as:
716+
717+
```stan
718+
column_stochastic_matrix[3, 4] theta;
719+
```
720+
721+
A matrix `theta` with
722+
3 rows and 4 columns, where each
723+
row is a 4-simplex, is declared as:
724+
725+
```stan
726+
row_stochastic_matrix[3, 4] theta;
727+
```
728+
729+
The `column_stochastic_matrix` type is implemented as a matrix where each
730+
column is individually constrained to be a simplex. This means that each
731+
column must be a valid simplex with non-negative elements that sum to 1.
732+
733+
As with simplexes, `column_stochastic_matrix` variables are subject to
734+
validation, ensuring that each column satisfies the simplex constraints.
735+
This validation accounts for floating-point imprecision, with checks
736+
performed up to a statically specified accuracy threshold \(\epsilon\).
737+
738+
#### Stability Considerations {-}
739+
740+
In high-dimensional settings or when the matrix has many columns,
741+
`column_stochastic_matrix` types may require careful tuning of the inference
742+
algorithms. To ensure stability:
743+
744+
- **Smaller Step Sizes:** In samplers like Hamiltonian Monte Carlo (HMC),
745+
smaller step sizes can help maintain stability, especially in high dimensions.
746+
- **Higher Target Acceptance Rates:** Setting higher target acceptance
747+
rates can improve the robustness of the sampling process.
748+
- **Longer Warmup Periods:** Increasing the warmup period allows the sampler
749+
to better explore the parameter space before the actual sampling begins.
750+
- **Tighter Optimization Tolerances:** For optimization-based inference,
751+
tighter tolerances with more iterations can yield more accurate results.
752+
- **Custom Initialization:** If prior information about the parameters is
753+
available, custom initialization or less dispersed initialization can lead
754+
to more efficient inference.
755+
676756
### Unit vectors {-}
677757

678758
A unit vector is a vector with a norm of one. For instance, $[0.5,

0 commit comments

Comments
 (0)