Commit 1782f1a

Merge pull request #807 from stan-dev/feature/row-stochastic-matrix
row/col stochastic matrix documentation
2 parents 74557ef + f49798c commit 1782f1a

2 files changed: +143 -0 lines changed

src/reference-manual/transforms.qmd

Lines changed: 66 additions & 0 deletions
@@ -754,6 +754,72 @@ z_k
.
$$

## Stochastic Matrix {#stochastic-matrix-transform.section}

The `column_stochastic_matrix[N, M]` and `row_stochastic_matrix[N, M]` types in
Stan represent an \(N \times M\) matrix where each column (row) is a unit simplex
of dimension \(N\) (\(M\)). In other words, each column (row) of the matrix is a vector
constrained to have non-negative entries that sum to one.

### Definition of a Stochastic Matrix {-}

A column stochastic matrix \(X \in \mathbb{R}^{N \times M}\) is defined such
that each column is a simplex. For column \(m\) (where \(1 \leq m \leq M\)):

$$
X_{n, m} \geq 0 \quad \text{for } 1 \leq n \leq N,
$$

and

$$
\sum_{n=1}^N X_{n, m} = 1.
$$

A row stochastic matrix is any matrix whose transpose is a column stochastic matrix
(i.e., the rows of the matrix are simplexes). For row \(n\) (where \(1 \leq n \leq N\)):

$$
X_{n, m} \geq 0 \quad \text{for } 1 \leq m \leq M,
$$

and

$$
\sum_{m=1}^M X_{n, m} = 1.
$$

This definition ensures that each column (row) of the matrix \(X\) lies on the
\((N-1)\)-dimensional (\((M-1)\)-dimensional) unit simplex, similar to the
`simplex[N]` type, but extended across multiple columns (rows).

### Inverse Transform for Stochastic Matrix {-}

For the column and row stochastic matrices the inverse transform is the same
as simplex, but applied to each column (row).

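As a rough illustration of applying the simplex inverse transform column by column (or row by row), the following Python sketch assumes the stick-breaking simplex transform with the \(\log(1 / (K - k))\) offset described in the simplex section. The function names are hypothetical and this is not Stan's implementation.

```python
import math

def simplex_inverse_transform(y):
    """Map K-1 unconstrained values to a K-simplex (assumed stick-breaking form)."""
    K = len(y) + 1
    x = []
    remaining = 1.0  # stick length not yet allocated
    for k, y_k in enumerate(y, start=1):
        # The offset log(1 / (K - k)) makes y = 0 map to the uniform simplex.
        z_k = 1.0 / (1.0 + math.exp(-(y_k + math.log(1.0 / (K - k)))))
        x_k = remaining * z_k
        x.append(x_k)
        remaining -= x_k
    x.append(remaining)  # the last entry takes whatever is left
    return x

def column_stochastic_inverse_transform(Y):
    """Y is (N-1) x M, given as a list of rows; returns an N x M matrix."""
    columns = [simplex_inverse_transform(list(col)) for col in zip(*Y)]
    return [list(row) for row in zip(*columns)]

def row_stochastic_inverse_transform(Y):
    """Y is N x (M-1), given as a list of rows; returns an N x M matrix."""
    return [simplex_inverse_transform(row) for row in Y]
```

Each column (row) is transformed independently, so an \(N \times M\) column stochastic matrix needs \((N - 1) M\) unconstrained values and a row stochastic matrix needs \(N (M - 1)\).
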
### Absolute Jacobian Determinant for the Inverse Transform {-}

The Jacobian matrix of the inverse transform for each column \(m\) of a
column stochastic matrix is lower triangular, so its determinant is the
product of its diagonal entries. With \(z_{n, m}\) denoting the intermediate
value \(z_n\) of the simplex transform applied to column \(m\), the absolute
determinant is calculated as:

$$
\left| \det J_m \right| = \prod_{n=1}^{N-1} \left( z_{n, m} (1 - z_{n, m}) \left( 1 - \sum_{n'=1}^{n-1} X_{n', m} \right) \right).
$$

Because the columns are transformed independently, the overall Jacobian
determinant for a `column_stochastic_matrix` is the product of the
determinants for each column:

$$
\left| \det J \right| = \prod_{m=1}^{M} \left| \det J_m \right|.
$$

For a `row_stochastic_matrix`, the same expressions apply with the roles of
rows and columns exchanged.

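The product above can be evaluated on the log scale, where the product over columns becomes a sum. The Python sketch below is only illustrative: it assumes the stick-breaking simplex transform with the \(\log(1 / (K - k))\) offset, and its function names are hypothetical rather than part of Stan.

```python
import math

def simplex_log_abs_det_jacobian(y):
    """log |det J| of the assumed stick-breaking inverse transform for one simplex."""
    K = len(y) + 1
    log_det = 0.0
    remaining = 1.0  # 1 minus the sum of the entries allocated so far
    for k, y_k in enumerate(y, start=1):
        z_k = 1.0 / (1.0 + math.exp(-(y_k + math.log(1.0 / (K - k)))))
        # Diagonal entry: z_k * (1 - z_k) * (1 - sum_{k' < k} x_{k'})
        log_det += math.log(z_k) + math.log1p(-z_k) + math.log(remaining)
        remaining -= remaining * z_k
    return log_det

def column_stochastic_log_abs_det_jacobian(Y):
    """Sum of per-column terms; Y is (N-1) x M, given as a list of rows."""
    return sum(simplex_log_abs_det_jacobian(list(col)) for col in zip(*Y))
```

For a single 3-simplex with unconstrained values `[0.0, 0.0]`, the diagonal entries are \(2/9\) and \(1/6\), so the determinant is \(1/27\).
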
### Transform for Stochastic Matrix {-}

For the column and row stochastic matrices the transform is the same
as simplex, but applied to each column (row).
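
A minimal Python sketch of the constrained-to-unconstrained direction follows, again assuming the stick-breaking simplex transform with the \(\log(1 / (K - k))\) offset. The function names are hypothetical, and entries equal to exactly 0 or 1 are not handled.

```python
import math

def simplex_transform(x):
    """Map a K-simplex to K-1 unconstrained values (assumed stick-breaking form)."""
    K = len(x)
    y = []
    remaining = 1.0  # 1 minus the sum of the entries already processed
    for k in range(1, K):
        z_k = x[k - 1] / remaining  # fraction of the remaining stick
        # logit(z_k) minus the offset log(1 / (K - k))
        y.append(math.log(z_k / (1.0 - z_k)) - math.log(1.0 / (K - k)))
        remaining -= x[k - 1]
    return y

def column_stochastic_transform(X):
    """X is N x M, given as a list of rows; returns an (N-1) x M matrix."""
    columns = [simplex_transform(list(col)) for col in zip(*X)]
    return [list(row) for row in zip(*columns)]

def row_stochastic_transform(X):
    """X is N x M, given as a list of rows; returns an N x (M-1) matrix."""
    return [simplex_transform(row) for row in X]
```

Under this assumed form, a uniform column such as `[1/3, 1/3, 1/3]` maps to unconstrained values of zero.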

## Unit vector {#unit-vector.section}

src/reference-manual/types.qmd

Lines changed: 77 additions & 0 deletions
@@ -674,6 +674,83 @@ iterations, and in either case, with less dispersed parameter
initialization or custom initialization if there are informative
priors for some parameters.

### Stochastic Matrices {-}

A stochastic matrix is a matrix where each column or row is a
unit simplex, meaning that each column (row) vector has non-negative
values that sum to 1. The following example is a \(3 \times 4\)
column-stochastic matrix.

$$
\begin{bmatrix}
0.2 & 0.5 & 0.1 & 0.3 \\
0.3 & 0.3 & 0.6 & 0.4 \\
0.5 & 0.2 & 0.3 & 0.3
\end{bmatrix}
$$

An example of a \(3 \times 4\) row-stochastic matrix is the following.

$$
\begin{bmatrix}
0.2 & 0.5 & 0.1 & 0.2 \\
0.2 & 0.1 & 0.6 & 0.1 \\
0.5 & 0.2 & 0.2 & 0.1
\end{bmatrix}
$$

In the examples above, each column of the first matrix and each row of the
second matrix sums to 1, making them valid values for the
`column_stochastic_matrix` and `row_stochastic_matrix` types, respectively.

Column-stochastic matrices are often used in models where
each column represents a probability distribution across a
set of categories, such as in multiple multinomial distributions,
factor models, transition matrices in Markov models,
or compositional data analysis.
They can also be used in situations where you need multiple simplexes
of the same dimensionality.

The `column_stochastic_matrix` and `row_stochastic_matrix` types are declared
with their numbers of rows and columns, in that order. For instance, a matrix
`theta` with 3 rows and 4 columns, where each column is a 3-simplex, is
declared as follows.

```stan
column_stochastic_matrix[3, 4] theta;
```

A matrix `theta` with 3 rows and 4 columns, where each row is a 4-simplex,
is declared in the same way.

```stan
row_stochastic_matrix[3, 4] theta;
```

As with simplexes, `column_stochastic_matrix` and `row_stochastic_matrix`
variables are subject to validation, ensuring that each column (row)
satisfies the simplex constraints. This validation accounts for
floating-point imprecision, with checks performed up to a statically
specified accuracy threshold \(\epsilon\).

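To make the tolerance-based check concrete, here is a small Python sketch that tests the two example matrices from this section. The helper names and the default `tol` value are illustrative assumptions, not Stan's validation code or its actual threshold.

```python
def is_column_stochastic(X, tol=1e-8):
    """True if all entries are non-negative and every column sums to 1 within tol."""
    return (all(v >= 0 for row in X for v in row)
            and all(abs(sum(col) - 1.0) <= tol for col in zip(*X)))

def is_row_stochastic(X, tol=1e-8):
    """A matrix is row stochastic if its transpose is column stochastic."""
    return is_column_stochastic([list(col) for col in zip(*X)], tol)

# The two 3 x 4 example matrices shown earlier in this section.
col_example = [[0.2, 0.5, 0.1, 0.3],
               [0.3, 0.3, 0.6, 0.4],
               [0.5, 0.2, 0.3, 0.3]]
row_example = [[0.2, 0.5, 0.1, 0.2],
               [0.2, 0.1, 0.6, 0.1],
               [0.5, 0.2, 0.2, 0.1]]

print(is_column_stochastic(col_example), is_row_stochastic(row_example))
```

Neither example satisfies the other constraint: the first row of the column-stochastic example sums to 1.1, and the first column of the row-stochastic example sums to 0.9.
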
#### Stability Considerations {-}

In high-dimensional settings, `column_stochastic_matrix` and `row_stochastic_matrix`
types may require careful tuning of the inference
algorithms. To ensure stability:

- **Smaller Step Sizes:** In samplers like Hamiltonian Monte Carlo (HMC),
smaller step sizes can help maintain stability, especially in high dimensions.
- **Higher Target Acceptance Rates:** Setting higher target acceptance
rates can improve the robustness of the sampling process.
- **Longer Warmup Periods:** Increasing the warmup period gives the sampler
more iterations to adapt before the actual sampling begins.
- **Tighter Optimization Tolerances:** For optimization-based inference,
tighter tolerances with more iterations can yield more accurate results.
- **Custom Initialization:** If prior information about the parameters is
available, custom initialization or less dispersed initialization can lead
to more efficient inference.

### Unit vectors {-}

A unit vector is a vector with a norm of one. For instance, $[0.5,
