update with @bob-carpenter review comments

SteveBronder · SteveBronder · commit 84af3f8ae7c0 · 2024-09-16T16:31:00.000-04:00
diff --git a/src/reference-manual/transforms.qmd b/src/reference-manual/transforms.qmd
@@ -15,7 +15,7 @@ constrained to be ordered, positive ordered, or simplexes.  Matrices
 may be constrained to be correlation matrices or covariance matrices.
 This chapter provides a definition of the transforms used for each
 type of variable.   For examples of how to declare and define these
-variables in a Stan program, see section 
+variables in a Stan program, see section
 [Variable declaration](types.qmd#variable-declaration.section).
 
 Stan converts models to C++ classes which define probability
@@ -39,7 +39,7 @@ the exact arithmetic:
     rounded to the boundary. This may cause unexpected warnings or
     errors, if in other parts of the code the boundary value is
     invalid. For example, we may observe floating-point value 0 for
-    a variance parameter that has been declared to be larger than 0. 
+    a variance parameter that has been declared to be larger than 0.
 	See more about [Floating point Arithmetic in Stan user's guide](../stan-users-guide/floating-point.qmd)).
   - CmdStan stores the output to CSV files with 6 significant digits
     accuracy by default, but the constraints are checked with 8
@@ -674,61 +674,61 @@ $$
 
 ## Stochastic Matrix {#stochastic-matrix-transform.section}
 
-The `column_stochastic_matrix[N, M]` and `row_stochastic_matrix[N, M]` type in 
-Stan represents an \(N \times M\) matrix where each column(row) is a unit simplex 
-of dimension \(N\). In other words, each column(row) of the matrix is a vector 
+The `column_stochastic_matrix[N, M]` and `row_stochastic_matrix[N, M]` type in
+Stan represents an \(N \times M\) matrix where each column (row) is a unit simplex
+of dimension \(N\). In other words, each column (row) of the matrix is a vector
 constrained to have non-negative entries that sum to one.
 
 ### Definition of a Stochastic Matrix {-}
 
-A column stochastic matrix \(X \in \mathbb{R}^{N \times M}\) is defined such 
-that for each column \(j\) (where \(1 \leq j \leq M\)):
+A column stochastic matrix \(X \in \mathbb{R}^{N \times M}\) is defined such
+that each column is a simplex. For column \(j\) (where \(1 \leq j \leq M\)):
 
 $$
-X_{ij} \geq 0 \quad \text{for } 1 \leq i \leq N,
+X_{i, j} \geq 0 \quad \text{for } 1 \leq i \leq N,
 $$
 
 and
 
 $$
-\sum_{i=1}^N X_{ij} = 1.
+\sum_{i=1}^N X_{i, j} = 1.
 $$
 
-A row stochastic matrix is defined similarly but with the axis flipped such
-that 
+A row stochastic matrix is any matrix whose transpose is a column stochastic matrix
+(i.e. the rows of the matrix are simplexes)
 
 
 $$
-X_{ij} \geq 0 \quad \text{for } 1 \leq j \leq N,
+X_{i, j} \geq 0 \quad \text{for } 1 \leq j \leq N,
 $$
 
 and
 
 $$
-\sum_{j=1}^N X_{ij} = 1.
+\sum_{j=1}^N X_{i, j} = 1.
 $$
 
-This definition ensures that each column(row) of the matrix \(X\) lies on the 
-\(N-1\) dimensional unit simplex, similar to the `simplex[N]` type, but 
+This definition ensures that each column (row) of the matrix \(X\) lies on the
+\(N-1\) dimensional unit simplex, similar to the `simplex[N]` type, but
 extended across multiple columns(rows).
 
 ### Inverse Transform for Stochastic Matrix {-}
 
 For the column and row stochastic matrices the inverse transform is the same
-as simplex, but applied to each column(row).
+as simplex, but applied to each column (row).
 
 ### Absolute Jacobian Determinant for the Inverse Transform {-}
 
-The Jacobian determinant of the inverse transform for each column \(j\) in 
-the matrix is given by the product of the diagonal entries \(J_{i,i,j}\) of 
+The Jacobian determinant of the inverse transform for each column \(j\) in
+the matrix is given by the product of the diagonal entries \(J_{i,i,j}\) of
 the lower-triangular Jacobian matrix. This determinant is calculated as:
 
 $$
-\left| \det J_j \right| = \prod_{i=1}^{N-1} \left( z_{ij} (1 - z_{ij}) \left( 1 - \sum_{i'=1}^{i-1} X_{i'j} \right) \right).
+\left| \det J_j \right| = \prod_{i=1}^{N-1} \left( z_{i, j} (1 - z_{i, j}) \left( 1 - \sum_{i'=1}^{i-1} X_{i'j} \right) \right).
 $$
 
-Thus, the overall Jacobian determinant for the entire `column_stochastic_matrix` and `row_stochastic_matrix` 
-is the product of the determinants for each column(row):
+Thus, the overall Jacobian determinant for the entire `column_stochastic_matrix` and `row_stochastic_matrix`
+is the product of the determinants for each column (row):
 
 $$
 \left| \det J \right| = \prod_{j=1}^{M} \left| \det J_j \right|.
@@ -737,7 +737,7 @@ $$
 ### Transform for Stochastic Matrix {-}
 
 For the column and row stochastic matrices the transform is the same
-as simplex, but applied to each column(row).
+as simplex, but applied to each column (row).
 
 ## Unit vector {#unit-vector.section}
 
diff --git a/src/reference-manual/types.qmd b/src/reference-manual/types.qmd
@@ -675,10 +675,10 @@ priors for some parameters.
 
 ### Stochastic Matrices {-}
 
-A stochastic matrix is a matrix where each column, row, or both is a 
-unit simplex, meaning that each column(row) vector has non-negative 
-values that sum to 1. For example, a \(3 \times 4\) 
-column stochastic matrix will look like:
+A stochastic matrix is a matrix where each column, row, or both is a
+unit simplex, meaning that each column (row) vector has non-negative
+values that sum to 1. The following example is a \(3 \times 4\)
+column-stochastic matrix.
 
 $$
 \begin{bmatrix}
@@ -688,7 +688,7 @@ $$
 \end{bmatrix}
 $$
 
-While a row stochastic matrix will look like:
+An example of a \(3 \times 4\) row-stochastic matrix is the following.
 
 $$
 \begin{bmatrix}
@@ -699,54 +699,55 @@ $$
 $$
 
 
-In this example, each column(row) sums to 1, making the matrix a 
-valid `column_stochastic_matrix` and `row_stochastic_matrix`.
+In the examples above, each column (or row) sums to 1, making the matrices
+valid `column_stochastic_matrix` and `row_stochastic_matrix` types.
 
-Column stochastic matrices are often used in models where 
-each column represents a probability distribution across a 
-set of categories, such as in multiple multinomial distributions, 
-transition matrices in Markov models, or compositional data analysis. 
-They can also be used in situations where multiple Dirichlet-distributed v
-ariables are required across different dimensions.
+Column-stochastic matrices are often used in models where
+each column represents a probability distribution across a
+set of categories such as in multiple multinomial distributions,
+factor models, transition matrices in Markov models,
+or compositional data analysis.
+They can also be used in situations where you need multiple simplexes
+of the same dimensionality.
 
-The `column_stochastic_matrix` and `row_stochastic_matrix` types are declared 
-with full dimensionality. For instance, a matrix `theta` with 
-3 rows and 4 columns, where each 
-column is a 3-simplex, is declared as:
+The `column_stochastic_matrix` and `row_stochastic_matrix` types are declared
+with row and column sizes. For instance, a matrix `theta` with
+3 rows and 4 columns, where each
+column is a 3-simplex, is declared like a matrix with 3 rows and 4 columns.
 
 ```stan
 column_stochastic_matrix[3, 4] theta;
 ```
 
-A matrix `theta` with 
-3 rows and 4 columns, where each 
-row is a 4-simplex, is declared as:
+A matrix `theta` with 3 rows and 4 columns, where each row is a 4-simplex,
+is similarly declared as a matrix with 3 rows and 4 columns.
 
 ```stan
 row_stochastic_matrix[3, 4] theta;
 ```
 
-As with simplexes, `column(row)_stochastic_matrix` variables are subject to 
-validation, ensuring that each column(row) satisfies the simplex constraints. 
-This validation accounts for floating-point imprecision, with checks 
-performed up to a statically specified accuracy threshold \(\epsilon\).
+As with simplexes, `column_stochastic_matrix` and `row_stochastic_matrix`
+variables are subject to validation, ensuring that each column (row)
+satisfies the simplex constraints. This validation accounts for
+floating-point imprecision, with checks performed up to a statically
+specified accuracy threshold \(\epsilon\).
 
 #### Stability Considerations {-}
 
-In high-dimensional settings or when the matrix has many columns, 
-`column_stochastic_matrix` types may require careful tuning of the inference 
+In high-dimensional settings, `column_stochastic_matrix` and `row_stochastic_matrix`
+types may require careful tuning of the inference
 algorithms. To ensure stability:
 
-- **Smaller Step Sizes:** In samplers like Hamiltonian Monte Carlo (HMC), 
+- **Smaller Step Sizes:** In samplers like Hamiltonian Monte Carlo (HMC),
 smaller step sizes can help maintain stability, especially in high dimensions.
-- **Higher Target Acceptance Rates:** Setting higher target acceptance 
+- **Higher Target Acceptance Rates:** Setting higher target acceptance
 rates can improve the robustness of the sampling process.
-- **Longer Warmup Periods:** Increasing the warmup period allows the sampler 
+- **Longer Warmup Periods:** Increasing the warmup period allows the sampler
 to better explore the parameter space before the actual sampling begins.
-- **Tighter Optimization Tolerances:** For optimization-based inference, 
+- **Tighter Optimization Tolerances:** For optimization-based inference,
 tighter tolerances with more iterations can yield more accurate results.
-- **Custom Initialization:** If prior information about the parameters is 
-available, custom initialization or less dispersed initialization can lead 
+- **Custom Initialization:** If prior information about the parameters is
+available, custom initialization or less dispersed initialization can lead
 to more efficient inference.
 
 ### Unit vectors {-}