Skip to content

document sum to zero matrix #867

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
spinkney opened this issue Mar 25, 2025 · 5 comments
Open

document sum to zero matrix #867

spinkney opened this issue Mar 25, 2025 · 5 comments
Assignees

Comments

@spinkney
Copy link
Collaborator

Once stan-dev/math#3169 is in I'll need to document this.

@spinkney
Copy link
Collaborator Author

spinkney commented Mar 25, 2025

Doc thoughts

The sum_to_zero_matrix is a two-dimensional generalization of the sum_to_zero_vector, using the same Helmert-type transform to enforce the columns to be mutually orthonormal in $\mathbb{R}^{ (N + 1) \times (M + 1) }$.


  1. Define a 1D Helmert basis in dimension (N+1) by

$$ \boxed{ u_{k}(i) \ = \begin{cases} 1/\sqrt{k \ (\ k+1 )}, & \text{if } i \le k,\\ -k/ \sqrt{k\ (\ k+1 \ )}, & \text{if } i = k+1,\\ 0, & \text{if } i > k+1, \end{cases} \quad \text{for }k=1,\ldots, N, \ \ i=1,\ldots, N+1. } $$

  1. Define a 1D Helmert basis in dimension (M+1) by an exactly analogous formula:

$$ v_{\ell}(j) \ = \begin{cases} 1/\sqrt{\ell,(,\ell+1,)}, & \text{if } j \le \ell, \\ -\ \ell/ \sqrt{\ell,(\ \ ell+1 )}, & \text{if } j = \ell+1, \\ 0, & \text{if } j > \ell+1, \end{cases} \quad \ell=1,\ldots,M,\ \ j=1,\ldots,M+1. $$

  1. Form the 2D basis by tensor‐products:

$$ h_{k,\ell}(i,j) \ = u_{k}(i) \ \times v_{\ell}(j), $$

giving $k=1, \ldots,N$ and $\ell=1,\ldots,M$. Each $h_{k,\ell}$ is an $(N+1) \ \times \ (M+1)$ “contrast pattern” with row and column sums $= 0$. These $NM$ matrices ${h_{k,\ell}}$ are orthonormal in $\mathbb{R}^{(N+1)\times(M+1)}$.

If one wanted a ND basis then forming the ND tensor-product of the Helmert bases will result in the correct orthonormal transform.

Below is one possible way to present the mathematics behind your Helmert‐like row‐and‐column‐constraining transform in a style reminiscent of a Stan User Guide, but without actual C++ code. The loops are written purely in mathematical notation, showing how each element of the output (Z) is computed.


Mathematical Description of the Transform

We have an input matrix

$$ x \in \mathbb{R}^{N \times M}, $$

and we wish to construct an output matrix

$$ Z \in \mathbb{R}^{(N+1)\times(M+1)}, $$

subject to the row‐sum‐zero and column‐sum‐zero constraints. Define vectors

$$ \beta \in \mathbb{R}^N, \quad \beta_i = 0 \quad \text{for } i = 0,\dots, N-1, $$

and initialize

$$ Z_{p,q} = 0 \quad \text{for all } 0 \le p \le N, \quad 0 \le q \le M. $$

Then proceed as follows:

Descending loop over columns $j = M-1, M-2, \ldots, 0$:

Define

$$ a_j = \frac{1}{\sqrt{(j+1)(j+2)}}, \quad b_j = (j + 1),a_j. $$

Initialize an accumulator $ \mathrm{ax_prev} = 0 $.

Set

$$ Z_{N,j} = 0. $$

Inside each column, descending loop over rows $i = N-1, N-2, \dots, 0$:

Define

$$ a_i = \frac{1}{\sqrt{(i+1)(i+2)}}, \quad b_i = (i + 1),a_i. $$

Form the partial combination

$$ b_i , x_{i,j} - \mathrm{ax_prev} = b_i , x_{i,j} - \mathrm{ax_prev}. $$

Denote this quantity by $b_{i_x}$.

Update the output:

$$ Z_{i,j} = b_j ,\bigl(b_{i_x} bigr) - \beta_i. $$

Update the $\beta$ vector:

$$ \beta_i \leftarrow \beta_i + a_j ,\bigl(b_{i_x} \bigr). $$

Impose the row‐sum and column‐sum constraints:

$$ Z_{N,j} \leftarrow Z_{N,j} - Z_{i,j}, \quad Z_{i,M} \leftarrow Z_{i,M} - Z_{i,j}. $$

Accumulate

$$ \mathrm{ax_prev} \leftarrow \mathrm{ax_prev} + a_i , x_{i,j}. $$

After finishing all rows $i$ for a given column $j$, update

$$ Z_{N,M} \leftarrow Z_{N,M} - Z_{N,j}. $$

This ensures the bottom‐right corner reflects the column sum constraint in the last row.

At the end of these nested loops, the matrix $Z$ satisfies

$$ \sum_{k=0}^N Z_{k,j} = 0 \quad \forall, j=0,\dots,M, \quad \text{and} \quad \sum_{\ell=0}^M Z_{i,\ell} = 0 \quad \forall, i=0,\dots,N. $$

That is, each row and column of $Z$ sums to zero. The particular constants ${a_i, b_i}$ and ${a_j, b_j}$ ensure an orthonormal (Helmert‐type) structure when viewed as a linear operator from $\mathbb{R}^{N\times M}$ to $\mathbb{R}^{(N+1)\times(M+1)}$.

@WardBrian
Copy link
Member

As an additional note: besides just the new type, we will also want to document the new overloads to functions like sum_to_zero_constrain on their page

@WardBrian
Copy link
Member

It's all merged now -- @spinkney do you want to take the first crack? I can add the new function overloads to whatever branch you work on

@WardBrian
Copy link
Member

While we're at it, there's a typo in the sum to zero vector: https://discourse.mc-stan.org/t/constraint-transforms/39384

@spinkney
Copy link
Collaborator Author

spinkney commented May 6, 2025 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants