Motivation
Currently, the library can handle symbolic expressions for Tensors and can represent expectations (via the `Expectation` class). However, there is no fully unified way to numerically evaluate those expectations by sampling the distribution of a variable (e.g. a Gaussian) while keeping everything else the same. For many practical applications, it would be helpful to do a Monte Carlo approximation of $\mathbb{E}[f(X)]$ by adding a `"samples"` (or `"batch"`) dimension to $X$ and then averaging over that dimension.
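For context, the intended computation is the standard Monte Carlo estimator $\mathbb{E}[f(X)] \approx \frac{1}{N}\sum_{s=1}^{N} f(x_s)$. A minimal plain-PyTorch sketch of that target behaviour (not the library's API; `f`, `mu`, and `n_samples` are made up for illustration):

```python
import torch


def f(x):
    # Hypothetical function of X; works on a single (i, j) tensor or a batch.
    return x.relu().sum(dim=(-2, -1))


i, j, n_samples = 3, 4, 10_000
mu = torch.randn(i, j)                        # mean of the Gaussian for X

# Draw n_samples copies of X along a leading "samples" dim, apply f, average:
xs = mu + torch.randn(n_samples, i, j)        # shape (samples, i, j)
mc_estimate = f(xs).mean(dim=0)               # ≈ E[f(X)]
```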
The Problem
When we inject an extra `"samples"` dimension into the numeric tensor for $X$, it no longer strictly matches the symbolic shape $(i, j, \dots)$. For instance:
Symbolically, `X` might be declared as `Variable("X", i, j)`.
Numerically, we produce a tensor shaped $(\text{samples}, i, j)$ for sampling.
Because much of the code assumes that the numeric tensor has exactly the same named edges as the symbolic expression, we get:
`KeyError`s if the code attempts `old_to_new[e]` for `e == "samples"`.
`align_to` errors if `"samples"` is not part of the list of symbolic edges (a minimal reproduction is sketched after this list).
Dimension mismatch checks in `_inner_evaluate` or `rename(...)` blocks that fail when extra dimensions exist.
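To make the failure concrete, here is a minimal sketch using PyTorch named tensors rather than the library's internal code (the names and sizes are made up for illustration):

```python
import torch

i, j, n_samples = 3, 4, 5

# Numeric tensor for X with an extra leading "samples" dim:
x = torch.randn(n_samples, i, j, names=("samples", "i", "j"))

# Aligning only to the symbolic edges (i, j) fails, because "samples" is a
# dim of x that does not appear among the target names:
try:
    x.align_to("i", "j")
except RuntimeError as err:
    print(err)
```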
Proposed Directions
Partial Fix: Ellipses and Skipping Unknown Dims
One approach is to:
Use `.align_to(..., *self.edges)` so that leftover dims like `"samples"` are tolerated.
Skip unrecognized dims in rename checks, e.g. if a dim is not in `v.orig`, we ignore it.
This “patch” approach does let code run without error, but the symbolic shapes are still `(i, j)` while numeric shapes are `(samples, i, j)`. The library is unaware of the extra dimension except for scattered “just skip it” logic.
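A minimal sketch of this patch approach, written against PyTorch named tensors rather than the library's actual evaluation code (`edges` and `old_to_new` here are stand-ins for whatever the evaluator tracks):

```python
import torch

i, j, n_samples = 3, 4, 1000
edges = ("i", "j")                                # symbolic edges of X
old_to_new = {"i": "i", "j": "j"}                 # hypothetical rename map

# Numeric tensor for X with an extra leading "samples" dim:
x = torch.randn(n_samples, i, j, names=("samples", *edges))

# 1) The ellipsis in align_to tolerates leftover dims like "samples":
x = x.align_to(..., *edges)                       # still (samples, i, j)

# 2) Skip dims the rename map does not know about instead of raising KeyError:
x = x.rename(*(old_to_new.get(d, d) for d in x.names))

# Downstream code can then average out the extra dim explicitly:
estimate = x.mean(dim="samples")                  # shape (i, j)
```

The numeric tensor carries the extra dim, but the symbolic side still believes the shape is `(i, j)`, which is exactly the inconsistency described above.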
Symbolic Substitution / Broadcasting
A more complete fix would be:
When we decide to do Monte Carlo, replace `X` with a new `Variable("X_samples", "samples", i, j)` throughout the expression.
That way, the shape is truly `(samples, i, j)` in the symbolic expression itself.
Everything else in the library remains consistent: no rename or align issues, because the expression literally expects a `"samples"` edge now.
Downside: This can be complex. We’d have to modify every sub-expression referencing `X`, handle potential edge collisions, etc.
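As a plain-PyTorch analogue of the idea (not the library's substitution mechanism; `w` and the sizes below are made up), broadcasting carries the explicit samples edge through the rest of the expression, and the expectation becomes an ordinary mean over it:

```python
import torch

i, j, k, n_samples = 3, 4, 2, 1000

w = torch.randn(j, k)                         # some other tensor in the expression
x = torch.randn(i, j)                         # original X, shape (i, j)
y = x @ w                                     # f(X), shape (i, k)

# "Substituted" X with an explicit samples edge, shape (samples, i, j).
x_samples = x + 0.1 * torch.randn(n_samples, i, j)
y_samples = x_samples @ w                     # broadcasts to (samples, i, k)

y_mc = y_samples.mean(dim=0)                  # ≈ E[f(X)], shape (i, k)
```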
Dedicated “Batch” or “Sample” Concept
A more architectural approach might define a “batch dimension” system at the library’s top level. Each `Tensor` can have zero or more batch dims plus its symbolic dims. Then the code knows about that distinction and handles it systematically.
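Purely as a sketch of what that distinction could look like (a hypothetical wrapper, not the library's design):

```python
from dataclasses import dataclass

import torch


@dataclass
class BatchedTensor:
    """Hypothetical pairing of a numeric tensor with its dim bookkeeping:
    leading batch dims (e.g. "samples") followed by the symbolic edges."""
    data: torch.Tensor
    batch_dims: tuple = ()        # e.g. ("samples",)
    edges: tuple = ()             # e.g. ("i", "j")

    def reduce_batch(self, name: str) -> "BatchedTensor":
        # Average out one batch dim -- the Monte Carlo step for "samples".
        idx = self.batch_dims.index(name)
        rest = self.batch_dims[:idx] + self.batch_dims[idx + 1:]
        return BatchedTensor(self.data.mean(dim=idx), rest, self.edges)


# A (samples, i, j) tensor that the library would treat as symbolic shape
# (i, j) plus one known batch dim:
x = BatchedTensor(torch.randn(1000, 3, 4), batch_dims=("samples",), edges=("i", "j"))
x_mean = x.reduce_batch("samples")    # numeric shape (3, 4), edges ("i", "j")
```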
Next Steps
Decide whether a short-term patch (skipping unrecognized dims, using ellipses) is enough, or if a more thorough symbolic approach is warranted.
Possibly adopt the substitution idea so we truly unify the shape `(samples, i, j)` both numerically and symbolically.
Comments, suggestions, and PRs are welcome to refine how we handle numeric evaluation of expectations with extra sample dimensions.