Support numerical evaluation of expectations #19

Open
thomasahle opened this issue Jan 6, 2025 · 1 comment
Labels
enhancement New feature or request

Comments

@thomasahle
Owner

Motivation

Currently, the library can handle symbolic expressions for Tensors and can represent expectations (via the Expectation class). However, there is no unified way to numerically evaluate those expectations by sampling the distribution of a variable (e.g. a Gaussian) while keeping everything else the same. For many practical applications, it would be helpful to do a Monte Carlo approximation of $\mathbb{E}[f(X)]$ by adding a "samples" (or "batch") dimension to $X$, then averaging over that dimension.
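
To make the goal concrete, here is a minimal sketch of that Monte Carlo recipe in plain PyTorch, deliberately outside the library's API; the function f, the edge sizes, and the sample count are illustrative only:

```python
import torch

# Suppose X is symbolically a Gaussian Variable("X", i, j) and we want E[f(X)].
i, j, n_samples = 4, 3, 10_000

def f(x):
    # Stand-in for an arbitrary expression in X; reduces over the (i, j) edges.
    return (x ** 2).sum(dim=(-2, -1))

# Draw n_samples independent copies of X along a new leading "samples" dimension,
X = torch.randn(n_samples, i, j)

# evaluate f on every sample, and average over that dimension.
estimate = f(X).mean(dim=0)
print(estimate)  # ≈ E[f(X)], which is i * j = 12 for a standard Gaussian here
```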

The Problem

When we inject an extra "samples" dimension into the numeric tensor for $X$, it no longer strictly matches the symbolic shape $(i, j, \ldots)$. For instance:

  • Symbolically, X might be declared as Variable("X", i, j).
  • Numerically, we produce a tensor shaped $(\text{samples}, i, j)$ for sampling.

Because much of the code assumes that the numeric tensor has exactly the same named edges as the symbolic expression, we get:

  • KeyError exceptions when the code attempts old_to_new[e] for e == "samples".
  • align_to errors when "samples" is not among the symbolic edges.
  • Failing dimension-mismatch checks in _inner_evaluate or rename(...) when the extra dimension is present.
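
As a small illustration of the mismatch, using PyTorch named tensors as a stand-in (the library's own alignment code is only assumed to behave analogously):

```python
import torch

# Numeric tensor carries an extra leading "samples" dimension.
x = torch.randn(100, 4, 3, names=("samples", "i", "j"))

# The symbolic expression only knows about the edges ("i", "j").
symbolic_edges = ("i", "j")

# Raises an error: every dimension of x must appear in the requested names,
# and "samples" does not. This mirrors the KeyError / align_to failures above.
x.align_to(*symbolic_edges)
```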

Proposed Directions

  1. Partial Fix: Ellipses and Skipping Unknown Dims
    One approach (see the sketch after this list) is to:

    • Use .align_to(..., *self.edges) so that leftover dims like "samples" are tolerated.
    • Skip unrecognized dims in rename checks, e.g. if a dim is not in v.orig, we ignore it.

    This “patch” approach does let the code run without error, but the symbolic shapes are still (i, j) while the numeric shapes are (samples, i, j). The library is unaware of the extra dimension except for scattered “just skip it” logic.
  2. Symbolic Substitution / Broadcasting
    A more complete fix would be:

    • When we decide to do Monte Carlo, replace X with a new Variable("X_samples", "samples", i, j) throughout the expression.
    • That way, the shape is truly (samples, i, j) in the symbolic expression itself.
    • Everything else in the library remains consistent: no rename or align issues, because the expression literally expects a "samples" edge now.
    • Downside: This can be complex. We’d have to modify every sub-expression referencing X, handle potential edge collisions, etc.
  3. Dedicated “Batch” or “Sample” Concept
    A more architectural approach might define a “batch dimension” system at the library’s top level. Each Tensor can have zero or more batch dims plus its symbolic dims. Then the code knows about that distinction and handles it systematically.

Next Steps

  • Decide whether a short-term patch (skipping unrecognized dims, using ellipses) is enough, or if a more thorough symbolic approach is warranted.
  • Possibly adopt the substitution idea so we truly unify the shape (samples, i, j) both numerically and symbolically.

Comments, suggestions, and PRs are welcome to refine how we handle numeric evaluation of expectations with extra sample dimensions.

@thomasahle thomasahle added the enhancement New feature or request label Jan 6, 2025
@thomasahle
Owner Author

A similar idea is to add numerical differentiation to the Derivative class.
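
For context, numerical differentiation here could mean something as simple as a central-difference estimate; the sketch below is plain PyTorch and makes no assumptions about the Derivative class's actual interface:

```python
import torch

def central_difference(f, x, eps=1e-4):
    """Estimate df/dx at x by perturbing one coordinate at a time."""
    grad = torch.zeros_like(x)
    x_flat, grad_flat = x.view(-1), grad.view(-1)
    for k in range(x_flat.numel()):
        orig = x_flat[k].item()
        x_flat[k] = orig + eps
        f_plus = f(x)
        x_flat[k] = orig - eps
        f_minus = f(x)
        x_flat[k] = orig                      # restore the original value
        grad_flat[k] = (f_plus - f_minus) / (2 * eps)
    return grad

x = torch.randn(3, 2, dtype=torch.float64)
print(central_difference(lambda t: (t ** 2).sum(), x))  # ≈ 2 * x
```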
