Support numerical evaluation of expectations #19

Open
thomasahle opened this issue Jan 6, 2025 · 1 comment
Labels
enhancement New feature or request

Comments

@thomasahle
Owner

Motivation

Currently, the library can handle symbolic expressions for Tensors and can represent expectations (via the Expectation class). However, there is no unified way to numerically evaluate those expectations by sampling the distribution of a variable (e.g. a Gaussian) while keeping everything else the same. For many practical applications, it would be helpful to do a Monte Carlo approximation of $\mathbb{E}[f(X)]$ by adding a "samples" (or "batch") dimension to $X$, then averaging over that dimension.
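
To make the goal concrete, here is a minimal sketch of that Monte Carlo recipe in plain PyTorch, deliberately outside the library's API; the function f, the edge sizes, and the sample count are illustrative only:

```python
import torch

# Suppose X is symbolically a Gaussian Variable("X", i, j) and we want E[f(X)].
i, j, n_samples = 4, 3, 10_000

def f(x):
    # Stand-in for an arbitrary expression in X; reduces over the (i, j) edges.
    return (x ** 2).sum(dim=(-2, -1))

# Draw n_samples independent copies of X along a new leading "samples" dimension,
X = torch.randn(n_samples, i, j)

# evaluate f on every sample, and average over that dimension.
estimate = f(X).mean(dim=0)
print(estimate)  # ≈ E[f(X)], which is i * j = 12 for a standard Gaussian here
```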

The Problem

When we inject an extra "samples" dimension into the numeric tensor for $X$, it no longer strictly matches the symbolic shape $(i, j, \ldots)$. For instance:

  • Symbolically, X might be declared as Variable("X", i, j).
  • Numerically, we produce a tensor shaped $(\text{samples}, i, j)$ for sampling.

Because much of the code assumes that the numeric tensor has exactly the same named edges as the symbolic expression, we get:

  • KeyError exceptions when the code attempts old_to_new[e] for e == "samples".
  • align_to errors when "samples" is not among the symbolic edges.
  • Failing dimension-mismatch checks in _inner_evaluate or rename(...) when the extra dimension is present.
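
As a small illustration of the mismatch, using PyTorch named tensors as a stand-in (the library's own alignment code is only assumed to behave analogously):

```python
import torch

# Numeric tensor carries an extra leading "samples" dimension.
x = torch.randn(100, 4, 3, names=("samples", "i", "j"))

# The symbolic expression only knows about the edges ("i", "j").
symbolic_edges = ("i", "j")

# Raises an error: every dimension of x must appear in the requested names,
# and "samples" does not. This mirrors the KeyError / align_to failures above.
x.align_to(*symbolic_edges)
```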

Proposed Directions

  1. Partial Fix: Ellipses and Skipping Unknown Dims
    One approach (see the sketch after this list) is to:

    • Use .align_to(..., *self.edges) so that leftover dims like "samples" are tolerated.
    • Skip unrecognized dims in rename checks, e.g. if a dim is not in v.orig, we ignore it.

    This “patch” approach does let the code run without error, but the symbolic shapes are still (i, j) while the numeric shapes are (samples, i, j). The library is unaware of the extra dimension except for scattered “just skip it” logic.
  2. Symbolic Substitution / Broadcasting
    A more complete fix would be:

    • When we decide to do Monte Carlo, replace X with a new Variable("X_samples", "samples", i, j) throughout the expression.
    • That way, the shape is truly (samples, i, j) in the symbolic expression itself.
    • Everything else in the library remains consistent: no rename or align issues, because the expression literally expects a "samples" edge now.
    • Downside: This can be complex. We’d have to modify every sub-expression referencing X, handle potential edge collisions, etc.
  3. Dedicated “Batch” or “Sample” Concept
    A more architectural approach might define a “batch dimension” system at the library’s top level. Each Tensor can have zero or more batch dims plus its symbolic dims. Then the code knows about that distinction and handles it systematically.

Next Steps

  • Decide whether a short-term patch (skipping unrecognized dims, using ellipses) is enough, or if a more thorough symbolic approach is warranted.
  • Possibly adopt the substitution idea so we truly unify the shape (samples, i, j) both numerically and symbolically.

Comments, suggestions, and PRs are welcome to refine how we handle numeric evaluation of expectations with extra sample dimensions.

@thomasahle thomasahle added the enhancement New feature or request label Jan 6, 2025
@thomasahle
Owner Author

A similar idea is to add numerical differentiation to the Derivative class.
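
For context, numerical differentiation here could mean something as simple as a central-difference estimate; the sketch below is plain PyTorch and makes no assumptions about the Derivative class's actual interface:

```python
import torch

def central_difference(f, x, eps=1e-4):
    """Estimate df/dx at x by perturbing one coordinate at a time."""
    grad = torch.zeros_like(x)
    x_flat, grad_flat = x.view(-1), grad.view(-1)
    for k in range(x_flat.numel()):
        orig = x_flat[k].item()
        x_flat[k] = orig + eps
        f_plus = f(x)
        x_flat[k] = orig - eps
        f_minus = f(x)
        x_flat[k] = orig                      # restore the original value
        grad_flat[k] = (f_plus - f_minus) / (2 * eps)
    return grad

x = torch.randn(3, 2, dtype=torch.float64)
print(central_difference(lambda t: (t ** 2).sum(), x))  # ≈ 2 * x
```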
