Replies: 8 comments 6 replies
-
Super cool :)
-
I like the idea, but I would like to see how The Vision #2 can be implemented in practice. I would actually want this to verify our existing rules, not only the new ones. If we had this verifier, we could practically remove all manual …
-
Can you compute the ground-truth update rule using numerical integration instead of MCMC? Then you can create a labeled dataset of the form (x = messages in, y = message out). Now you just have to solve a symbolic regression problem, i.e. find f such that y = f(x). For this, you can use https://github.com/MilesCranmer/SymbolicRegression.jl. (One caveat is that it might be restricted to univariate outputs.)
They have also combined SR with LLMs in this paper:
A. Grayeli, A. Sehgal, O. Costilla-Reyes, M. Cranmer, and S. Chaudhuri, "Symbolic regression with a learned concept library," in *NeurIPS*, 2024.
Available: https://arxiv.org/abs/2409.09359
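As an illustration of this idea, such a labeled pair can be built for a toy Gaussian node. Below is a minimal Python sketch (the project itself is Julia, but the idea carries over), under the assumption of a node f(x, y) = N(y; x, 1) and an incoming message μ(x) = N(x; m, v), where the known closed form N(y; m, v + 1) lets us sanity-check the numerical ground truth:

```python
import numpy as np
from scipy import integrate, stats

# Toy node: f(x, y) = N(y; x, 1); incoming message mu(x) = N(x; m, v).
# The sum-product message toward y is mu(y) = ∫ f(x, y) mu(x) dx,
# which analytically equals N(y; m, v + 1) (convolution of Gaussians).
m, v = 0.5, 2.0

def outgoing_numeric(y):
    # Ground-truth value of mu(y) via 1-D numerical integration.
    integrand = lambda x: (stats.norm.pdf(y, loc=x, scale=1.0)
                           * stats.norm.pdf(x, loc=m, scale=np.sqrt(v)))
    val, _ = integrate.quad(integrand, -np.inf, np.inf)
    return val

# Labeled dataset: x = (message parameters, evaluation point), y = message value.
ys = np.linspace(-3.0, 4.0, 15)
dataset = [((m, v, y), outgoing_numeric(y)) for y in ys]

# Sanity check against the closed form N(y; m, v + 1).
numeric = np.array([val for _, val in dataset])
analytic = stats.norm.pdf(ys, loc=m, scale=np.sqrt(v + 1.0))
assert np.allclose(numeric, analytic, atol=1e-6)
```

Pairs like `dataset` would then be fed to SymbolicRegression.jl in the hope of recovering the closed-form rule.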
…On Thu, Aug 28, 2025 at 3:44 AM Albert wrote:
That's why there's a discussion.
I think two things are important here:
1. The update rule should be valid Julia code (this is rather easy to check, I'd say).
2. The result of the update corresponds to the result of MCMC (either custom or even via Turing.jl).
On 2, the way I see it (helicopter view): in the first iteration, the user defines a functional form of the node with interfaces and placeholders for rules, and (perhaps in the first iteration) we could automatically generate a corresponding Turing.jl model (or ask the user to provide it?) for that node.
Then the verification process becomes:
1. Run LLM-generated message passing with the analytical rule on some test messages
2. Run MCMC sampling on the equivalent Turing models
3. Compare the resulting posterior distributions.
```mermaid
graph LR
    A[Node/Model Specification] --> B[LLM-generated Message Passing Result]
    A --> C[MCMC Result]
    B --> D{Results Match?}
    C --> D
    D -->|Yes| E[✅ Verified]
    D -->|No| F[❌ Invalid]
```
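The comparison in step 3 could start as simple as a moment check on the two sample sets. A hedged Python sketch (synthetic samples stand in for the RxInfer and Turing.jl outputs; `moments_match` and its tolerances are illustrative, not a real API):

```python
import numpy as np

def moments_match(samples_a, samples_b, atol=0.1):
    # Crude verifier: compare the first two moments of two posterior sample sets.
    # A real verifier might use a two-sample KS test or MMD instead.
    mean_ok = np.isclose(np.mean(samples_a), np.mean(samples_b), atol=atol)
    var_ok = np.isclose(np.var(samples_a), np.var(samples_b), rtol=0.5, atol=atol)
    return bool(mean_ok and var_ok)

rng = np.random.default_rng(0)
# Stand-ins: analytical-rule posterior vs. MCMC posterior vs. a wrong rule.
analytical = rng.normal(1.0, 0.5, size=20_000)
mcmc       = rng.normal(1.0, 0.5, size=20_000)
wrong      = rng.normal(2.0, 0.5, size=20_000)

assert moments_match(analytical, mcmc)       # same posterior: verified
assert not moments_match(analytical, wrong)  # shifted posterior: flagged
```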
-
Yes, I propose you use SR on 1-2d problems where the ground-truth output is from integration. Then generalize to Nd using induction, either manually or with an LLM; in the latter case you provide the symbolic result for 1d as a hint, i.e. part of the prompt.
…On Tue, Sep 2, 2025 at 2:35 AM Ismail Senoz wrote:
In the ideal scenario, what we want is to verify that the LLM-generated rule for the following sum-product message update (with abuse of notation)
$$\mu(x_j) = \sum_{\substack{x_k \\ k \in \mathcal{I}}} \int f(x_1, \dots, x_N) \prod_{\substack{i \neq j \\ i \in \mathcal{J}}} \mu(x_i) \,\mathrm{d}x_i \,,$$
where $\mathcal{I}$ denotes the indices of discrete random variables, $\mathcal{J}$ denotes the indices of continuous random variables, $f$ is the node function, and $\mu$ are messages that are possibly unnormalized (normalization might not even be possible for some messages, e.g. sigmoids), is correct in some sense. Since the case of non-normalizable messages is very challenging, we should limit ourselves to messages that are normalizable. This means the verifier can work with incoming messages that have the following property:
$$\int \mu(x_j) \,\mathrm{d}x_j < \infty\,.$$
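A quick numerical screen for this property might look as follows (a hedged Python sketch, though the project itself is Julia; `looks_normalizable` and its thresholds are illustrative, not a real API). The idea: integrate the candidate message over nested intervals and check whether the value stabilizes:

```python
import numpy as np

def looks_normalizable(mu, half_widths=(10.0, 20.0, 40.0)):
    # Heuristic check of ∫ mu(x) dx < ∞: integrate over nested intervals
    # [-L, L] and see whether the value stabilizes or keeps growing.
    vals = []
    for L in half_widths:
        x = np.linspace(-L, L, 20_001)
        vals.append(np.trapz(mu(x), x))
    return bool(np.isclose(vals[-1], vals[-2], rtol=1e-3))

gaussian = lambda x: np.exp(-0.5 * x**2)       # integral -> sqrt(2*pi)
sigmoid  = lambda x: 1.0 / (1.0 + np.exp(-x))  # integral grows ~ L

assert looks_normalizable(gaussian)
assert not looks_normalizable(sigmoid)
```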
In this restricted scenario, I do not see any reliable/robust method other than MCMC or sparse-grid (Smolyak) numerical integrators. I think of the following recipe for the verifier:
1. Sum out all the discrete random variables to obtain a joint function over the continuous ones, i.e.
$$\tilde{f}(\mathcal{J}) \triangleq \prod_{i \in \mathcal{J}}\mu(x_i) \sum_{x_k} f(x_1, \dots, x_N) \prod_{k \in \mathcal{I}}\mu(x_k)$$
(yes, I know this is going to be slow).
2. Use HMC to obtain samples from the target $\tilde{f}(\mathcal{J})$.
3. Use the list of samples for the random variable $x_j$ to compute statistics, and check that they match the statistics of the result returned by the LLM.
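Steps 2-3 can be sketched on a trivially small example; below is a hedged Python sketch with a textbook random-walk Metropolis sampler standing in for HMC (the assumption being that any MCMC kernel with the right stationary distribution serves the verifier). The target is the product of a node function N(x; 0, 1) and a message N(x; 1, 1), whose normalized form is N(x; 0.5, 0.5):

```python
import numpy as np

def metropolis(log_target, x0, n_steps, step=0.5, rng=None):
    # Random-walk Metropolis on an unnormalized log-density (HMC stand-in).
    if rng is None:
        rng = np.random.default_rng(1)
    x, lp = x0, log_target(x0)
    samples = []
    for _ in range(n_steps):
        prop = x + step * rng.normal()
        lp_prop = log_target(prop)
        if np.log(rng.uniform()) < lp_prop - lp:  # accept/reject
            x, lp = prop, lp_prop
        samples.append(x)
    return np.array(samples)

# Unnormalized tilde_f: N(x; 0, 1) * N(x; 1, 1), normalized form N(x; 0.5, 0.5).
log_target = lambda x: -0.5 * x**2 - 0.5 * (x - 1.0)**2

samples = metropolis(log_target, x0=0.0, n_steps=100_000)[5_000:]  # drop burn-in
assert abs(np.mean(samples) - 0.5) < 0.05  # matches analytical mean
assert abs(np.var(samples) - 0.5) < 0.05   # matches analytical variance
```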
As for symbolic regression, we might try reducing the node function to be bivariate, perhaps by defining (x = [messages in1, messages in2, ...], y = messages out) in an attempt to avoid (x1 = messages in, x2 = messages in, ..., xn = messages in, y = messages out), but for that we might need more functionality than is available in SymbolicRegression.jl. I don't know enough about SR to have an in-depth discussion, but it could work.
Why would we even try changing the MCMC behavior?
Because I do not know how to change MCMC behaviour by imposing constraints. Is it possible in any way?
I don't think imposing a stationarity condition is the right way for the message computation. For marginal computations it certainly is, but for message computations I don't think so. Or did you mean fixed-point conditions, @Nimrais <https://github.com/Nimrais>?
@murphyk <https://github.com/murphyk> If I understand you correctly, you propose to build a dictionary of results with sparse cubature methods and then use these dictionaries to create a symbolic regression problem? I think it is worth a shot. At least we can try it on a small-scale 1d problem to test.
-
Thanks for the discussion! Just to clarify: I do care about message correctness, but I want to verify it through posterior correctness rather than direct mathematical verification.
-
FWIW, a tutorial on SR: https://symbolicregression2025.github.io/
…On Wed, Sep 3, 2025 at 10:45 AM Albert wrote:
Thanks for the discussion! Just to clarify: I do care about message
correctness, but I want to verify it through posterior correctness rather
than direct mathematical verification.
My approach: LLM generates rules → compile in RxInfer → compare posteriors
with Turing.jl ground truth → iterate with feedback until they match.
Working on a small demo in the next few days to show this in action. I
welcome different verification approaches - excited to see what everyone
comes up with! 🚀
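The generate → compile → compare → iterate loop above might be skeletonized as follows (a hedged Python sketch; `llm_generate_rule` and `verify` are hypothetical stubs, standing in for a real LLM call and the RxInfer/Turing.jl posterior comparison respectively):

```python
# Hypothetical skeleton of the generate -> verify -> feedback loop.
def llm_generate_rule(feedback=None):
    # Stub: "derives" a posterior-mean rule; a real system would call an LLM
    # and include any verifier feedback in the prompt.
    if feedback is None:
        return lambda m1, m2: m1                 # wrong first attempt
    return lambda m1, m2: 0.5 * (m1 + m2)        # corrected attempt

def verify(rule, ground_truth_mean, m1, m2, tol=1e-6):
    # Stub verifier: compare the rule's output to a precomputed ground truth.
    return abs(rule(m1, m2) - ground_truth_mean) < tol

feedback, rule = None, None
for attempt in range(5):
    rule = llm_generate_rule(feedback)
    if verify(rule, ground_truth_mean=0.5, m1=0.0, m2=1.0):
        break
    feedback = "posterior mean mismatch"  # fed back into the next generation

assert verify(rule, 0.5, 0.0, 1.0)
```

The loop terminates as soon as the verifier accepts, mirroring the "iterate with feedback until they match" step.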
-
Hi @albertpod, I like the idea.
-
https://arxiv.org/abs/2302.06675
Vaguely related - LLM-powered program search discovered the Lion optimizer.
…On Thu, Sep 4, 2025 at 9:49 AM Mohit wrote:
Hi @albertpod <https://github.com/albertpod>, I like the idea.
I would like to work on Mathematical AI and LLMs, or probabilistic programming.
Background - Mathematics and Computer Science (worked on Bayesian methods, contributed to NumPyro)
-
RxInfer.jl has a performance bottleneck that's been bugging us for a while:
When we have analytical update rules → Lightning-fast inference
When we don't have those rules → We fall back to slower methods
Specifically, for models without pre-derived update rules, we currently rely on the projection method.
The fallback works, but it comes with significant speed penalties that limit scalability. Essentially, we're leaving performance on the table for entire classes of probabilistic models simply because the math hasn't been worked out yet.
A Wild Idea
Inspired by the AlphaEvolve paper and recent discussions with @murphyk, we can attempt to use LLM agents for mathematical derivation. The key: we have verification methods to ensure correctness, so it's not just "trust the AI."
The Vision
Timeline & Collaboration
Starting mid/end-September. Looking for collaborators interested in:
Drop a comment with your background! Let's make probabilistic programming faster for everyone.