
Bump DifferentiationInterface to 0.7 #922

Merged
6 commits merged into main from py/di-0.7 on May 19, 2025

Conversation

@penelopeysm (Member) commented May 16, 2025

DifferentiationInterface 0.7 enables a strictness check by default (see JuliaDiff/DifferentiationInterface.jl#799). This introduces a requirement that the function being differentiated must have the same type at the preparation and execution stages (i.e., prepare_gradient and value_and_gradient must be called with the same function).
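
To see the new requirement in isolation, here is a minimal sketch (a toy function, not DynamicPPL code) of the failure mode:

import DifferentiationInterface as DI
using ADTypes: AutoForwardDiff
import ForwardDiff

x = [1.0, 2.0]
backend = AutoForwardDiff()

# Preparation records the function being differentiated; under DI 0.7's
# default strictness check, execution must see the same function type.
prep = DI.prepare_gradient(y -> sum(abs2, y), backend, x)

# A second, syntactically identical anonymous function has a different
# gensym'd type, so this call fails the strictness check:
DI.value_and_gradient(y -> sum(abs2, y), prep, backend, x)  # errors under DI 0.7

# Reusing the very same callable is fine:
f = y -> sum(abs2, y)
prep2 = DI.prepare_gradient(f, backend, x)
DI.value_and_gradient(f, prep2, backend, x)  # returns (5.0, [2.0, 4.0])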

This broke some code inside DynamicPPL: specifically, the case where the function being differentiated was a closure.

To illustrate: when use_closure(adtype) is true, the function being prepared in the snippet below is an anonymous function of type DynamicPPL.var"#1#2"{Model, VarInfo, Context}.

if use_closure(adtype)
    prep = DI.prepare_gradient(
        x -> logdensity_at(x, model, varinfo, context), adtype, x
    )

This preparation is later used in logdensity_and_gradient, shown below, and there are two problems:

function LogDensityProblems.logdensity_and_gradient(
    f::LogDensityFunction{M,V,C,AD}, x::AbstractVector
) where {M,V,C,AD<:ADTypes.AbstractADType}
    f.prep === nothing &&
        error("Gradient preparation not available; this should not happen")
    x = map(identity, x) # Concretise type
    # Make branching statically inferrable, i.e. type-stable (even if the two
    # branches happen to return different types)
    return if use_closure(f.adtype)
        DI.value_and_gradient(
            x -> logdensity_at(x, f.model, f.varinfo, f.context), f.prep, f.adtype, x
        )
    else

Firstly, this closure must close over the LogDensityFunction passed as the argument to logdensity_and_gradient, so it gets a type that looks like DynamicPPL.var"#3#4"{LogDensityFunction}. (This can be seen in the failing CI runs.)

But secondly, even if you were to hoist this closure out of logdensity_and_gradient and construct it separately, it would still get a different gensym'd name, so its type would be something like DynamicPPL.var"#5#6"{Model, VarInfo, Context}, which still does not match the type seen at preparation.
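
This is ordinary Julia behaviour, easy to check at the REPL: two anonymous functions with identical bodies still get distinct gensym'd types:

julia> f = x -> x + 1;

julia> g = x -> x + 1;

julia> typeof(f) == typeof(g)  # identical bodies, distinct gensym'd types
false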

This PR resolves this issue by constructing the closure itself at preparation time, and then storing it inside the LogDensityFunction, so that it doesn't need to be reconstructed. This also enables DynamicPPL to work with DI 0.7, without having to disable the strictness check.
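
Schematically, the fix looks like the following sketch (struct and field names are illustrative, not the exact DynamicPPL code; logdensity_at is as above):

import DifferentiationInterface as DI

# Illustrative only: build the closure once, prepare with it, and store
# both, so that execution reuses the identical callable.
struct PreparedLogDensity{F,P,AD}
    f::F        # closure built at preparation time
    prep::P     # DI preparation object for that exact closure
    adtype::AD
end

function PreparedLogDensity(model, varinfo, context, adtype, x)
    f = x -> logdensity_at(x, model, varinfo, context)
    prep = DI.prepare_gradient(f, adtype, x)
    return PreparedLogDensity(f, prep, adtype)
end

# Execution reuses ld.f, whose type matches what was prepared.
logdensity_and_gradient(ld::PreparedLogDensity, x::AbstractVector) =
    DI.value_and_gradient(ld.f, ld.prep, ld.adtype, x)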

github-actions bot (Contributor) commented May 16, 2025

Benchmark Report for Commit 7d75aa0

Computer Information

Julia Version 1.11.5
Commit 760b2e5b739 (2025-04-14 06:53 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 4 × AMD EPYC 7763 64-Core Processor
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)

Benchmark Results

|                 Model | Dimension |  AD Backend |      VarInfo Type | Linked | Eval Time / Ref Time | AD Time / Eval Time |
|-----------------------|-----------|-------------|-------------------|--------|----------------------|---------------------|
| Simple assume observe |         1 | forwarddiff |             typed |  false |                  9.9 |                 1.5 |
|           Smorgasbord |       201 | forwarddiff |             typed |  false |                634.2 |                42.5 |
|           Smorgasbord |       201 | forwarddiff | simple_namedtuple |   true |                421.0 |                48.3 |
|           Smorgasbord |       201 | forwarddiff |           untyped |   true |               1205.5 |                29.0 |
|           Smorgasbord |       201 | forwarddiff |       simple_dict |   true |               3575.0 |                22.9 |
|           Smorgasbord |       201 | reversediff |             typed |   true |               1469.3 |                29.6 |
|           Smorgasbord |       201 |    mooncake |             typed |   true |                967.6 |                 5.2 |
|    Loop univariate 1k |      1000 |    mooncake |             typed |   true |               5513.5 |                 4.0 |
|       Multivariate 1k |      1000 |    mooncake |             typed |   true |               1061.4 |                 8.6 |
|   Loop univariate 10k |     10000 |    mooncake |             typed |   true |              60158.5 |                 3.7 |
|      Multivariate 10k |     10000 |    mooncake |             typed |   true |               9236.8 |                 9.3 |
|               Dynamic |        10 |    mooncake |             typed |   true |                144.5 |                12.4 |
|              Submodel |         1 |    mooncake |             typed |   true |                 13.6 |                 6.5 |
|                   LDA |        12 | reversediff |             typed |   true |                522.7 |                 5.1 |

codecov bot commented May 16, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 85.11%. Comparing base (27d4378) to head (7d75aa0).
Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #922   +/-   ##
=======================================
  Coverage   85.10%   85.11%           
=======================================
  Files          36       36           
  Lines        3954     3956    +2     
=======================================
+ Hits         3365     3367    +2     
  Misses        589      589           

coveralls commented May 16, 2025

Pull Request Test Coverage Report for Build 15112459001

Details

  • 0 of 4 (0.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.007%) to 85.197%

| Changes Missing Coverage  | Covered Lines | Changed/Added Lines |    % |
|---------------------------|---------------|---------------------|------|
| src/logdensityfunction.jl |             0 |                   4 | 0.0% |

Totals Coverage Status
  • Change from base Build 15110064199: 0.007%
  • Covered Lines: 3367
  • Relevant Lines: 3952


@penelopeysm penelopeysm changed the title DI 0.7 Bump DifferentiationInterface to 0.7 May 18, 2025
@penelopeysm penelopeysm marked this pull request as ready for review May 18, 2025 00:19

DynamicPPL.jl documentation for PR #922 is available at:
https://TuringLang.github.io/DynamicPPL.jl/previews/PR922/

@gdalle (Contributor) commented May 18, 2025

Nice catch! Hope the debugging wasn't too much trouble; I really think this is a welcome addition to DI.

@gdalle (Contributor) commented May 18, 2025

If it is any easier, an alternative would be to replace the closure with a callable struct, where you control what is closed over. With only one argument this would be Base.Fix2; with more, it can look like
https://github.com/JuliaDiff/DifferentiationInterface.jl/blob/6a58124902a9cc3e623014db28eacd4b56ba4434/DifferentiationInterface/src/utils/context.jl#L160-L170
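
For instance, a sketch of that idea in DynamicPPL's terms (the struct name and fields are illustrative, not actual DynamicPPL code):

# The struct's type is fixed at definition time, so preparation and
# execution always see the same callable type; no gensym is involved.
struct LogDensityAt{M,V,C}
    model::M
    varinfo::V
    context::C
end

# Making the struct callable replaces the anonymous closure.
(ld::LogDensityAt)(x) = logdensity_at(x, ld.model, ld.varinfo, ld.context)

# prep = DI.prepare_gradient(LogDensityAt(model, varinfo, context), adtype, x)
# DI.value_and_gradient(LogDensityAt(model, varinfo, context), prep, adtype, x)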

@penelopeysm (Member, Author) commented May 19, 2025

@gdalle Thanks for the suggestion! I benchmarked it and the callable struct is actually a bit faster, so that's a win.

(Results)

results_closure.csv
results_struct.csv

(To reproduce: 6f670c4 is the commit that introduces the struct; run the script below on the commits immediately before and after it.)

using DynamicPPL, Distributions, ForwardDiff, ReverseDiff, Mooncake, Enzyme, ADTypes, DataFrames, CSV
using DynamicPPL.TestUtils.AD: run_ad, ADResult

# AD backends to benchmark against each demo model.
ADTYPES = Dict(
    "ForwardDiff" => AutoForwardDiff(),
    "ReverseDiff" => AutoReverseDiff(; compile=false),
    "ReverseDiffCompiled" => AutoReverseDiff(; compile=true),
    "Mooncake" => AutoMooncake(; config=nothing),
    "EnzymeForward" => AutoEnzyme(; mode=set_runtime_activity(Forward, true)),
    "EnzymeReverse" => AutoEnzyme(; mode=set_runtime_activity(Reverse, true)),
)
MODELS = DynamicPPL.TestUtils.DEMO_MODELS

# Benchmark every (model, adtype) pair, recording gradient time relative
# to the primal (un-differentiated) model evaluation time.
RESULTS = []
for model in MODELS
    for (adtype_name, adtype) in ADTYPES
        result = run_ad(model, adtype; benchmark=true)
        @show result.time_vs_primal
        push!(RESULTS, (model="$(model.f)", adtype=adtype_name, time=result.time_vs_primal))
    end
end
RESULT_DF = DataFrame(RESULTS)
CSV.write("results.csv", RESULT_DF)

@penelopeysm penelopeysm requested a review from sunxd3 May 19, 2025 11:13
@sunxd3 (Member) left a comment

Very good, no comments

@penelopeysm penelopeysm added this pull request to the merge queue May 19, 2025
@penelopeysm penelopeysm removed this pull request from the merge queue due to a manual request May 19, 2025
@penelopeysm penelopeysm added this pull request to the merge queue May 19, 2025
@gdalle (Contributor) commented May 19, 2025

Do the type parameters increase performance even more, by removing type instabilities?

@penelopeysm (Member, Author) replied

Nope, it's pretty much the same as before (t(struct with parameters) / t(struct without parameters) = 1.004 ± 0.094)
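
For context, a sketch of the two variants compared (names illustrative, not the actual DynamicPPL code):

# With type parameters: field types are concrete, so field access
# is statically inferrable.
struct LogDensityAtTyped{M,V,C}
    model::M
    varinfo::V
    context::C
end

# Without type parameters: fields are abstractly typed (Any), so field
# access involves dynamic dispatch.
struct LogDensityAtUntyped
    model
    varinfo
    context
end

That the measured ratio stays near 1 suggests the AD call itself already acts as a function barrier here, so the abstractly typed fields carry no measurable cost.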

Merged via the queue into main with commit 072234d May 19, 2025
17 of 18 checks passed
@penelopeysm penelopeysm deleted the py/di-0.7 branch May 19, 2025 13:18