Skip to content

Add derivatives for besselix, besseljx, and besselyx #350

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Oct 18, 2021

Conversation

devmotion
Copy link
Member

This PR adds derivatives for the non-holomorphic functions besselix, besseljx, and besselyx.

@codecov
Copy link

codecov bot commented Sep 29, 2021

Codecov Report

Merging #350 (a9f9f52) into master (0c41812) will increase coverage by 0.19%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #350      +/-   ##
==========================================
+ Coverage   92.75%   92.94%   +0.19%     
==========================================
  Files          12       12              
  Lines        2706     2765      +59     
==========================================
+ Hits         2510     2570      +60     
+ Misses        196      195       -1     
Flag Coverage Δ
unittests 92.94% <100.00%> (+0.19%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/chainrules.jl 100.00% <100.00%> (ø)
src/beta_inc.jl 90.37% <0.00%> (-0.31%) ⬇️
src/gamma.jl 94.90% <0.00%> (+0.08%) ⬆️
src/erf.jl 100.00% <0.00%> (+2.04%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0c41812...a9f9f52. Read the comment docs.

@devmotion
Copy link
Member Author

Since it's the first time I implement complex derivatives of non-holomorphic functions it might be good if someone more experienced could have a look at the PR. Is there anything that could be improved @oxinabox? It seems the real(conj(..) * ...) could be optimized in the same way as ChainRules._realconjtimes. I was wondering if one could move https://github.com/JuliaDiff/ChainRules.jl/blob/master/src/rulesets/Base/utils.jl to ChainRulesCore and make realconjtimes and imagconjtimes available as part of the official API.

@oxinabox
Copy link
Contributor

oxinabox commented Oct 1, 2021

Since it's the first time I implement complex derivatives of non-holomorphic functions it might be good if someone more experienced could have a look at the PR.

@sethaxen perhaps?

@sethaxen
Copy link
Contributor

sethaxen commented Oct 1, 2021

Great! I'll review tonight or tomorrow.

@devmotion
Copy link
Member Author

I was wondering if one could move https://github.com/JuliaDiff/ChainRules.jl/blob/master/src/rulesets/Base/utils.jl to ChainRulesCore and make realconjtimes and imagconjtimes available as part of the official API.

Do you think I should open an issue or PR in ChainRulesCore regarding this question?

@sethaxen
Copy link
Contributor

sethaxen commented Oct 1, 2021

I was wondering if one could move https://github.com/JuliaDiff/ChainRules.jl/blob/master/src/rulesets/Base/utils.jl to ChainRulesCore and make realconjtimes and imagconjtimes available as part of the official API.

Do you think I should open an issue or PR in ChainRulesCore regarding this question?

Yes, I think that's a good idea.

@devmotion
Copy link
Member Author

Yes, I think that's a good idea.

I made a PR: JuliaDiff/ChainRulesCore.jl#474

Copy link
Contributor

@sethaxen sethaxen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good. Most of my comments on besselix apply also to besselyx and besseljx.

project_x = ChainRulesCore.ProjectTo(x)
function besselix_pullback(ΔΩ)
ν̄ = ChainRulesCore.@not_implemented(BESSEL_ORDER_INFO)
a = (besselix(ν - 1, x) + besselix(ν + 1, x)) / 2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One can use the recurrence relations to write this as (besselix(ν ± 1, x) ± besselix(ν, x)) * ν / x, which allows to reuse Ω. It would just require some special-casing to handle x=0 gracefully.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just reused the derivatives that are used for besseli etc. They were defined in this way in ChainRules originally. When I copied them to SpecialFunctions I wondered about the motivation for choosing https://functions.wolfram.com/Bessel-TypeFunctions/BesselI/20/01/02/0003/ over https://functions.wolfram.com/Bessel-TypeFunctions/BesselI/20/01/02/0001/ or https://functions.wolfram.com/Bessel-TypeFunctions/BesselI/20/01/02/0002/. I assumed the currently used definition is simpler since it does not require to handle x = 0 in a special way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be best to use the same relations for besselix etc. as for besseli etc. and, if desired, change them to a different form in a separate PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right, and they're the same in DiffRules as well. If you like, you can keep them similar to besseli, etc for now, and a future PR could update all of the rules.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree!

ν̄ = ChainRulesCore.@not_implemented(BESSEL_ORDER_INFO)
a = (besselix(ν - 1, x) + besselix(ν + 1, x)) / 2
b = - sign(real(x)) * Ω
x̄ = project_x(conj(a) * ΔΩ + real(conj(b) * ΔΩ))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's more efficient to compute -sign(real(x)) * real(conj(Ω) * ΔΩ)) because you're multiplying then 2 reals instead of a real and a complex.

The @scalar_rule macro writes rules to use muladd here. @oxinabox does that make a difference?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have never measured

return Ω, besselix_pullback
end

function ChainRulesCore.frule((_, _, _), ::typeof(besseljx), ν::Number, x::Number)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since everything should be identical for besseljx and besselyx, can you declare them the same with a loop and an @eval?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could but so far the style in this file is to implement everything explicitly (eg. also derivatives besselj and bessely are implemented separately) and so I sticked to it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.

ν̄ = ChainRulesCore.@not_implemented(BESSEL_ORDER_INFO)
a = (besseljx(ν - 1, x) - besseljx(ν + 1, x)) / 2
x̄ = if x isa Real
project_x(conj(a) * ΔΩ)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If x is real, then so is a:

Suggested change
project_x(conj(a) * ΔΩ)
project_x(a * ΔΩ)

Comment on lines 198 to 201
function ChainRulesCore.frule((_, _, _), ::typeof(besselix), ν::Number, x::Number)
Ω = besselix(ν, x)
ΔΩ = ChainRulesCore.@not_implemented(BESSEL_ORDER_INFO)
return Ω, ΔΩ
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My earlier comment might have gotten lost, but since ZeroTangent() * ::NotImplemented is a ZeroTangent, you can implement the frule in such a way that if the AD provides ZeroTangent() for Δν, then the frule does not return a NotImplemented:

Suggested change
function ChainRulesCore.frule((_, _, _), ::typeof(besselix), ν::Number, x::Number)
Ω = besselix(ν, x)
ΔΩ = ChainRulesCore.@not_implemented(BESSEL_ORDER_INFO)
return Ω, ΔΩ
function ChainRulesCore.frule((_, Δν, Δx), ::typeof(besselix), ν::Number, x::Number)
Ω = besselix(ν, x)
∂Ω_∂ν = ChainRulesCore.@not_implemented(BESSEL_ORDER_INFO)
a = (besselix- 1, x) + besselix+ 1, x)) / 2
∂Ω = muladd(a, Δx, muladd(-sign(real(x)) * real(Δx), Ω, ∂Ω_∂ν * Δν)
return Ω, ∂Ω

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, I completely forgot that ZeroTangent() * ::NotImplemented = ZeroTangent(). I'll fix the forward rules!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that currently it is not possible to test such definitions: JuliaDiff/ChainRulesCore.jl#477

Comment on lines 205 to 207
ΔΩ = if ∂Ω_∂ν_Δν isa ChainRulesCore.NotImplemented
∂Ω_∂ν_Δν
else
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if this optimization is useful enough to justify the more verbose and less readable implementation. The optimization is also not performed when the derivative is defined with @scalar_rule.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I don't think it's necessary. Odds are when this happens it will be because a user actually tried to differentiate wrt the order, and they will instantly realize that's not possible. Ideally the compiler would realize a is unused and not compute it, but that doesn't seem to be the case.

Including it doesn't hurt though. You could always include in a comment the simpler expression so it's easier to follow. We do this in a few places in ChainRules.

Comment on lines 208 to 213
a = (besselix(ν - 1, x) + besselix(ν + 1, x)) / 2
if Δx isa Real
muladd(muladd(-sign(real(x)), Ω, a), Δx, ∂Ω_∂ν_Δν)
else
muladd(a, Δx, muladd(-sign(real(x)) * Ω, real(Δx), ∂Ω_∂ν_Δν))
end
Copy link
Member Author

@devmotion devmotion Oct 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can't be tested currently and requires JuliaDiff/ChainRulesCore.jl#477 (currently this branch is not reached with neither the default nor the NoTangent() tangent, and finite differencing yields different values (and errors with complex numbers) if a ZeroTangent is specified since then nu is not ignored in finite differencing). With the CRC PR all definitions are tested and tests pass locally.

@devmotion devmotion closed this Oct 3, 2021
@devmotion devmotion reopened this Oct 3, 2021
@devmotion
Copy link
Member Author

@sethaxen I think all comments should be addressed. Since JuliaDiff/ChainRulesCore.jl#477 is merged and released also the frules are tested properly now. The tests could be simplified (only one line for frule) if JuliaDiff/ChainRulesTestUtils.jl#222 is addressed; I guess this does not have to block this PR though.

Copy link
Contributor

@sethaxen sethaxen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple minor tweaks to multiply reals before complexes, then this looks ready to merge.

Co-authored-by: Seth Axen <seth.axen@gmail.com>
Copy link
Contributor

@sethaxen sethaxen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@devmotion
Copy link
Member Author

Bump 🙂

@devmotion
Copy link
Member Author

Some days ago I was added to JuliaMath, so I am able to merge the PR now. I'll wait until beginning of next week, and if nobody objects I will merge the PR then since it was approved by @sethaxen.

@devmotion devmotion merged commit c4b0b83 into JuliaMath:master Oct 18, 2021
@devmotion devmotion deleted the dw/besselx branch October 18, 2021 14:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants