Mooncake less efficient than Zygote or Enzyme on Flux layers #466

@m-laprise

(This is a follow-up to a Slack thread, cc @willtebbutt & @gdalle)

I have been comparing the performance of different autodiff backends for training Flux models. In most cases, Mooncake is about an order of magnitude slower than Enzyme and even Zygote. This may be due to the Fluxperimental Moonduo implementation (https://github.com/FluxML/Fluxperimental.jl/blob/master/ext/FluxMooncakeExt.jl) rather than to Mooncake itself. Here is an MWE in case it is useful.

The example is a simple case of two standard feed-forward layers. The benchmarks below were run with Flux v0.15.2, Zygote v0.6.75 (constrained from updating further by Fluxperimental), Enzyme v0.13.30, and Mooncake v0.4.83, on Julia 1.10.5.

using Flux
using BenchmarkTools

# Create random inputs and targets
const MINIBATCHSIZE = 64
X = rand(Float32, 100, MINIBATCHSIZE)
Y = rand(Float32, 20, MINIBATCHSIZE)

# Create trivial Flux NN and loss 
model = Chain(Dense(100, 50, relu), 
              Dense(50, 20))
myloss(m, x, y) = Flux.mse(m(x), y)

# Compare gradient timings with @btime (restarting the session for each example):

using Zygote
@btime loss, grads = Flux.withgradient($myloss, $model, $X, $Y)
# 81.875 μs (87 allocations: 126.46 KiB)

using Enzyme
@btime loss, grads = Flux.withgradient($myloss, $Duplicated(model), $X, $Y)
# 82.875 μs (129 allocations: 84.27 KiB)

using Fluxperimental, Mooncake
@btime loss, grads = Flux.withgradient($myloss, $Moonduo(model), $Moonduo(X), $Moonduo(Y))
# 837.625 μs (16045 allocations: 1.89 MiB)

fclosure(m) = myloss(m, X, Y)
@btime loss, grads = Flux.withgradient($fclosure, $Moonduo(model))
# 919.000 μs (16048 allocations: 1.89 MiB)
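
One way to check whether the overhead comes from the Fluxperimental wrapper or from Mooncake itself would be to call Mooncake directly and build the rule once, outside the timed region. The following is only a sketch: it assumes the `Mooncake.build_rrule` and `Mooncake.value_and_gradient!!` entry points described in the Mooncake README for the v0.4 series, and it has not been run against the exact versions listed above.

```julia
using Flux
using BenchmarkTools
import Mooncake

# Same setup as in the MWE above.
X = rand(Float32, 100, 64)
Y = rand(Float32, 20, 64)
model = Chain(Dense(100, 50, relu), Dense(50, 20))
myloss(m, x, y) = Flux.mse(m(x), y)

# Build the reverse-mode rule once so that rule construction is not timed.
# (Assumption: this is the README-level API in Mooncake v0.4.x.)
rule = Mooncake.build_rrule(myloss, model, X, Y)

# Expected to return the loss and a tuple of gradients with respect to
# the function itself and each of its arguments.
loss_val, grads = Mooncake.value_and_gradient!!(rule, myloss, model, X, Y)

# Time only the gradient call, with every input interpolated.
@btime Mooncake.value_and_gradient!!($rule, $myloss, $model, $X, $Y)
```

If this direct call lands close to the Zygote and Enzyme timings, the gap is more likely in the wrapper than in Mooncake. Note also that `$Moonduo(model)` inside `@btime` interpolates only the constructor, so building the wrapper is part of what gets timed in the benchmarks above; writing `$(Moonduo(model))` would exclude it.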

Labels: enhancement (performance)
