Moving GNNGraph from GPU to CPU breaks Zygote gradient computation #597

@ivanightingale

Description

using Graphs
using GraphNeuralNetworks
using Flux
using CUDA

device = gpu_device()
g = GNNGraph(path_graph(3); ndata=(; x=rand(Float32, 2, 3))) |> device
model = Dense(2 => 1) |> device
grad = Flux.gradient(model) do model
    out = model(g.ndata.x)
    g_cpu = g |> cpu  # moving the graph back to the CPU inside the gradient triggers the error
    sum(out)
end

results in

LoadError: Need an adjoint for constructor Base.Generator{Tuple{}, Functors.var"#18#20"{Symbol}}. Gradient is of type Tuple{}

along with a 112-frame stack trace that points to |> cpu.

I believe that, in general Flux.jl usage, moving data from GPU to CPU while computing gradients works fine most of the time. So unless I did something wrong here, perhaps a fix could be considered.

EDIT: I'm checking whether Flux.@ignore_derivatives works as a workaround.
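
For reference, this is a minimal sketch of the workaround I'm testing, assuming the CPU copy itself does not need to be differentiated (only the GPU forward pass contributes to the loss):

grad = Flux.gradient(model) do model
    out = model(g.ndata.x)
    # hide the GPU -> CPU transfer from Zygote, so no adjoint for `cpu` is needed
    g_cpu = Flux.@ignore_derivatives(cpu(g))
    sum(out)
end

If the loss depended on g_cpu this would silently drop those gradients, so it only helps when the CPU-side work is gradient-free.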

Versions:

  • Julia: v1.11.3
  • GraphNeuralNetworks.jl: v1.0.0
  • Flux.jl: v0.16.3
  • Zygote.jl: v0.7.6
  • CUDA.jl: v5.7.2

P.S. Motivation: I have an application that needs to call getgraph, which can currently only be done on the CPU (see #161).
