Using iterative solver inside custom_jvp rule causes error when jax.grad/jax.jacrev is called #30249
---
Summary: details of a toy implementation. The residual function and initial guess:

```python
import jax
import jax.numpy as jnp

def f(x, params):
    x, y, z = x
    a, b, c = params
    f1 = a*x**2 + b*y**2 + c*z**2 - 3
    f2 = a*x**2 + b*y**2 - c*z - 1
    f3 = a*x + b*y + c*z - 3
    return jnp.array([f1, f2, f3])

x = jnp.array([0.2, 0.3, 0.5])  # initial guess
```
```python
@jax.custom_jvp
def nsolve(xi, params):
    err = 0.1
    tol = 1e-6
    iter = 0

    def Ffunc(x):
        return f(x, params)

    def dFfunc(x, dx):
        r, dr = jax.jvp(Ffunc, (x,), (dx,))
        return dr

    while err > tol:
        rhs = f(xi, params)
        # Jmat = jax.jacfwd(f, argnums=0)(xi, params)
        update = jax.scipy.sparse.linalg.gmres(lambda x: dFfunc(xi, x), -rhs)[0]
        xi = xi + update
        err = jnp.linalg.norm(update)
        iter = iter + 1
        jax.debug.print("err={}", err)
    return xi
```

I define a custom_jvp that uses the implicit function theorem to solve a linear system for the tangent of the converged solution x:

```python
@nsolve.defjvp
def nsolve_jvp(primals, tangents):
    xi, params = primals
    dxi, dparams = tangents
    xf = nsolve(xi, params)

    def Ffunc(x):
        return f(x, params)

    Jmat = jax.jacfwd(Ffunc)(xf)

    def dFfunc(x, dx):
        # r, dr = jax.jvp(Ffunc, (x,), (dx,))
        Jmat = jax.jacfwd(Ffunc)(x)
        dr = Jmat @ dx
        return dr

    def Floc(p):
        return f(xf, p)

    res0, jvp_res0 = jax.jvp(Floc, (params,), (dparams,))
    # tout = jax.scipy.sparse.linalg.gmres(lambda x: dFfunc(xf, x), -jvp_res0)[0]  # error
    tout = jax.scipy.sparse.linalg.gmres(Jmat, -jvp_res0)[0]  # error
    # tout = jnp.linalg.solve(Jmat, -jvp_res0)  # works
    primal_out = xf
    tangent_out = tout
    return (primal_out, tangent_out)
```

Calling `jax.jacfwd(nsolve, argnums=1)(x, p)` works, but `jax.grad`/`jax.jacrev` fails with `Traceback (most recent call last): …` (traceback truncated).
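For reference, the linear system this JVP rule solves comes from differentiating the converged residual, i.e. the standard implicit function theorem (notation mine, not from the thread):

```latex
F(x^{*}(\theta), \theta) = 0
\;\Longrightarrow\;
\frac{\partial F}{\partial x}\,\mathrm{d}x^{*} + \frac{\partial F}{\partial \theta}\,\mathrm{d}\theta = 0
\;\Longrightarrow\;
\mathrm{d}x^{*} = -\Big(\frac{\partial F}{\partial x}\Big)^{-1}\frac{\partial F}{\partial \theta}\,\mathrm{d}\theta .
```

Here ∂F/∂x is `Jmat` and (∂F/∂θ) dθ is `jvp_res0`, which is why the tangent is the solution of `Jmat @ tout = -jvp_res0`.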
---
You're encountering this error because you're using `jax.custom_jvp`, which only defines a forward-mode rule. When you call `jax.grad` or `jax.jacrev`, JAX needs a reverse-mode (VJP) rule. Replace your `custom_jvp` with a `custom_vjp`:

```python
@jax.custom_vjp
def nsolve(xi, params):
    # Your existing solver code
    err = 0.1
    tol = 1e-6
    iter = 0

    def Ffunc(x):
        return f(x, params)

    def dFfunc(x, dx):
        r, dr = jax.jvp(Ffunc, (x,), (dx,))
        return dr

    while err > tol:
        rhs = f(xi, params)
        update = jax.scipy.sparse.linalg.gmres(lambda x: dFfunc(xi, x), -rhs)[0]
        xi = xi + update
        err = jnp.linalg.norm(update)
        iter = iter + 1
    return xi

def nsolve_fwd(xi, params):
    xf = nsolve(xi, params)
    return xf, (xf, params)

def nsolve_bwd(res, g):
    xf, params = res

    def Ffunc(x):
        return f(x, params)

    Jmat = jax.jacfwd(Ffunc)(xf)
    # Solve J^T @ lambda = g for lambda
    lambda_val = jnp.linalg.solve(Jmat.T, g)

    # Compute VJP w.r.t. params
    def Floc(p):
        return f(xf, p)

    _, vjp_fun = jax.vjp(Floc, params)
    g_params = vjp_fun(-lambda_val)[0]
    # VJP w.r.t. the initial guess is zero for a converged solution
    # (note: zeros_like(xf), since xi is not saved in the residuals)
    g_xi = jnp.zeros_like(xf)
    return (g_xi, g_params)

nsolve.defvjp(nsolve_fwd, nsolve_bwd)
```

For the implicit function theorem, if you have F(x*(p), p) = 0 at the converged solution, the adjoint satisfies Jᵀλ = g, and the parameter cotangent is the VJP of F through its second argument. If you want to support both forward and reverse mode efficiently, you could define both a JVP and a VJP rule. The reason your original code worked with `jax.jacfwd` is that forward mode only needs the JVP rule you already defined.
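To illustrate how this `custom_vjp` pattern is used, here is a self-contained sketch on a simpler root-find (my own toy problem `F(x, p) = x**3 - p`, not the thread's `f`), where the exact answer `x* = p**(1/3)` lets the reverse-mode gradient be checked by hand:

```python
import jax
import jax.numpy as jnp

# Toy residual of my own: elementwise root of x**3 - p, so x* = p**(1/3).
def F(x, p):
    return x**3 - p

@jax.custom_vjp
def newton_solve(x0, p):
    x = x0
    for _ in range(50):                  # fixed iteration count keeps it traceable
        x = x - F(x, p) / (3.0 * x**2)   # Newton step; dF/dx = 3 x**2
    return x

def newton_fwd(x0, p):
    xf = newton_solve(x0, p)
    return xf, (xf, p)

def newton_bwd(res, g):
    xf, p = res
    lam = g / (3.0 * xf**2)              # solves J^T lam = g (J is diagonal here)
    _, vjp_p = jax.vjp(lambda q: F(xf, q), p)
    g_p = vjp_p(-lam)[0]                 # cotangent w.r.t. p via the IFT
    g_x0 = jnp.zeros_like(xf)            # converged root is independent of the guess
    return (g_x0, g_p)

newton_solve.defvjp(newton_fwd, newton_bwd)

p = jnp.array([8.0])
x0 = jnp.array([1.5])
xf = newton_solve(x0, p)                                # root: 2.0
dp = jax.grad(lambda q: newton_solve(x0, q).sum())(p)   # d(p**(1/3))/dp at 8 = 1/12
```

With `p = 8` the root is 2 and the analytic derivative is 1/12, so `jax.grad` through `newton_bwd` can be checked directly.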
---
Thanks for getting back to me. The JAX documentation says that custom_jvp works for both forward and reverse mode: "Even though we defined only a JVP rule and no VJP rule, we can use both forward- and reverse-mode differentiation on f. JAX will automatically transpose the linear computation on tangent values from our custom JVP rule, computing the VJP as efficiently as if we had written the rule by hand." https://docs.jax.dev/en/latest/notebooks/Custom_derivative_rules_for_Python_code.html Also, in my example, reverse mode fails only if I compute the tangent vector with an iterative solver; it works if I use a direct matrix inversion.
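That automatic-transposition behaviour can be checked in isolation. A minimal sketch of my own (the names `A` and `solve_fn` are illustrative, not from the thread): a `custom_jvp` whose tangent rule uses a direct solve, differentiated in reverse mode.

```python
import jax
import jax.numpy as jnp

# Illustrative fixed matrix (my example, not from the thread).
A = jnp.array([[3.0, 1.0],
               [1.0, 2.0]])

@jax.custom_jvp
def solve_fn(b):
    return jnp.linalg.solve(A, b)

@solve_fn.defjvp
def solve_fn_jvp(primals, tangents):
    (b,), (db,) = primals, tangents
    # The tangent rule uses a *direct* solve. JAX can transpose this linear
    # map automatically, so reverse mode works with only a JVP defined.
    return jnp.linalg.solve(A, b), jnp.linalg.solve(A, db)

b = jnp.array([1.0, 1.0])
J = jax.jacrev(solve_fn)(b)   # succeeds; equals inv(A) since solve_fn is linear in b
```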
---
Ah, I think you're right about that, because I see now that the issue you're encountering is a known bug in JAX where `custom_jvp` with `gmres` fails in reverse-mode differentiation. This is documented in issue #5309, which shows the exact same error: `TypeError: Value UndefinedPrimal(ShapedArray(float32[3])) with type <class 'jax._src.interpreters.ad.UndefinedPrimal'> is not a valid JAX type`. `custom_jvp` should support both forward and reverse mode through automatic transposition, but it looks like there's a specific bug with iterative solvers like `gmres`. As noted in the issue: "both forward and reverse-mode work if we replace GMRES with something like np.linalg.solve". So I think the key differe…
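One possible workaround worth mentioning (my suggestion, not from the thread): `jax.lax.custom_linear_solve` exists precisely to make linear solves built from iterative methods transposable, by letting you supply a solver for the transposed system. A minimal sketch with an assumed symmetric matrix `A`, so the same GMRES-based solver can serve as `transpose_solve`:

```python
import jax
import jax.numpy as jnp

# Symmetric test matrix (my own example); symmetry lets one solver do both jobs.
A = jnp.array([[3.0, 1.0],
               [1.0, 2.0]])

def matvec(x):
    return A @ x

def solve(mv, b):
    # Any iterative solver can go here; GMRES is what the thread uses.
    return jax.scipy.sparse.linalg.gmres(mv, b)[0]

def linsolve(b):
    # custom_linear_solve registers the solve as a linear operation with a
    # known transpose, so reverse mode uses transpose_solve instead of
    # failing inside gmres.
    return jax.lax.custom_linear_solve(matvec, b, solve, transpose_solve=solve)

b = jnp.array([1.0, 1.0])
x = linsolve(b)               # solves A x = b iteratively
J = jax.jacrev(linsolve)(b)   # reverse mode works; J is inv(A) since linsolve is linear
```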