
torch_xla scan forces inputs to be differentiable #8783


Open
tengyifei opened this issue Mar 4, 2025 · 1 comment · May be fixed by #9083
Labels: bug (Something isn't working), good first issue (Good for newcomers)

Comments

tengyifei (Collaborator) commented Mar 4, 2025

The snippet

# Make some fake tensors to trace the user function and obtain the
# forward and backward graphs. Note that the init/carry fake tensor
# always requires grad. That's because even if the user passed in some
# `init` that does not require grad, we still want gradients to flow
# through the `carry` from one iteration of the user function to the
# next. In summary, the `carry` argument used to trace a user function
# to get a correct backward pass always requires grad.
def make_fake_tensor(v: torch.Tensor, requires_grad=True) -> torch.Tensor:
  return torch.empty_like(
      v, dtype=v.dtype, device=v.device, requires_grad=requires_grad)
is probably wrong. It sets requires_grad=True on every carry input, and that won't work if one of the carry tensors is a LongTensor.

The most obvious example: if one of the inputs is an integer tensor, it can't possibly have gradients.
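As a concrete illustration (a minimal sketch in plain PyTorch, outside of scan), asking for gradients on an integer tensor raises an error, so make_fake_tensor would fail on such a carry:

import torch

idx = torch.zeros(4, dtype=torch.long)
# Raises: RuntimeError: Only Tensors of floating point and complex dtype
# can require gradients
fake = torch.empty_like(idx, dtype=idx.dtype, device=idx.device, requires_grad=True)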

@tengyifei tengyifei self-assigned this Mar 4, 2025
@ysiraichi ysiraichi added the bug Something isn't working label Mar 5, 2025
tengyifei (Collaborator, Author) commented Mar 17, 2025

After consulting the LLMs, it looks like PyTorch autograd will discard any gradient we return from backward() if the corresponding tensor is not supposed to require gradients (we should write a test for this). That's why the current logic "works". The fix is to set requires_grad=True only when the tensor is a floating-point one. We should also return None for any input tensors that do not require gradients, to keep things clean.
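A minimal sketch of that direction, assuming only the make_fake_tensor helper shown above needs to change (the None-gradient cleanup in the backward pass is not shown):

def make_fake_tensor(v: torch.Tensor, requires_grad=True) -> torch.Tensor:
  # Only floating-point tensors can require grad (complex dtypes can too,
  # but are not considered here). Integer carries such as LongTensors must
  # be traced with requires_grad=False.
  requires_grad = requires_grad and v.is_floating_point()
  return torch.empty_like(
      v, dtype=v.dtype, device=v.device, requires_grad=requires_grad)

With this, a LongTensor carry is traced without gradients instead of raising, while floating-point carries keep the requires_grad=True behavior that lets gradients flow between iterations.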

@tengyifei tengyifei changed the title torch_xla scan forces inputs to have gradients torch_xla scan forces inputs to be differentiable Mar 17, 2025
@tengyifei tengyifei added the good first issue Good for newcomers label Apr 22, 2025
@haifeng-jin haifeng-jin self-assigned this Apr 30, 2025
@haifeng-jin haifeng-jin linked a pull request May 2, 2025 that will close this issue