Hi team,
Thank you for your work on this project! When I run the model with the following command from the repository:

CUDA_VISIBLE_DEVICES=0 python3 main.py --gin_config_file=configs/ml-1m/hstu-sampled-softmax-n128-large-final.gin --master_port=12345

I get the following warning from PyTorch:
Skipping init for ....
/path/to/python3.10/site-packages/torch/autograd/graph.py:824: UserWarning:
fbgemm::dense_to_jagged: an autograd kernel was not registered to the Autograd key(s)
but we are trying to backprop through it. This may lead to silently incorrect behavior.
This behavior is deprecated and will be removed in a future version of PyTorch.
If your operator is differentiable, please ensure you have registered an autograd kernel
to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd).
If your operator is not differentiable, or to squash this warning and use the previous behavior,
please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd.
I also get a similar warning for fbgemm::jagged_to_padded_dense.
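If the goal is simply to silence the warning while keeping PyTorch's previous behavior, the warning's own suggestion can be followed from Python via torch.library. This is only a sketch of a possible workaround, not code from this repository, and it assumes these fbgemm ops are not supposed to define their own backward; if they are, a fallthrough would give silently wrong gradients:

```python
import torch

# Sketch of a workaround (assumption, not from the repo): register a
# fallthrough on the Autograd dispatch key for the two fbgemm ops named in
# the warning, as the warning message itself suggests. This squashes the
# warning and keeps the previous autograd behavior.
lib = torch.library.Library("fbgemm", "IMPL")  # extend the existing fbgemm namespace
for op_name in ("dense_to_jagged", "jagged_to_padded_dense"):
    lib.impl(op_name, torch.library.fallthrough_kernel, "Autograd")
```

Running this once at import time (before the first backward pass) should suppress both warnings, but whether it is safe depends on how the repository wires gradients through these ops.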
Environment
- Python version: 3.10
- OS: Ubuntu 22.04
After starting the training command, the script produces no progress information for about 20 minutes: no logs or console output beyond the warning above. I am not sure whether this is expected behavior or a sign of a bottleneck (e.g. data loading, model initialization, or a blocking call). Any guidance on whether this is expected, or suggestions for adding logging to track progress, would be very helpful.
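In case it helps others debug the same silence, one low-effort option is to wrap the setup phases in timestamped logging so the stall can be attributed to a specific phase. A minimal sketch; timed_phase, build_dataset, and build_model are hypothetical names, not functions from this repository:

```python
import contextlib
import logging
import time

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
log = logging.getLogger(__name__)

@contextlib.contextmanager
def timed_phase(name):
    # Log when a setup phase starts and how long it took, so a silent
    # multi-minute stall can be attributed to data loading, model init, etc.
    start = time.perf_counter()
    log.info("starting %s ...", name)
    yield
    log.info("%s finished in %.1fs", name, time.perf_counter() - start)

# Hypothetical usage inside main.py; build_dataset / build_model stand in
# for whatever the script actually calls:
# with timed_phase("data loading"):
#     dataset = build_dataset(...)
# with timed_phase("model initialization"):
#     model = build_model(...)
```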