Thank you for your work!
I tried to reproduce the results from your paper. Every time I run the code, the training loss is reported as NaN for the ResNet-50 model trained on the ImageNet dataset. I have left all the default parameters provided in the code unchanged. Could you please help with this?
Thank you once again!