Replies: 1 comment
-
Hi, no, this feature is not currently implemented in the main branch. PyTorch Lightning does support it, though, so you would have to tweak NeMo's exp_manager to handle S3 paths properly, and PyTorch Lightning will then do its magic.
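To illustrate the kind of path handling such a tweak would have to deal with, here is a minimal sketch using only the standard library. It assumes the directory settings are routed through pathlib somewhere along the way, which I have not verified against the current exp_manager code. The helper names `is_remote_path` and `join_path` and the bucket name are hypothetical and not part of NeMo or PyTorch Lightning.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def is_remote_path(path: str) -> bool:
    """Return True when the path carries a URL scheme such as s3://."""
    return "://" in path and urlparse(path).scheme != ""


def join_path(base: str, *parts: str) -> str:
    """Join path segments without collapsing the '//' after a URL scheme."""
    if is_remote_path(base):
        # Plain string joining keeps 's3://bucket' intact.
        return "/".join([base.rstrip("/"), *(p.strip("/") for p in parts)])
    # Local paths can safely go through pathlib.
    return str(PurePosixPath(base, *parts))


# pathlib normalizes repeated slashes, which corrupts the URL scheme.
mangled = str(PurePosixPath("s3://my-bucket/experiments"))

# String-based joining leaves the scheme untouched.
preserved = join_path("s3://my-bucket/experiments", "checkpoints")

# Local directories behave as before.
local = join_path("/tmp/experiments", "checkpoints")

print(mangled)
print(preserved)
print(local)
```

The point of the sketch is only that an S3 URL has to stay a plain string (or go through a filesystem-aware library) until it reaches the checkpointing code; wrapping it in a pathlib object first would change the URL.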
-
Hi,

Our project is planning to leverage exp_manager (https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/core/exp_manager.html) to manage the configs for our experiment job tracker, but we want to save checkpoints to S3 and resume from them.

After going through the code for exp_manager, can I check with the NeMo team whether saving checkpoints to S3 is supported? If it is, is this set at explicit_log_dir or exp_dir? Or would it be set at checkpoint_callback_params instead for MLFlowLogger?

Thanks!