
v0.7.2


@NihalHarish released this 04 Apr 00:50
  • Experimental support for TF 2.x GradientTape - Introducing experimental support for TF 2.x training scripts that use GradientTape. With this change, weights, biases, losses, metrics, and gradients are captured by SageMaker Debugger. This works with the vanilla version of TensorFlow 2.x, not with the zero-code-change version; see the sketch after the notes below. #186

    Note: Training scripts that use GradientTape for higher-order gradients or multiple tapes are not supported.
    Distributed training scripts that use GradientTape are not supported at this time.
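
A minimal sketch of a supported GradientTape script, assuming the hook's wrap_tape API added by #186; the output directory, model, and data are illustrative:

```python
import tensorflow as tf
import smdebug.tensorflow as smd

# Create the hook manually: GradientTape support requires vanilla TF 2.x,
# not the zero-code-change version.
hook = smd.KerasHook(out_dir="/tmp/smdebug_run")  # hypothetical output path

model = tf.keras.Sequential([tf.keras.layers.Dense(10)])
opt = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Stand-in dataset of (features, labels) batches.
dataset = tf.data.Dataset.from_tensor_slices(
    (tf.random.uniform((64, 4)),
     tf.random.uniform((64,), maxval=10, dtype=tf.int64))
).batch(8)

for data, labels in dataset:
    # Wrapping the tape is what lets smdebug capture the loss and gradients.
    with hook.wrap_tape(tf.GradientTape()) as tape:
        logits = model(data, training=True)
        loss_value = loss_fn(labels, logits)
    grads = tape.gradient(loss_value, model.trainable_variables)
    opt.apply_gradients(zip(grads, model.trainable_variables))
```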

  • Support SyncOnReadVariable in mirrored strategy - Fixes a bug caused by smdebug not supporting the SyncOnRead distributed variable type. Also enables smdebug with TF 2.x training scripts that use MirroredStrategy with the fit() API, as sketched below. #190
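
A sketch of the now-working pattern, assuming the KerasHook is registered as a Keras callback; the path, model, and data are illustrative:

```python
import tensorflow as tf
import smdebug.tensorflow as smd

hook = smd.KerasHook(out_dir="/tmp/smdebug_mirrored")  # hypothetical output path

# MirroredStrategy creates SyncOnRead distributed variables (e.g. for metric
# state), which smdebug can now read without crashing.
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    model = tf.keras.Sequential(
        [tf.keras.layers.Dense(10, activation="softmax")])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

x = tf.random.uniform((64, 4))
y = tf.random.uniform((64,), maxval=10, dtype=tf.int64)

# Passing the hook as a callback enables tensor capture during fit().
model.fit(x, y, epochs=1, batch_size=8, callbacks=[hook])
```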

  • Turn off hook and write only from one worker for unsupported distributed training techniques - PyTorch users observed a crash when distributed training was implemented with the generic multiprocessing library, which smdebug does not support. This fix handles that case and ensures that tensors are saved from a single worker; see the sketch below. #167
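
A sketch of the kind of script that used to crash, using torch.multiprocessing to spawn workers; the hook setup follows the usual smdebug PyTorch API, and the path, model, and data are illustrative:

```python
import torch
import torch.multiprocessing as mp
import smdebug.pytorch as smd

def train(rank):
    model = torch.nn.Linear(4, 2)
    loss_fn = torch.nn.CrossEntropyLoss()
    opt = torch.optim.SGD(model.parameters(), lr=0.1)

    # Every spawned process creates a hook; with this fix smdebug detects the
    # unsupported distributed setup and writes tensors from one worker only.
    hook = smd.Hook(out_dir="/tmp/smdebug_mp")  # hypothetical output path
    hook.register_module(model)
    hook.register_loss(loss_fn)

    for _ in range(5):
        x = torch.randn(8, 4)
        y = torch.randint(0, 2, (8,))
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()

if __name__ == "__main__":
    mp.spawn(train, nprocs=2)
```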

  • Bug fix: PyTorch: Register only if tensors require gradients - Users observed a crash when training with pretrained embeddings that do not need gradient updates. This fix checks whether a gradient update is required and registers a backward hook only in that case, as sketched below. #193
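
A sketch of the previously crashing pattern: a pretrained embedding frozen so it receives no gradient updates, registered with the usual smdebug PyTorch hook. The path, weights, and model are illustrative:

```python
import torch
import smdebug.pytorch as smd

pretrained = torch.randn(1000, 50)  # stand-in for real pretrained vectors

model = torch.nn.Sequential(
    # freeze=True sets requires_grad=False, so this layer never receives a
    # gradient; smdebug now skips registering a backward hook on it.
    torch.nn.Embedding.from_pretrained(pretrained, freeze=True),
    torch.nn.Flatten(),
    torch.nn.Linear(50 * 4, 2),
)

hook = smd.Hook(out_dir="/tmp/smdebug_emb")  # hypothetical output path
hook.register_module(model)

x = torch.randint(0, 1000, (8, 4))
y = torch.randint(0, 2, (8,))
loss = torch.nn.functional.cross_entropy(model(x), y)
loss.backward()  # no crash: backward hooks only where gradients are required
```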