Hi, I'm using rnnt-loss and pytorch-lightning to train my model, but I found that the 4D tensors used to calculate the transducer loss accumulate on the GPU. When I check the GPU memory in the training step (before a batch starts), there are many 4D tensors left over from previous batches, which eventually leads to CUDA out of memory. I don't know what went wrong.

I use gpu_tracker to check the GPU memory.
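For reference, a tracker like this typically enumerates the CUDA tensors still referenced somewhere in the process via the garbage collector. A minimal sketch (not the actual gpu_tracker implementation) looks roughly like this:

```python
import gc
import torch

def report_cuda_tensors():
    """Print shape, dtype and size of every CUDA tensor still referenced somewhere."""
    for obj in gc.get_objects():
        try:
            if torch.is_tensor(obj) and obj.is_cuda:
                mb = obj.element_size() * obj.nelement() / 1024 ** 2
                print(f"{tuple(obj.shape)}  {obj.dtype}  {mb:.1f} MB")
        except Exception:
            # some tracked objects raise on attribute access; skip them
            pass
```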

The loss in the training step comes from this.
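For context, a minimal sketch of the kind of training step I mean is below. The module and argument names here are hypothetical, and the loss is assumed to take the 4D joint-network output of shape (batch, T, U+1, vocab); this is only an illustration of where the 4D tensor is produced and consumed, not the exact code:

```python
import pytorch_lightning as pl

class RNNTModel(pl.LightningModule):
    def __init__(self, encoder, predictor, joint, rnnt_loss):
        super().__init__()
        self.encoder = encoder      # acoustic encoder
        self.predictor = predictor  # label predictor
        self.joint = joint          # joint network -> (B, T, U+1, V)
        self.rnnt_loss = rnnt_loss  # transducer loss module

    def training_step(self, batch, batch_idx):
        feats, feat_lens, labels, label_lens = batch
        enc = self.encoder(feats)          # (B, T, H)
        pred = self.predictor(labels)      # (B, U+1, H)
        joint_out = self.joint(enc, pred)  # the 4D tensor (B, T, U+1, V)
        loss = self.rnnt_loss(joint_out, labels, feat_lens, label_lens)
        # log only a detached scalar; keeping `loss` (or joint_out) alive
        # outside this step would retain the whole 4D graph on the GPU
        self.log("train_loss", loss.detach())
        return loss
```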

This is the GPU memory usage in the training step.
I tried using 'del', 'gc.collect()' and 'torch.cuda.empty_cache()' everywhere, but none of them helped.
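For context, the cleanup I tried looks roughly like this (illustrative only). As far as I understand, `torch.cuda.empty_cache()` only returns cached, unreferenced blocks to the driver, so it cannot free tensors that are still referenced by a Python object or by the autograd graph of a previous batch:

```python
import gc
import torch

# attempted cleanup at the end of every training step (illustrative)
gc.collect()              # collect unreachable Python objects
torch.cuda.empty_cache()  # return cached, *unreferenced* blocks to the driver
print(torch.cuda.memory_allocated() / 1024 ** 2, "MB still allocated")
# none of this frees a 4D tensor that is still referenced somewhere,
# e.g. by a non-detached loss kept for logging or by a retained graph
```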