This repository was archived by the owner on Oct 9, 2023. It is now read-only.
Too much RAM usage by ImageClassificationData #1442
Unanswered · Hravan asked this question in Data / pipelines
Replies: 1 comment, 2 replies
-
Hi @Hravan, thanks for reporting this! That definitely seems like a bug. We have made several improvements in the latest release (out today) that should reduce Flash's memory consumption. Could you try upgrading? Hope that helps 😃
-
I'm setting up training on the dataset from this Kaggle competition: https://www.kaggle.com/competitions/plant-pathology-2021-fgvc8
(I'm using only the samples with single labels to keep the problem simple.)
The problem is that ImageClassificationData uses too much RAM and the GPU is underutilized. I wrote the same training in plain PyTorch for comparison, to confirm that the problem lies somewhere within ImageClassificationData.
Code shared by both training versions:
Training in plain PyTorch:
Training in Lightning Flash:
When I increase batch_size to 64 or num_workers to 16 in ImageClassificationData, I start running out of RAM, which does not happen with the plain PyTorch version. Any ideas what the problem might be? I tried profiling but didn't reach any sensible conclusion, except that I suspect the problem is in BaseDataFetcher in DataModule.