Processing Dataset of 30 MB makes Google Colab's 13 GB RAM run out of memory #4485
Koenlaermans asked this question in Questions
I'm trying to fine-tune a deepset QA reader from Hugging Face on the CUAD dataset (a QA dataset of about 20k questions). I'm only using 3,500 questions right now.
With only the 3 lines of code below I'm trying to fine-tune the reader on this dataset. The JSON file in SQuAD format is about 27.5 MB, yet the 12 GB of system RAM on my Google Colab runtime runs out of memory while preprocessing the dataset (before training even starts).
The problem is the system RAM, not the GPU. I don't know why this happens or how to fix it; it doesn't seem logical that such a small file would exhaust that much memory.
Link to the dataset: CUAD Dataset 3500 questions
Here is the code.
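A minimal sketch of the kind of call I mean, assuming Haystack's `FARMReader` with a deepset model from the Hugging Face Hub; the model name, data directory, and filename here are placeholders, not my exact values:

```python
# Sketch only: model name and paths are placeholders, assuming Haystack's FARMReader.
from haystack.nodes import FARMReader

# Load a deepset extractive QA model from the Hugging Face Hub
reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2", use_gpu=True)

# Fine-tune on a SQuAD-format JSON file (e.g. the 3,500-question CUAD subset)
reader.train(data_dir="data", train_filename="cuad_train.json", use_gpu=True, n_epochs=1)
```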