Memory error in grover embedding #20

@marcostorrework

Describe the bug
ZairaChem fails when fitting a model on an input file with 37,637 rows.
The process runs out of memory while calculating the Grover embedding (model eos7w6n).
This happens on a laptop with 16 GB of RAM.

To Reproduce
Steps to reproduce the behavior:

  1. Use a computer with 16 GB RAM
  2. Download the attached example file "train.csv"
  3. Create an empty directory "model"
  4. conda activate zairachem
  5. zairachem fit -i train.csv -m model
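To narrow down the row count at which memory runs out, the input could be split into smaller files and the same `zairachem fit` command run on each. The following is only a diagnostic sketch: the function names `split_csv` and `_write_chunk`, the output file naming, and the chunk size are hypothetical and not part of ZairaChem. It assumes only that "train.csv" has a single header row, and it uses the Python standard library.

```python
import csv
from pathlib import Path


def split_csv(src, out_dir, rows_per_chunk):
    """Split `src` into CSV files of at most `rows_per_chunk` data rows.

    The header row is repeated in every output file.
    Returns the list of written paths, in order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    with open(src, newline="") as fh:
        reader = csv.reader(fh)
        header = next(reader)  # assumption: the first row is the header
        chunk = []
        for row in reader:
            chunk.append(row)
            if len(chunk) == rows_per_chunk:
                paths.append(_write_chunk(out_dir, len(paths), header, chunk))
                chunk = []
        if chunk:  # remaining rows that do not fill a whole chunk
            paths.append(_write_chunk(out_dir, len(paths), header, chunk))
    return paths


def _write_chunk(out_dir, index, header, rows):
    """Write one chunk (header plus rows) to a hypothetical file name."""
    path = out_dir / f"train_part{index:03d}.csv"
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(header)
        writer.writerows(rows)
    return path
```

For example, `split_csv("train.csv", "chunks", 5000)` would write files such as "chunks/train_part000.csv", each of which can be passed to `zairachem fit -i ... -m ...` with a fresh model directory. A successful fit on a small chunk would suggest that the failure depends on input size. It does not replace fitting on the full data set.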

Expected behavior
ZairaChem should fit the model and finish without errors.

Screenshots
Log with error attached

Desktop (please complete the following information):

  • OS: Ubuntu 20.04.5 LTS (running under WSL on Windows)
  • Using ZairaChem version 0.0.1, installed in December 2022

Additional context
Running on a laptop with 16 GB RAM

Log with error:
fit_20221220_0431.log
Example input file to reproduce:
train.csv
