Getting nan tensor in output #59

@santurini

Hello,
Have you ever seen the model output tensors that are entirely NaN during training? It happens to me occasionally, and I would like to know whether the problem lies in the model or in my setup.
I'm currently training a tiny version of the model so that it fits in RAM, so I had to drop some layers from the final stage and reduce the number of heads, the dimensions, and so on.

EDIT:

I forgot to mention that I'm training in mixed precision to save memory.

Do you have any idea why this might happen?
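
For context on the mixed-precision detail, here is a minimal sketch of one common way half precision can produce NaN values. It is an illustration under stated assumptions, not a diagnosis of this model: it assumes the half-precision type is IEEE float16 (largest finite value 65504) and simulates it with the standard-library `struct` module instead of a deep-learning framework. The helper names `to_fp16`, `naive_softmax_fp16`, and `stable_softmax_fp16` are hypothetical and exist only for this example.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest float16 value.

    Values too large for float16 become +/-inf, which mimics what
    happens when a half-precision computation overflows.
    """
    if math.isnan(x) or math.isinf(x):
        return x
    try:
        # "<e" is the struct format code for IEEE 754 half precision.
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def naive_softmax_fp16(logits):
    """Softmax with every intermediate rounded to float16, no max-subtraction."""
    exps = [to_fp16(math.exp(v)) for v in logits]
    total = 0.0
    for e in exps:
        total = to_fp16(total + e)
    # inf / inf evaluates to nan, so one overflowing term poisons the output.
    return [to_fp16(e / total) for e in exps]


def stable_softmax_fp16(logits):
    """Same computation, but shifted by the max logit so exp() never exceeds 1."""
    m = max(logits)
    exps = [to_fp16(math.exp(v - m)) for v in logits]
    total = 0.0
    for e in exps:
        total = to_fp16(total + e)
    return [to_fp16(e / total) for e in exps]


if __name__ == "__main__":
    logits = [12.0, 3.0]  # exp(12) is about 162755, above the float16 maximum
    print("naive :", naive_softmax_fp16(logits))   # first entry is nan
    print("stable:", stable_softmax_fp16(logits))  # all entries are finite
```

The point of the sketch is that a logit as small as 12 already overflows float16 once exponentiated, so a NaN under mixed precision does not by itself mean the architecture is broken. Whether this is what happens here would have to be checked by locating the first layer whose output contains an inf or NaN value.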
