Everything is identical except for replacing the PyTorch LSTM with one of your implementations, which leads to a NaN loss.