Replies: 1 comment
AI-generated, please verify.

To fix the hidden size mismatch error "Expected hidden[1] size (2, 4, 2048), got [2, 4, 640]" in your RNNT BPE model, you need to align the encoder's output size with what the decoder expects. In your lstm_transducer_bpe.yaml file, modify the encoder section so that proj_size matches rnn_hidden_size:

    encoder:
      _target_: nemo.collections.asr.modules.RNNEncoder
      feat_in: ${model.preprocessor.features}
      n_layers: 8
      d_model: 2048
      proj_size: 2048  # Change this from 640 to 2048
      rnn_type: "lstm"
      bidirectional: true

With this change the encoder outputs a hidden size of 2048 instead of 640, which will match what the decoder expects. The mismatch occurs because the encoder's output dimension (proj_size) needs to match the decoder's expected input dimension (rnn_hidden_size).
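To help verify the shapes in the error message, here is a minimal sketch in plain Python (no PyTorch or NeMo needed) of the state shapes that the PyTorch `nn.LSTM` documentation describes. There, `hidden[1]` is the cell state, which always uses `hidden_size`, while `hidden[0]` uses `proj_size` when a projection is set. The helper names `lstm_state_shapes` and `check_cell_state` are hypothetical, and the single bidirectional layer with batch size 4 is only an assumption chosen to reproduce the leading dimensions `(2, 4, ...)`. How NeMo's `RNNEncoder` actually builds its states is not shown here.

```python
from typing import Optional, Sequence, Tuple

Shape = Tuple[int, int, int]


def lstm_state_shapes(
    num_layers: int,
    bidirectional: bool,
    batch_size: int,
    hidden_size: int,
    proj_size: int = 0,
) -> Tuple[Shape, Shape]:
    """Return the (h_0, c_0) shapes documented for torch.nn.LSTM."""
    directions = 2 if bidirectional else 1
    # The hidden state h uses proj_size when a projection is enabled.
    h_out = proj_size if proj_size > 0 else hidden_size
    h_shape = (directions * num_layers, batch_size, h_out)
    # The cell state c always uses the full hidden_size.
    c_shape = (directions * num_layers, batch_size, hidden_size)
    return h_shape, c_shape


def check_cell_state(expected: Shape, got: Sequence[int]) -> Optional[str]:
    """Mimic the wording of the shape check on hidden[1] (the cell state)."""
    if tuple(got) != expected:
        return f"Expected hidden[1] size {expected}, got {list(got)}"
    return None


# Assumed values: one bidirectional layer, batch of 4, hidden 2048, projection 640.
h_shape, c_shape = lstm_state_shapes(
    num_layers=1, bidirectional=True, batch_size=4, hidden_size=2048, proj_size=640
)

# A cell state built with the projected size (640) fails the check.
message = check_cell_state(c_shape, (2, 4, 640))
print("h_0 shape:", h_shape)
print("c_0 shape:", c_shape)
print(message)
```

Under these assumptions, the error says a cell state was created with the projected size (640) where the full hidden size (2048) was required. Setting `proj_size` equal to `d_model`, as suggested above, makes the two sizes coincide.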
I’m trying to train an RNNT BPE model with NeMo, but I get an error during encoder initialization.
I got this error message:
"Expected hidden[1] size (2, 4, 2048), got [2, 4, 640]".
I’m using the LSTM Transducer BPE YAML configuration file provided in NeMo as a starting point.
How can I fix it?
lstm_transducer_bpe.yaml