Replies: 2 comments
-
@nithinraok Could you please assist me with some thoughts regarding that?
-
It is not recommended to try SSL with a dataset of that size; I don't think you would see any benefits. With only ~300 hours of pretraining data, you are unlikely to gain much. You could use this dataset to add more hours: https://github.com/facebookresearch/libri-light
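If you do decide to top up your ~300 hours with Libri-Light, the extra audio has to end up in the same manifest format as the rest of your pretraining data. Below is a minimal sketch, not NeMo's own tooling, that walks a directory of FLAC files and writes a NeMo-style JSON-lines manifest (audio_filepath, duration, and an empty text field, since SSL pretraining needs no transcripts). The directory and output paths are placeholders.

```python
import json
from pathlib import Path

import soundfile as sf  # pip install soundfile

# Placeholder paths -- point these at your own data.
AUDIO_DIR = Path("/data/libri-light/small")
MANIFEST_PATH = Path("/data/manifests/libri_light_pretrain.json")


def write_manifest(audio_dir: Path, manifest_path: Path) -> None:
    """Write one JSON object per line: the manifest layout NeMo ASR dataloaders expect."""
    manifest_path.parent.mkdir(parents=True, exist_ok=True)
    with manifest_path.open("w") as fout:
        for audio_file in sorted(audio_dir.rglob("*.flac")):
            info = sf.info(audio_file)
            entry = {
                "audio_filepath": str(audio_file),
                "duration": round(info.frames / info.samplerate, 3),
                "text": "",  # no transcripts needed for SSL pretraining
            }
            fout.write(json.dumps(entry) + "\n")


if __name__ == "__main__":
    write_manifest(AUDIO_DIR, MANIFEST_PATH)
```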
-
I am trying to run SSL (Wav2Vec-BERT) on a small speech dataset (~300 hours) for an ASR task. I understand that pretraining is normally done with large, diverse datasets, producing representations that can later be fine-tuned for downstream tasks. However, I want to explore the gain from SSL on a medium-sized, domain-specific English dataset and compare it against fine-tuning scenarios. Could you please give me some tips for my case? I am using the fast-conformer config, following the instructions in https://github.com/NVIDIA/NeMo/tree/main/examples/asr/speech_pretraining
My audio durations vary widely, from about 1 s to 100 s. Since the data is limited, I want to use all of it, so I set min_duration to 1 s and max_duration to 100 s. Could that conflict with anything in the SSL config? Should I consider a smaller learning rate or a particular model size (e.g., Large vs. XLarge)? If I use Lhotse, are there settings I should adjust to suit the size of my dataset (~300 hours)?
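One thing that may help in choosing the min_duration/max_duration cutoffs (and Lhotse bucket boundaries, if you use them) is to look at the actual duration distribution of the manifest first. This is a small, hypothetical sketch over a NeMo-style JSON-lines manifest; the path and the 1 s / 100 s thresholds are just the values discussed above.

```python
import json
from pathlib import Path

MANIFEST_PATH = Path("/data/manifests/domain_pretrain.json")  # placeholder path
MIN_DUR, MAX_DUR = 1.0, 100.0  # seconds, matching the cutoffs above

# Collect per-utterance durations from the manifest.
durations = []
with MANIFEST_PATH.open() as fin:
    for line in fin:
        durations.append(float(json.loads(line)["duration"]))
durations.sort()

total_h = sum(durations) / 3600
kept = [d for d in durations if MIN_DUR <= d <= MAX_DUR]
kept_h = sum(kept) / 3600

print(f"total: {len(durations)} utterances, {total_h:.1f} h")
print(f"kept within [{MIN_DUR}, {MAX_DUR}] s: {len(kept)} utterances, {kept_h:.1f} h")
print(f"dropped: {total_h - kept_h:.2f} h")

# Rough percentiles to guide duration-bucketing choices (e.g., Lhotse buckets).
for q in (0.5, 0.9, 0.95, 0.99):
    idx = int(q * (len(durations) - 1))
    print(f"p{int(q * 100)} duration: {durations[idx]:.1f} s")
```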
I also want to try NEST for SSL. Should I use settings different from the defaults in the config file, given my much smaller dataset? What factors should I consider when augmenting with noise, given how little data I have? Any tips would help. Thank you.
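For context on the noise-augmentation question, the sketch below shows the basic operation involved: mixing a noise clip into a speech clip at a chosen signal-to-noise ratio. This is a generic NumPy illustration, not NeMo's or NEST's actual augmentation code; the file names and the SNR range are placeholders you would tune to your data, and the clips are assumed to be mono.

```python
import numpy as np
import soundfile as sf  # pip install soundfile


def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Additively mix `noise` into `speech` at the requested SNR (in dB), mono audio assumed."""
    # Tile or trim the noise so it covers the whole speech clip.
    if len(noise) < len(speech):
        noise = np.tile(noise, int(np.ceil(len(speech) / len(noise))))
    noise = noise[: len(speech)]

    speech_power = np.mean(speech ** 2) + 1e-12
    noise_power = np.mean(noise ** 2) + 1e-12
    # Scale the noise so that 10*log10(speech_power / scaled_noise_power) == snr_db.
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    mixed = speech + scale * noise

    # Avoid clipping when writing back to a fixed-point format.
    peak = np.max(np.abs(mixed))
    return mixed / peak if peak > 1.0 else mixed


if __name__ == "__main__":
    speech, sr = sf.read("speech.wav")  # placeholder files
    noise, _ = sf.read("noise.wav")
    snr_db = np.random.uniform(0, 20)   # e.g., sample an SNR per utterance
    sf.write("speech_noisy.wav", mix_at_snr(speech, noise, snr_db), sr)
```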