Hi y'all,
Love the paper and leaderboard setup. Arabic dialects tend to get short shrift, so I appreciate y'all's work. Just wanted to reach out because we have some more up-to-date multilingual models on our end, in case you want more ablations comparable with Seamless and Whisper:
https://catalog.ngc.nvidia.com/orgs/nvidia/teams/riva/models/canary-riva-1b <- Multilingual Transformer decoder model (NVIDIA's counterpart to Whisper).
https://catalog.ngc.nvidia.com/orgs/nvidia/teams/riva/models/canary-riva-0-6b-turbo <- Same, but higher throughput.
https://catalog.ngc.nvidia.com/orgs/nvidia/teams/riva/models/parakeet-ctc-riva-1-1b-unified-ml-cs-concat <- Multilingual variant of the CTC model you used, with language-specific tokenizers.
https://catalog.ngc.nvidia.com/orgs/nvidia/teams/riva/models/parakeet-ctc-riva-1-1b-unified-ml-cs-universal <- Same, but with tokens merged across all languages.
No pressure on which ones, if any, you want to evaluate! Just wanted to share since it looks like we didn't release them in time for your original submission.