Define output language for multilingual nvidia/parakeet-tdt-0.6b-v3 model #14620
Replies: 2 comments
-
AI generated solution, please verify
To configure the nvidia/parakeet-tdt-0.6b-v3 model to output transcriptions in a specific language, such as English or Portuguese, when processing multilingual audio, you need to use the forced_decoder_ids parameter during generation. Here's how to do it:
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor
import torch
model_id = "nvidia/parakeet-tdt-0.6b-v3"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForSpeechSeq2Seq.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")
# For English
forced_decoder_ids = processor.get_decoder_prompt_ids(language="english", task="transcribe")
# For Portuguese
# forced_decoder_ids = processor.get_decoder_prompt_ids(language="portuguese", task="transcribe")
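# `audio` is not defined in this snippet; it is assumed to be a 1-D float array of 16 kHz mono samples
inputs = processor(audio, sampling_rate=16000, return_tensors="pt").to(model.device, torch.float16)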
transcription = model.generate(
**inputs,
forced_decoder_ids=forced_decoder_ids,
max_new_tokens=128
)
decoded_output = processor.batch_decode(transcription, skip_special_tokens=True)[0]
This forces the model to output text in your specified language regardless of which languages are present in the input audio.
-
@zhenyih I got the following error when trying to execute the code snippet below:
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor
import torch
model_id = "nvidia/parakeet-tdt-0.6b-v3"
processor = AutoProcessor.from_pretrained(model_id)
-
Is it possible to configure the ASR model nvidia/parakeet-tdt-0.6b-v3 to return the transcription in a specific language? For example, if the input audio contains multiple languages, can I force the model to output the transcription in English or Portuguese only?
I am using the code provided in the repository https://huggingface.co/nvidia/parakeet-tdt-0.6b-v3
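For context, the usage shown on that model card goes through NVIDIA NeMo rather than transformers. The following is a minimal sketch of that loading path, assuming the NeMo ASR package is installed and that `ASRModel.from_pretrained` and `transcribe` behave as the model card describes. The helper names `hypothesis_text` and `transcribe_files` are hypothetical, and the sketch does not select or force an output language.

```python
def hypothesis_text(hyp):
    """Return the transcript text from a NeMo result.

    Assumption: depending on the NeMo version, transcribe() yields either
    plain strings or hypothesis objects exposing a `.text` attribute.
    """
    return hyp if isinstance(hyp, str) else hyp.text


def transcribe_files(paths, model_name="nvidia/parakeet-tdt-0.6b-v3"):
    """Transcribe audio files with a NeMo ASR checkpoint (hypothetical helper)."""
    # Imported lazily so this module can be loaded without NeMo installed.
    import nemo.collections.asr as nemo_asr

    asr_model = nemo_asr.models.ASRModel.from_pretrained(model_name=model_name)
    results = asr_model.transcribe(list(paths))
    return [hypothesis_text(r) for r in results]
```

Calling `transcribe_files(["sample.wav"])` would download the checkpoint on first use; `sample.wav` is a placeholder path, not a file from this thread.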