NameError: name 'copy' is not defined when setting timestamp = True #9820
Unanswered
AbdelrhmanElnenaey asked this question in Q&A
Replies: 1 comment
The problem does not appear if I use |
I am getting this error:
```
NameError                                 Traceback (most recent call last)
in <cell line: 16>()
     14
     15 # specify flag `return_hypotheses=True``
---> 16 hypotheses = asr_model.transcribe(["/content/audio_sample_20.wav"], return_hypotheses=True)
     17
     18 # if hypotheses form a tuple (from RNNT), extract just "best" hypotheses

4 frames
/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py in decorate_context(*args, **kwargs)
    113     def decorate_context(*args, **kwargs):
    114         with ctx_factory():
--> 115             return func(*args, **kwargs)
    116
    117     return decorate_context

/usr/local/lib/python3.10/dist-packages/nemo/collections/asr/models/rnnt_models.py in transcribe(self, paths2audio_files, batch_size, return_hypotheses, partial_hypothesis, num_workers, channel_selector, augmentor, verbose)
    302                 input_signal=test_batch[0].to(device), input_signal_length=test_batch[1].to(device)
    303             )
--> 304             best_hyp, all_hyp = self.decoding.rnnt_decoder_predictions_tensor(
    305                 encoded,
    306                 encoded_len,

/usr/local/lib/python3.10/dist-packages/nemo/collections/asr/parts/submodules/rnnt_decoding.py in rnnt_decoder_predictions_tensor(self, encoder_output, encoded_lengths, return_hypotheses, partial_hypotheses)
    487
    488         else:
--> 489             hypotheses = self.decode_hypothesis(prediction_list)  # type: List[str]
    490
    491         # If computing timestamps

/usr/local/lib/python3.10/dist-packages/nemo/collections/asr/parts/submodules/rnnt_decoding.py in decode_hypothesis(self, hypotheses_list)
   1446             A list of strings.
   1447         """
-> 1448         hypotheses = super().decode_hypothesis(hypotheses_list)
   1449         if self.compute_langs:
   1450             if isinstance(self.tokenizer, AggregateTokenizer):

/usr/local/lib/python3.10/dist-packages/nemo/collections/asr/parts/submodules/rnnt_decoding.py in decode_hypothesis(self, hypotheses_list)
    538                 # this is done so that `rnnt_decoder_predictions_tensor()` can process this hypothesis
    539                 # in order to compute exact time stamps.
--> 540                 alignments = copy.deepcopy(hypotheses_list[ind].alignments)
    541                 token_repetitions = [1] * len(alignments)  # preserve number of repetitions per token
    542                 hypothesis = (prediction, alignments, token_repetitions)

NameError: name 'copy' is not defined
```
This happens when I set the timestamp option in the decoding configuration to True.
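The last frame of the traceback is inside `rnnt_decoding.py`, not in the notebook cell. Python looks up a global name such as `copy` in the module where the function was defined, so an `import copy` in the calling script would not be expected to help. Here is a minimal sketch of that behavior, with no NeMo involved. The module name `buggy_decoding` and the function `clone_alignments` are made up for illustration.

```python
import copy
import types

# Stand-in module (hypothetical name) whose code uses `copy` without importing it.
buggy = types.ModuleType("buggy_decoding")
exec(
    "def clone_alignments(alignments):\n"
    "    return copy.deepcopy(alignments)\n",
    buggy.__dict__,
)


def call_clone(alignments):
    """Return the cloned list, or the NameError message if the lookup fails."""
    try:
        return buggy.clone_alignments(alignments)
    except NameError as err:
        return str(err)


# `copy` is imported in this script, but the function resolves the name in
# its own module's globals, so the call still fails.
before = call_clone([[1, 2], [3]])

# Workaround sketch: bind the name inside the module that needs it.
buggy.copy = copy
after = call_clone([[1, 2], [3]])

print(before)
print(after)
```

By analogy, an untested workaround for the case here would be to import `nemo.collections.asr.parts.submodules.rnnt_decoding` (the module path shown in the traceback), assign `copy` to it as an attribute, and only then call `transcribe`. That is an assumption on my side, not a confirmed fix. Checking whether a newer NeMo release already imports `copy` in that file is probably the cleaner route.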
Here is the code:
```
# import nemo_asr and instantiate asr_model as above
import nemo.collections.asr as nemo_asr
import copy

asr_model = nemo_asr.models.ASRModel.from_pretrained("stt_en_fastconformer_transducer_large")

# update decoding config to preserve alignments and compute timestamps
from omegaconf import OmegaConf, open_dict

decoding_cfg = asr_model.cfg.decoding
with open_dict(decoding_cfg):
    decoding_cfg.preserve_alignments = True
    decoding_cfg.compute_timestamps = True
    asr_model.change_decoding_strategy(decoding_cfg)

# specify flag `return_hypotheses=True`
hypotheses = asr_model.transcribe(["/content/audio_sample_20.wav"], return_hypotheses=True)

# if hypotheses form a tuple (from RNNT), extract just "best" hypotheses
if isinstance(hypotheses, tuple) and len(hypotheses) == 2:
    hypotheses = hypotheses[0]

timestamp_dict = hypotheses[0].timestep  # extract timesteps from hypothesis of first (and only) audio file
print("Hypothesis contains following timestep information :", list(timestamp_dict.keys()))

# For a FastConformer model, you can display the word timestamps as follows:
# 80ms is duration of a timestep at output of the Conformer
time_stride = 8 * asr_model.cfg.preprocessor.window_stride

word_timestamps = timestamp_dict['word']

for stamp in word_timestamps:
    start = stamp['start_offset'] * time_stride
    end = stamp['end_offset'] * time_stride
    word = stamp['char'] if 'char' in stamp else stamp['word']
```