Example code, librosa.util.exceptions.ParameterError: Audio data must be floating-point #4978

vaughantnrc · 2022-09-21T20:28:03Z

vaughantnrc
Sep 21, 2022

Hi everyone, I searched for the exception "Audio data must be floating-point" but found no results.

I'm trying to run code that is more or less identical to this: https://github.com/NVIDIA/NeMo/tree/main/examples/speaker_tasks/diarization

The only difference is that I use my own .wav file. When I do so, I get the following exception:

Traceback (most recent call last):
  File "**********\venv\lib\site-packages\nemo\collections\asr\parts\utils\decoder_timestamps_utils.py", line 661, in run_ASR_BPE_CTC
    hyp, greedy_predictions_list, log_prob = get_wer_feat_logit(
  File "**********\venv\lib\site-packages\nemo\collections\asr\parts\utils\decoder_timestamps_utils.py", line 221, in get_wer_feat_logit
    asr.read_audio_file_and_return(audio_file_path, delay, model_stride_in_secs)
  File "**********\venv\lib\site-packages\nemo\collections\asr\parts\utils\decoder_timestamps_utils.py", line 264, in read_audio_file_and_return
    samples = get_samples(audio_filepath)
  File "**********\venv\lib\site-packages\nemo\collections\asr\parts\utils\decoder_timestamps_utils.py", line 235, in get_samples
    samples = librosa.core.resample(samples, sample_rate, target_sr)
  File "**********\venv\lib\site-packages\librosa\util\decorators.py", line 104, in inner_f
    return f(**kwargs)
  File "**********\venv\lib\site-packages\librosa\core\audio.py", line 606, in resample
    util.valid_audio(y, mono=False)
  File "**********\venv\lib\site-packages\librosa\util\decorators.py", line 88, in inner_f
    return f(*args, **kwargs)
  File "**********\venv\lib\site-packages\librosa\util\utils.py", line 275, in valid_audio
    raise ParameterError("Audio data must be floating-point")
librosa.util.exceptions.ParameterError: Audio data must be floating-point

When I look at NeMo's source code, I see that the audio is explicitly loaded with type int16: https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/asr/parts/utils/decoder_timestamps_utils.py#L231

What confuses me is that if I look at the history of librosa, I see that the check for integer types has existed for years: https://github.com/librosa/librosa/blob/main/librosa/core/audio.py

A quick glance at the functions in the stack did not turn up any conversion from int16 to float.
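For illustration, the missing conversion step could look like the sketch below. It assumes NumPy is available and that the samples are 16-bit PCM. `int16_to_float32` and `is_floating_point` are hypothetical helper names, not NeMo or librosa functions, and the dtype test only approximates what librosa's `valid_audio` appears to check.

```python
import numpy as np


def is_floating_point(samples: np.ndarray) -> bool:
    """Approximate the dtype check that rejects integer audio (assumption)."""
    return bool(np.issubdtype(samples.dtype, np.floating))


def int16_to_float32(samples: np.ndarray) -> np.ndarray:
    """Scale 16-bit PCM samples to float32 values in [-1.0, 1.0)."""
    if samples.dtype != np.int16:
        raise TypeError(f"expected int16 samples, got {samples.dtype}")
    # 32768 = 2**15, the magnitude of the most negative int16 value.
    return samples.astype(np.float32) / 32768.0


# Stand-in for audio that was loaded explicitly as int16.
pcm = np.array([0, 16384, -32768, 32767], dtype=np.int16)
converted = int16_to_float32(pcm)

print(is_floating_point(pcm))        # integer input fails the check
print(is_floating_point(converted))  # converted input passes it
```

An array converted this way should pass a floating-point check before it is handed to a resampler. Whether NeMo should convert at load time or just before resampling is a design choice this sketch does not settle.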

So my question is: is this a bug? The code looks wrong to me, but I have trouble imagining that this issue has gone undetected, especially since there is an example (linked above).

In case it's pertinent:

I'm using NeMo 1.11.0, installed with pip install nemo-toolkit[asr]==1.11.0
I appear to be on the latest version of librosa, which is 0.9.2
I'm on Windows (though based on the observations and code above I do not see any obvious link to the operating system)

titu1994 · 2022-09-21T21:35:38Z

titu1994
Sep 21, 2022
Maintainer

@nithinraok possibly a bug?

0 replies

nithinraok · 2022-09-21T21:52:53Z

nithinraok
Sep 21, 2022
Maintainer

We don't have test cases that run diarization on audio that is not sampled at 16 kHz, which is why this wasn't caught. Thanks for reporting. Maybe @tango4j can explain why the samples are loaded explicitly as int16.
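A test for this case would need an input that is not sampled at 16 kHz. The sketch below shows one way to generate such a fixture with only the Python standard library: a short 48 kHz, 16-bit mono tone. `write_int16_sine_wav` and `read_wav_header` are hypothetical helpers for illustration and are not part of NeMo's test suite.

```python
import array
import math
import sys
import wave


def write_int16_sine_wav(path, sample_rate=48000, num_frames=4800, freq_hz=440.0):
    """Write a mono 16-bit PCM sine tone to `path` (hypothetical test fixture)."""
    amplitude = 0.5 * 32767  # half of full scale, to stay clear of clipping
    samples = array.array(
        "h",
        (
            int(amplitude * math.sin(2.0 * math.pi * freq_hz * i / sample_rate))
            for i in range(num_frames)
        ),
    )
    if sys.byteorder == "big":
        samples.byteswap()  # WAV stores PCM samples little-endian
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(samples.tobytes())


def read_wav_header(path):
    """Return the basic format fields of a WAV file."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_rate": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
            "num_frames": wav.getnframes(),
        }
```

A regression test could write this file to a temporary directory and pass it through the diarization pipeline, which would exercise the resampling path that 16 kHz inputs skip.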

0 replies

tango4j · 2022-09-26T08:29:21Z

tango4j
Sep 26, 2022
Collaborator

@vaughantnrc Can you share the specification of the audio file that caused this error (type of WAV file, sampling rate, bit depth, etc.)? Preferably the output of sox --i abc.wav. We may need to change the get_samples function so that it generalizes better.

0 replies

vaughantnrc · 2022-09-26T14:44:30Z

vaughantnrc
Sep 26, 2022
Author

Thank you for looking into this. This is what I get with sox:

Input File     : 'sim.wav'
Channels       : 1
Sample Rate    : 48000
Precision      : 16-bit
Duration       : 00:02:40.30 = 7694400 samples ~ 12022.5 CDDA sectors
File Size      : 15.4M
Bit Rate       : 768k
Sample Encoding: 16-bit Signed Integer PCM
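As a quick consistency check on the sox report, the figures can be recomputed from the sample count, sample rate, and bit depth. The variable names below are illustrative, and the 16 kHz target is taken from the maintainer's comment above rather than from the file itself.

```python
# Values copied from the sox output above.
sample_rate = 48000      # Hz
channels = 1
bits_per_sample = 16
num_samples = 7694400

# Duration as minutes and seconds, using integer arithmetic where possible.
minutes, remainder = divmod(num_samples, sample_rate * 60)
seconds = remainder / sample_rate
duration_text = f"{minutes:02d}:{seconds:05.2f}"

# Uncompressed PCM bit rate and audio payload size (headers excluded).
bit_rate = sample_rate * bits_per_sample * channels
data_bytes = num_samples * channels * bits_per_sample // 8

# CDDA audio uses 75 sectors per second.
cdda_sectors = num_samples * 75 / sample_rate

# Resampling 48 kHz to 16 kHz keeps one third of the samples.
target_sr = 16000
resampled_samples = num_samples * target_sr // sample_rate

print(duration_text, bit_rate, data_bytes, cdda_sectors, resampled_samples)
```

The results agree with the report: a duration of 2 minutes 40.3 seconds, 768,000 bits per second, and about 15.4 MB of sample data. Because the file is 48 kHz rather than 16 kHz, it takes the resampling path where the exception was raised.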

1 reply

tango4j Oct 12, 2022
Collaborator

@vaughantnrc Sorry for the delayed response. We are working on a pull request to fix this issue. Feel free to comment on this PR if you think it is necessary.
