Recipes for commonvoice ASR and LID #129

Open
wants to merge 116 commits into
base: persephone-asr

Conversation

@neillu23 neillu23 commented Feb 1, 2023

No description provided.

@neillu23 neillu23 changed the title Add recipe for commonvoice tranducer and slurm configuration Recipe for commonvoice tranducer and slurm configuration Feb 1, 2023
@neillu23 neillu23 changed the title Recipe for commonvoice tranducer and slurm configuration Recipes for commonvoice ASR and LID Feb 20, 2023

@jesus-villalba jesus-villalba left a comment

If you haven't already, could you run black on the Python files you changed? You can configure VS Code to do it automatically every time you save; otherwise, just run "black file_path" on each file.

@@ -69,6 +70,10 @@ def process_audio_files(
logging.info("Processing audio %s" % (key))
t2 = time.time()

if output_sampling_rate is not None:
x = signal.resample(x, int(x.shape[0]*output_sampling_rate/fs))

signal.resample may not be a good option here; I don't know whether it affects audio quality. I used this function for the VAD only because I wanted to stretch it from frame-level to sample-level VAD, so I'm not sure it is suitable for audio. Could you check the audio files you got?
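A minimal sketch of one alternative to compare against, under assumptions: `scipy.signal.resample` uses an FFT-based method that treats the signal as periodic, whereas `scipy.signal.resample_poly` applies polyphase filtering with an anti-aliasing FIR filter. The helper names `resample_ratio` and `resample_audio` are hypothetical and not part of this PR; whether this actually sounds better on these recordings would still need to be checked by listening.

```python
from math import gcd


def resample_ratio(fs: int, output_sampling_rate: int) -> tuple:
    """Reduce output_sampling_rate / fs to the smallest integer (up, down) pair."""
    g = gcd(int(output_sampling_rate), int(fs))
    return int(output_sampling_rate) // g, int(fs) // g


def resample_audio(x, fs: int, output_sampling_rate: int):
    """Hypothetical helper: polyphase resampling of a 1-D signal."""
    # Imported lazily so that resample_ratio can be used without SciPy.
    from scipy.signal import resample_poly

    up, down = resample_ratio(fs, output_sampling_rate)
    return resample_poly(x, up, down)


print(resample_ratio(48000, 16000))  # (1, 3)
print(resample_ratio(44100, 16000))  # (160, 441)
```

Reducing the ratio with `gcd` keeps the up/down factors small, which matters for rates such as 44100 Hz that do not divide evenly into 16000 Hz.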

@@ -0,0 +1,261 @@
#!/usr/bin/env python
"""

Does this file do anything different from the wav2vec2xvector trainer?

@@ -85,7 +85,9 @@ def __init__(
else:
assert "duration" in self.seg_set



Run black on this file and the other Python files you edited to remove the extra blank lines.

@@ -111,7 +120,7 @@ def add_class_args(parser, prefix=None):

parser.add_argument(
"--base-sampler-type",
- choices=["seg_sampler", "bucketing_seg_sampler"],
+ choices=["seg_sampler", "bucketing_seg_sampler", "bucketing_seg_sampler","class_weighted_seg_sampler"],

There is a repeated choice: "bucketing_seg_sampler" appears twice.
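The simplest fix is to delete the duplicate entry by hand. As an illustration only, here is a small sketch of how a choices list can be de-duplicated while preserving order before it is passed to argparse; the variable names `sampler_choices` and `unique_choices` are hypothetical, and the standalone parser below merely stands in for the project's own `add_class_args`.

```python
import argparse

# Hypothetical stand-in for the list in the diff, including the duplicate.
sampler_choices = [
    "seg_sampler",
    "bucketing_seg_sampler",
    "bucketing_seg_sampler",
    "class_weighted_seg_sampler",
]

# dict.fromkeys keeps the first occurrence of each entry and preserves order.
unique_choices = list(dict.fromkeys(sampler_choices))

parser = argparse.ArgumentParser()
parser.add_argument("--base-sampler-type", choices=unique_choices)

args = parser.parse_args(["--base-sampler-type", "class_weighted_seg_sampler"])
print(unique_choices)
print(args.base_sampler_type)
```

argparse accepts a list with duplicates without complaint, so the repetition is harmless at runtime, but it shows up in the `--help` output.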

@@ -7,10 +7,12 @@
from .vae.vae import VAE
from .vae.vq_vae import VQVAE
from .transducer import RNNTransducer, RNNRNNTransducer
from .wav2languageid import HFWav2Vec2ResNet1dLanguageID

Do we need this one?

@@ -61,7 +61,7 @@ def forward(
self,
x: torch.Tensor,
x_lengths: torch.Tensor,
- y: k2.RaggedTensor,
+ y: Union[Dict, k2.RaggedTensor],

why dict?

@@ -0,0 +1,7 @@
"""

can we delete this directory?

@@ -0,0 +1,7 @@
"""

I would invert the name to languageid_transducer, since we do the language ID first and then want to use it for the ASR.

@@ -0,0 +1,212 @@
"""

can we delete this one?

ylu125 and others added 30 commits July 4, 2023 17:34
2 participants