Recognising the same speaker across multiple audio clips? #5699
Unanswered
PhantomSpike asked this question in Q&A
Hello @PhantomSpike, you can use speaker verification models for this. Speaker verification matches a person's identity from their voice characteristics alone, across different recordings and regardless of language. NVIDIA NeMo provides several pretrained speaker verification models (SpeakerNet, TitaNet-L, ECAPA-TDNN); I have examples using all three in my repository. I hope this helps.
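For example, you can compare two clips with TitaNet-L along these lines. This is only a minimal sketch: the class and method names follow NeMo's speaker-recognition API but may differ slightly between NeMo versions, and the file paths are placeholders.

```python
# Minimal sketch: load a pretrained NeMo speaker verification model and
# compare two clips. Method names follow NeMo's speaker-recognition API
# but may vary slightly across versions; audio paths are placeholders.
import nemo.collections.asr as nemo_asr

# Pretrained checkpoint from NGC; alternatives include
# "speakerverification_speakernet" and "ecapa_tdnn".
model = nemo_asr.models.EncDecSpeakerLabelModel.from_pretrained("titanet_large")

# Decision based on the similarity of the two utterance embeddings.
same_speaker = model.verify_speakers("clip_a.wav", "clip_b.wav")
print("Same speaker:", same_speaker)

# Or extract the embeddings yourself for downstream matching/clustering.
emb_a = model.get_embedding("clip_a.wav")
emb_b = model.get_embedding("clip_b.wav")
```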
Hi everyone,
Thank you to the developers for this wonderful software, really love it! <3
Quick Q:
Is it possible to recognize the same speaker across multiple sound clips/recordings?
If so, should I concatenate all the recordings together and feed them in as one file, or can I give them separately to the NeMo diarization pipeline?
If not, what would you recommend? I was thinking I could run the pipeline on each clip separately and then, once I have the embeddings for each speaker, do some sort of clustering to match speakers across recordings, since embeddings of the same person should presumably sit close together in the embedding space even across different recordings.
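Roughly, I imagine something like the sketch below. The embeddings here are random placeholders standing in for whatever per-speaker embeddings the pipeline produces, and the distance threshold is just a number I would need to tune.

```python
# Rough sketch of the cross-recording clustering idea. The embeddings are
# random placeholders for per-speaker embeddings produced by a diarization
# or speaker verification pipeline; the distance threshold would need tuning.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
# e.g. 6 (recording, local-speaker) embeddings of dimension 192
embeddings = rng.normal(size=(6, 192))

# No fixed number of global speakers, so cut at a distance threshold instead.
clusterer = AgglomerativeClustering(
    n_clusters=None,
    distance_threshold=0.6,  # placeholder, tune on held-out data
    metric="cosine",         # called "affinity" in older scikit-learn versions
    linkage="average",
)
labels = clusterer.fit_predict(embeddings)
# Entries sharing a label would be treated as the same person across recordings.
print(labels)
```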
Just to give a bit more context: my particular application is meeting calls and telephone calls, and we would not have an enrollment database of speakers. I just want to be able to say that the same person was present in different recordings and when they spoke (diarization).
Any advice would be greatly appreciated! Thank you!