Release v1.0.0 Base

We are thrilled to announce the v1.0.0 Base release of WavLMRawNetXSVBase, an end-to-end speaker verification system that fuses WavLM Large (micro features) with RawNetX (macro features). By learning directly from raw waveforms, the model avoids manual feature extraction (e.g., MFCC, mel-spectrogram) and produces a compact 256-dimensional speaker embedding.
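For readers new to embedding-based verification, here is a minimal sketch of how two 256-dimensional speaker embeddings might be compared. It assumes cosine similarity as the scoring function and a decision threshold of 0.5; neither is stated in this release, and the names `cosine_score` and `same_speaker` are hypothetical, not part of the WavLMRawNetXSVBase API.

```python
import math

EMBEDDING_DIM = 256  # embedding size stated in the release notes


def cosine_score(emb_a, emb_b):
    """Cosine similarity between two speaker embeddings (assumed scoring rule)."""
    if len(emb_a) != EMBEDDING_DIM or len(emb_b) != EMBEDDING_DIM:
        raise ValueError(f"expected two {EMBEDDING_DIM}-dimensional embeddings")
    dot = sum(x * y for x, y in zip(emb_a, emb_b))
    norm_a = math.sqrt(sum(x * x for x in emb_a))
    norm_b = math.sqrt(sum(y * y for y in emb_b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(emb_a, emb_b, threshold=0.5):
    """Accept the trial when the score reaches the threshold.

    The 0.5 default is a placeholder; a real threshold would be tuned
    on held-out verification trials.
    """
    return cosine_score(emb_a, emb_b) >= threshold
```

In practice the embeddings would come from the model's forward pass on two utterances; the sketch only covers the comparison step.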

Highlights

Micro + Macro Features: Combines short-term acoustic nuances (WavLM) with broader temporal statistics (RawNetX).
Fully End-to-End: Minimal preprocessing; the model learns frequency and temporal patterns directly from the raw waveform.
Performance: Achieves an equal error rate (EER) of 4.67% on VoxCeleb1.
Flexible: Lays the groundwork for future extensions and improvements.
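To make the EER figure concrete, the sketch below shows one simple way an equal error rate can be estimated from a list of trial scores and same-speaker labels. It is a plain threshold sweep for illustration, not the evaluation script behind the 4.67% result; the function name `equal_error_rate` and the toy scores are assumptions.

```python
def equal_error_rate(scores, labels):
    """Estimate EER by sweeping thresholds over the observed scores.

    scores: similarity score per trial (higher means more likely same speaker)
    labels: 1 for same-speaker (target) trials, 0 for different-speaker trials
    """
    targets = [s for s, y in zip(scores, labels) if y == 1]
    nontargets = [s for s, y in zip(scores, labels) if y == 0]
    if not targets or not nontargets:
        raise ValueError("need at least one target and one non-target trial")

    best_gap, eer = float("inf"), None
    for t in sorted(set(scores)):
        # False rejection rate: target trials scoring below the threshold.
        frr = sum(s < t for s in targets) / len(targets)
        # False acceptance rate: non-target trials at or above the threshold.
        far = sum(s >= t for s in nontargets) / len(nontargets)
        gap = abs(far - frr)
        if gap < best_gap:
            # Report the midpoint where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy trial list, invented for illustration only.
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_eer = equal_error_rate(toy_scores, toy_labels)
```

On a full benchmark, the scores would be computed for every pair in the official VoxCeleb1 trial list, and a finer interpolation between thresholds is commonly used.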

Thank you for checking out this release. We are excited to see what you build with WavLMRawNetXSVBase! Feedback and contributions are always welcome.
