Release v1.0.0 Base
We are thrilled to announce the v1.0.0 Base release of WavLMRawNetXSVBase, an end-to-end speaker verification system that fuses WavLM Large (micro features) with RawNetX (macro features). By learning directly from raw waveforms, the model avoids manual feature extraction (e.g., MFCC, mel-spectrogram) and produces a compact 256-dimensional speaker embedding.
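To illustrate how a 256-dimensional speaker embedding is typically used for verification, here is a minimal sketch that compares two embeddings by cosine similarity. The release notes do not state the scoring method, so cosine scoring, the `same_speaker` helper, and the 0.5 threshold are all assumptions made for illustration, and the random vectors stand in for real model outputs.

```python
import math
import random

EMBEDDING_DIM = 256  # embedding size stated in the release notes


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(emb_a, emb_b, threshold=0.5):
    """Hypothetical decision rule: accept if similarity reaches the threshold.

    The threshold is an assumed placeholder; in practice it would be tuned
    on held-out trial pairs.
    """
    return cosine_similarity(emb_a, emb_b) >= threshold


# Random stand-ins for model outputs (not real embeddings).
rng = random.Random(0)
enrolled = [rng.gauss(0.0, 1.0) for _ in range(EMBEDDING_DIM)]
# A slightly perturbed copy mimics a second utterance from the same speaker.
same_voice = [x + 0.1 * rng.gauss(0.0, 1.0) for x in enrolled]
# An independent vector mimics a different speaker.
other_voice = [rng.gauss(0.0, 1.0) for _ in range(EMBEDDING_DIM)]

print(same_speaker(enrolled, same_voice))
print(same_speaker(enrolled, other_voice))
```

Cosine similarity ignores vector magnitude, which is one reason it is a common choice for comparing fixed-size speaker embeddings.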
Highlights
- Micro + Macro Features: Combines short-term acoustic nuances (WavLM) with broader temporal statistics (RawNetX).
- Fully End-to-End: Minimal preprocessing; the model discovers optimal frequency and temporal patterns on its own.
- Performance: Achieves an equal error rate (EER) of 4.67% on VoxCeleb1.
- Flexible: Lays the groundwork for future extensions and improvements.
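For readers unfamiliar with the EER metric quoted in the highlights, the following sketch shows one simple way to estimate it from trial scores. It is not the evaluation code behind the reported number: the `equal_error_rate` function and the toy scores are hypothetical, and the estimate simply picks the candidate threshold where the false acceptance and false rejection rates are closest.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Estimate the EER and the threshold at which it occurs.

    genuine_scores: similarity scores for same-speaker trial pairs.
    impostor_scores: similarity scores for different-speaker trial pairs.
    A trial is accepted when its score is >= the threshold.
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")

    best = None  # (gap between FAR and FRR, EER estimate, threshold)
    for threshold in sorted(set(genuine_scores) | set(impostor_scores)):
        # False acceptance rate: impostor trials that are accepted.
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        # False rejection rate: genuine trials that are rejected.
        frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, threshold)
    return best[1], best[2]


# Toy scores for illustration only (not VoxCeleb1 results).
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine, impostor)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

A lower EER means the genuine and impostor score distributions overlap less; a system with perfectly separated scores would reach an EER of 0.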
Thank you for checking out this release. We are excited to see what you build with WavLMRawNetXSVBase! Feedback and contributions are always welcome.