
DiariZen

DiariZen is a speaker diarization toolkit driven by AudioZen and Pyannote 3.1.

Installation

# create virtual python environment
conda create --name diarizen python=3.10
conda activate diarizen

# install diarizen 
conda install pytorch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 pytorch-cuda=12.1 -c pytorch -c nvidia
pip install -r requirements.txt && pip install -e .

# install pyannote-audio
cd pyannote-audio && pip install -e .[dev,testing]

# install dscore
git submodule init
git submodule update
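
After installation, a minimal sanity check (assuming the diarizen environment is activated) confirms that the core packages import and that PyTorch sees a GPU:

# optional sanity check; run inside the activated environment
import torch
import pyannote.audio
import diarizen

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())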

Usage

  • For model training, see recipes/diar_ssl/run_stage.sh.
  • For model pruning, see recipes/diar_ssl_pruning/run_stage.sh.
  • For inference, our pre-trained models are available on Hugging Face 🤗. See the example below:
from diarizen.pipelines.inference import DiariZenPipeline

# load pre-trained model
diar_pipeline = DiariZenPipeline.from_pretrained("BUT-FIT/diarizen-wavlm-large-s80-md")

# apply diarization pipeline
diar_results = diar_pipeline('./example/EN2002a_30s.wav')

# print results
for turn, _, speaker in diar_results.itertracks(yield_label=True):
    print(f"start={turn.start:.1f}s stop={turn.end:.1f}s speaker_{speaker}")
# start=0.0s stop=2.7s speaker_0
# start=0.8s stop=13.6s speaker_3
# start=5.8s stop=6.4s speaker_0
# ...

# load pre-trained model and save RTTM result
diar_pipeline = DiariZenPipeline.from_pretrained(
        "BUT-FIT/diarizen-wavlm-large-s80-md",
        rttm_out_dir='.'
)
# apply diarization pipeline
diar_results = diar_pipeline('./example/EN2002a_30s.wav', sess_name='EN2002a')
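
The returned result appears to behave like a pyannote.core.Annotation (it supports itertracks above), so the RTTM can also be written out manually; the output file name below is just a placeholder:

# hedged sketch: write the diarization result to RTTM yourself
# (assumes diar_results is a pyannote.core.Annotation)
with open('EN2002a.rttm', 'w') as f:
    diar_results.write_rttm(f)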

Benchmark

We train DiariZen models on a compound dataset composed of the datasets listed in the table below and then apply structured pruning to remove redundant parameters. For the results below:

  • AISHELL-4 was converted to mono using sox in.wav -c 1 out.wav.
  • NOTSOFAR-1 contains only single-channel recordings, e.g. sc_plaza_0, sc_rockfall_0.
  • Diarization Error Rate (DER) is evaluated without applying a collar (a scoring sketch follows the table below).
  • No domain adaptation is applied to any individual dataset.
  • All experiments use the same clustering hyperparameters across datasets.
DER (%):

| Dataset        | Pyannote v3.1 | DiariZen-Base-s80 | DiariZen-Large-s80 |
| -------------- | ------------- | ----------------- | ------------------ |
| AMI-SDM        | 22.4          | 15.8              | 14.0               |
| AISHELL-4      | 12.2          | 10.7              | 9.8                |
| AliMeeting far | 24.4          | 14.1              | 12.5               |
| NOTSOFAR-1     | -             | 20.3              | 17.9               |
| MSDWild        | 25.3          | 17.4              | 15.6               |
| DIHARD3 full   | 21.7          | 15.9              | 14.5               |
| RAMC           | 22.2          | 11.4              | 11.0               |
| VoxConverse    | 11.3          | 9.7               | 9.2                |
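
This repository ships dscore as a submodule for evaluation; as a rough, hedged alternative, DER without a collar can also be computed with pyannote.metrics. The reference and hypothesis RTTM paths below are placeholders:

# hedged sketch: DER without collar via pyannote.metrics (not the official dscore setup)
from pyannote.database.util import load_rttm
from pyannote.metrics.diarization import DiarizationErrorRate

reference = load_rttm('ref/EN2002a.rttm')['EN2002a']   # ground-truth RTTM (placeholder path)
hypothesis = load_rttm('EN2002a.rttm')['EN2002a']      # system output RTTM (placeholder path)

metric = DiarizationErrorRate(collar=0.0)  # no collar, matching the table above
print(f"DER = {metric(reference, hypothesis) * 100:.2f}%")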

Updates

2025-06-03: Uploaded structured pruning recipes, released new pre-trained models, and updated multiple benchmark results.

Citations

If you find this work helpful, please consider citing:

@inproceedings{han2025leveraging,
  title={Leveraging self-supervised learning for speaker diarization},
  author={Han, Jiangyu and Landini, Federico and Rohdin, Johan and Silnova, Anna and Diez, Mireia and Burget, Luk{\'a}{\v{s}}},
  booktitle={Proc. ICASSP},
  year={2025}
}

@article{han2025fine,
  title={Fine-tune Before Structured Pruning: Towards Compact and Accurate Self-Supervised Models for Speaker Diarization},
  author={Han, Jiangyu and Landini, Federico and Rohdin, Johan and Silnova, Anna and Diez, Mireia and {\v{C}}ernock{\'y}, Jan and Burget, Luk{\'a}{\v{s}}},
  journal={arXiv preprint arXiv:2505.24111},
  year={2025}
}

@article{han2025efficient,
  title={Efficient and Generalizable Speaker Diarization via Structured Pruning of Self-Supervised Models},
  author={Han, Jiangyu and P{\'a}lka, Petr and Delcroix, Marc and Landini, Federico and Rohdin, Johan and {\v{C}}ernock{\'y}, Jan and Burget, Luk{\'a}{\v{s}}},
  journal={arXiv preprint arXiv:2506.18623},
  year={2025}
}

License

This repository is released under the MIT license.

Contact

If you have any comments or questions, please contact ihan@fit.vut.cz.
