bunyaminergen
diff --git a/.docs/documentation/CONTRIBUTING.md b/.docs/documentation/CONTRIBUTING.md
Lines changed: 506 additions & 0 deletions
diff --git a/.docs/documentation/RESOURCES.md b/.docs/documentation/RESOURCES.md
Lines changed: 126 additions & 0 deletions
@@ -0,0 +1,126 @@
+# Resources
+
+---
+
+## Paper
+
+- [\[2110.13900\] WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing](https://arxiv.org/abs/2110.13900)
+- [\[1904.08104\] RawNet: Advanced end-to-end deep neural network for speaker verification](https://arxiv.org/abs/1904.08104)
+- [Improved RawNet with Feature Map Scaling for Text-Independent Speaker Verification Using Raw Waveforms (ISCA Interspeech 2020)](https://www.isca-archive.org/interspeech_2020/jung20c_interspeech.html)
+- [\[2203.08488\] Pushing the limits of raw waveform speaker recognition](https://arxiv.org/abs/2203.08488)
+- [\[2406.07103\] MR-RawNet: multiple temporal resolutions for variable duration utterances](https://arxiv.org/abs/2406.07103)
+- [\[2011.01108\] End-to-end anti-spoofing with RawNet2](https://arxiv.org/abs/2011.01108)
+- [\[1808.00158\] Speaker Recognition from Raw Waveform with SincNet](https://arxiv.org/abs/1808.00158)
+- [SincConv in SE](https://arxiv.org/pdf/2403.01785)
+- [\[1709.01507\] Squeeze-and-Excitation Networks](https://arxiv.org/abs/1709.01507)
+- [\[1803.10963\] Attentive Statistics Pooling for Deep Speaker Embedding](https://arxiv.org/abs/1803.10963)
+- [\[2011.05189\] Supervised attention for speaker recognition](https://arxiv.org/abs/2011.05189)
+- [Enhancing speaker identification through reverberation modeling and cancelable techniques using ANNs | PLOS ONE](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0294235)
+- [ISCA Archive - Enroll-Aware Attentive Statistics Pooling for Target Speaker Verification](https://www.isca-archive.org/interspeech_2022/zhang22j_interspeech.html)
+- [Voxceleb: Large-scale speaker verification in the wild](https://www.robots.ox.ac.uk/~vgg/publications/2019/Nagrani19/nagrani19.pdf)
+- [VoxCeleb2: Deep Speaker Recognition (chung18a.pdf)](https://www.robots.ox.ac.uk/~vgg/publications/2018/Chung18a/chung18a.pdf)
+- [VoxCeleb: a large-scale speaker identification dataset (nagrani17.pdf)](https://www.robots.ox.ac.uk/~vgg/publications/2017/Nagrani17/nagrani17.pdf)
+- [\[2408.14886\] The VoxCeleb Speaker Recognition Challenge: A Retrospective](https://arxiv.org/abs/2408.14886)
+- [\[1912.07875\] Libri-Light: A Benchmark for ASR with Limited or No Supervision](https://arxiv.org/abs/1912.07875)
+- [\[2106.06909\] GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio](https://arxiv.org/abs/2106.06909)
+- [\[2101.00390\] VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation](https://arxiv.org/abs/2101.00390)
+- [X-Vectors: Robust DNN Embeddings for Speaker Recognition (2018_icassp_xvectors.pdf)](https://www.danielpovey.com/files/2018_icassp_xvectors.pdf)
+- [\[2005.07143\] ECAPA-TDNN: Emphasized Channel Attention, Propagation and Aggregation in TDNN Based Speaker Verification](https://arxiv.org/abs/2005.07143)
+- [\[1706.08612\] VoxCeleb: a large-scale speaker identification dataset](https://arxiv.org/abs/1706.08612)
+- [\[2005.07143\] (PDF)](https://arxiv.org/pdf/2005.07143)
+- [\[2401.17230v2\] ESPnet-SPK: full pipeline speaker embedding toolkit... (arXiv)](https://arxiv.org/abs/2401.17230v2)
+- [\[2401.17230v2\] (PDF)](https://arxiv.org/pdf/2401.17230v2.pdf)
+- [\[2110.13900\] (PDF)](https://arxiv.org/pdf/2110.13900)
+- [\[2407.18223\] Reshape Dimensions Network for Speaker Recognition](https://arxiv.org/abs/2407.18223)
+- [Voxceleb: Large-scale speaker verification in the wild: Computer Speech and Language: Vol 60, No C](https://dl.acm.org/doi/10.1016/j.csl.2019.101027)
+- [\[1912.02522\] VoxSRC 2019: The first VoxCeleb Speaker Recognition Challenge](https://arxiv.org/abs/1912.02522)
+- [\[1904.08779\] SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition](https://arxiv.org/abs/1904.08779)
+- [WavLM model ensemble for audio deepfake detection](https://arxiv.org/html/2408.07414v1)
+
+---
+
+## GitHub
+
+- [unilm/wavlm at master · microsoft/unilm](https://github.com/microsoft/unilm/tree/master/wavlm)
+- [unilm/wavlm/modules.py at master · microsoft/unilm](https://github.com/microsoft/unilm/blob/master/wavlm/modules.py)
+- [libri-light/data_preparation/README.md at main · facebookresearch/libri-light](https://github.com/facebookresearch/libri-light/blob/main/data_preparation/README.md)
+- [bunyaminergen/WavLMMSDD](https://github.com/bunyaminergen/WavLMMSDD)
+- [Jungjee/RawNet: Official repository for RawNet, RawNet2, and RawNet3](https://github.com/Jungjee/RawNet)
+- [KrishnaDN/RawNet (implementation of RawNet paper)](https://github.com/KrishnaDN/RawNet)
+- [facebookresearch/libri-light](https://github.com/facebookresearch/libri-light)
+- [espnet/espnet (End-to-End Speech Processing Toolkit)](https://github.com/espnet/espnet)
+- [IDRnD/redimnet (EVALUATION.md)](https://github.com/IDRnD/redimnet/blob/master/EVALUATION.md)
+- [clovaai/voxceleb_trainer (In defence of metric learning for speaker recognition)](https://github.com/clovaai/voxceleb_trainer)
+- [IDRnD/redimnet: The official PyTorch implementation of the Interspeech 2024 paper "Reshape Dimensions Network for Speaker Recognition"](https://github.com/IDRnD/redimnet)
+- [kimho1wq/MR-RawNet: This repository contains official pytorch implementation and pre-trained models for the MR-RawNet.](https://github.com/kimho1wq/mr-rawnet)
+
+---
+
+## Web
+
+- [Wavlm Base Sv · Models · Dataloop](https://dataloop.ai/library/model/microsoft_wavlm-base-sv/)
+- [Fine-tuning wav2vec2 for speaker recognition | Papers With Code](https://paperswithcode.com/paper/fine-tuning-wav2vec2-for-speaker-recognition)
+- [ESPnet-SPK: full pipeline... | Papers With Code](https://paperswithcode.com/paper/espnet-spk-full-pipeline-speaker-embedding)
+- [What is Equal Error Rate (EER)? | Webopedia](https://www.webopedia.com/definitions/equal-error-rate/)
+- [Performance for Speaker Identification (EER) - Stack Overflow](https://stackoverflow.com/questions/43315277/performance-for-speaker-identification-equal-error-rate-eer-and-identificati)
+- [Home (RawNet 2024)](https://sites.google.com/view/rawnet-2024/)
+- [ISCA Archive - Interspeech 2020 Jung20c (RawNet)](https://www.isca-archive.org/interspeech_2020/jung20c_interspeech.html)
+- [ResearchGate (RawNet paper)](https://www.researchgate.net/publication/335829649_RawNet_Advanced_End-to-End_Deep_Neural_Network_Using_Raw_Waveforms_for_Text-Independent_Speaker_Verification)
+- [Information Engineering (robots.ox.ac.uk)](https://www.robots.ox.ac.uk/)
+- [VoxCeleb: a large-scale speaker identification dataset | Papers With Code](https://cs.paperswithcode.com/paper/voxceleb-a-large-scale-speaker-identification)
+- [Full Text Search - Hugging Face (VoxCeleb)](https://huggingface.co/search/full-text?q=voxceleb&type=dataset)
+- [SITW_overlap.txt](https://www.robots.ox.ac.uk/~vgg/data/voxceleb/SITW_overlap.txt)
+- [VoxCeleb Speaker Recognition Challenge](https://www.robots.ox.ac.uk/~vgg/data/voxceleb/competition2019.html)
+- [The Speakers in the Wild (SITW) Speaker Recognition Database - SRI](https://www.sri.com/publication/speech-natural-language-pubs/the-speakers-in-the-wild-sitw-speaker-recognition-database/)
+
+---
+
+## Hugging Face
+
+- [microsoft/wavlm-large](https://huggingface.co/microsoft/wavlm-large)
+- [microsoft/wavlm-base-plus-sv](https://huggingface.co/microsoft/wavlm-base-plus-sv)
+- [microsoft/wavlm-base](https://huggingface.co/microsoft/wavlm-base)
+- [WavLM Documentation](https://huggingface.co/docs/transformers/model_doc/wavlm#transformers.WavLMForXVector.config)
+- [What is Feature Extraction? - Hugging Face](https://huggingface.co/tasks/feature-extraction)
+- [Fine-Tune Wav2Vec2 for English ASR in Hugging Face with Transformers](https://huggingface.co/blog/fine-tune-wav2vec2-english)
+- [WavLMMSDD - a Hugging Face Space by bunyaminergen](https://huggingface.co/spaces/bunyaminergen/WavLMMSDD)
+- [jungjee/RawNet3](https://huggingface.co/jungjee/RawNet3)
+- [espnet/voxcelebs12_rawnet3](https://huggingface.co/espnet/voxcelebs12_rawnet3)
+- [openslr/librispeech_asr](https://huggingface.co/datasets/openslr/librispeech_asr)
+- [yangwang825/vox1-veri-full](https://huggingface.co/datasets/yangwang825/vox1-veri-full)
+- [yangwang825/vox1-iden-3s](https://huggingface.co/datasets/yangwang825/vox1-iden-3s)
+- [101arrowz/vox_celeb](https://huggingface.co/datasets/101arrowz/vox_celeb)
+- [TwinkStart/VoxCeleb](https://huggingface.co/datasets/TwinkStart/VoxCeleb)
+
+---
+
+## Wikipedia
+
+- [Time delay neural network](https://en.wikipedia.org/wiki/Time_delay_neural_network)
+
+---
+
+## Dataset
+
+- [Libri-light](https://ai.meta.com/tools/libri-light/)
+- [Libri-Light Dataset | Papers With Code](https://paperswithcode.com/dataset/libri-light)
+- [libri-light/data_preparation at main · facebookresearch/libri-light](https://github.com/facebookresearch/libri-light/tree/main/data_preparation)
+- [talkbank/callhome · Datasets at Hugging Face](https://huggingface.co/datasets/talkbank/callhome)
+- [VoxCeleb](https://www.robots.ox.ac.uk/~vgg/data/voxceleb/vox1.html)
+- [veri_test.txt (VoxCeleb1)](https://www.robots.ox.ac.uk/~vgg/data/voxceleb/meta/veri_test.txt)
+- [VoxCeleb1 Dataset | Papers With Code](https://paperswithcode.com/dataset/voxceleb1)
+- [VoxCeleb Benchmark (Speaker Verification) | Papers With Code](https://paperswithcode.com/sota/speaker-verification-on-voxceleb)
+- [VoxCeleb1 Benchmark (Speaker Recognition) | Papers With Code](https://paperswithcode.com/sota/speaker-recognition-on-voxceleb1)
+- [The Speakers in the Wild (SITW) Speaker Recognition Database - SRI](https://www.sri.com/publication/speech-natural-language-pubs/the-speakers-in-the-wild-sitw-speaker-recognition-database/)
+- [SITW_overlap.txt (overlap list)](https://www.robots.ox.ac.uk/~vgg/data/voxceleb/SITW_overlap.txt)
+- [VoxCeleb](https://mm.kaist.ac.kr/datasets/voxceleb/)
+- [KAIST MM](https://cn01.mmai.io/keyreq/voxceleb)
+- [Speaker Verification Datasets | Papers With Code](https://paperswithcode.com/datasets?q=Speaker+Verification&v=lst&o=match)
+
+---
+
+## YouTube
+
+- [RawNet Explained + Code](https://www.youtube.com/watch?v=9lOkPtilD74)
+
+---