The ABX-accent project is based on the preparation and evaluation of the Accented English Speech Recognition Challenge (AESRC) dataset [1], using fastABX [2] for evaluation. This repository provides all the item files you need for evaluation.
The ABX metric evaluates whether a representation X of a speech unit (e.g., the triphone “bap”) is closer to a same-category example A (also “bap”) than to a different-category example B (e.g., “bop”). The ABX error rate is calculated by averaging the discrimination errors over all minimal triphone pairs (i.e., pairs differing only in the central phoneme) in the corpus. This benchmark focuses on the more challenging across-speaker ABX task (alongside the within-speaker task), where the X example is spoken by a different speaker than the A and B examples, testing speaker-invariant phonetic discrimination.
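To make the definition concrete, here is a minimal sketch of the ABX error on toy data. It is not the fastABX implementation: the function name `abx_error`, the `(speaker, vector)` token format, and the use of a single vector per token with Euclidean distance are all assumptions made for illustration (real evaluations typically compare frame sequences and aggregate over many contexts and speakers).

```python
import numpy as np


def abx_error(same, other, across_speaker=True):
    """Toy ABX error for one minimal pair (hypothetical helper, not fastABX).

    same  : tokens of the target category (e.g. "bap"), as (speaker, vector)
    other : tokens of the contrasting category (e.g. "bop"), same format
    A and X are drawn from `same`, B from `other`; A and B share a speaker.
    """
    errors = []
    for i, (spk_a, a) in enumerate(same):
        for spk_b, b in other:
            if spk_b != spk_a:
                continue  # A and B must come from the same speaker
            for j, (spk_x, x) in enumerate(same):
                if i == j:
                    continue  # X must be a different token than A
                if across_speaker and spk_x == spk_a:
                    continue  # across: X comes from another speaker
                if not across_speaker and spk_x != spk_a:
                    continue  # within: X comes from the same speaker
                d_ax = np.linalg.norm(np.asarray(x) - np.asarray(a))
                d_bx = np.linalg.norm(np.asarray(x) - np.asarray(b))
                # Error if X is closer to B than to A; ties count as half.
                errors.append(1.0 if d_ax > d_bx else 0.5 if d_ax == d_bx else 0.0)
    if not errors:
        raise ValueError("no valid (A, B, X) triple for this configuration")
    return float(np.mean(errors))


# Toy 2-D "representations": speaker s2 is shifted toward the other category.
bap = [("s1", [0.0, 0.0]), ("s1", [0.1, 0.0]), ("s2", [0.9, 0.0])]
bop = [("s1", [1.0, 0.0]), ("s2", [2.0, 0.0])]

across = abx_error(bap, bop, across_speaker=True)
within = abx_error(bap, bop, across_speaker=False)
print(f"across-speaker error: {across:.2f}")
print(f"within-speaker error: {within:.2f}")
```

On this toy data the within-speaker error is lower than the across-speaker error, because the speaker shift moves the s2 token of “bap” close to the s1 token of “bop”. This is the kind of speaker variability the across-speaker task is meant to expose.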
The Accented English Speech Recognition Challenge dataset includes recordings from ten different regional accents: American, British, Canadian, Chinese, Indian, Japanese, Korean, Portuguese, Spanish, Russian.
To begin working with the AESRC development data and run evaluations, you will need the following resources:
- The dataset: please go to the dataset providers' website and ask for "Interspeech_ Accented English Speech Recognition Competition Data".
- Scripts for both data preparation and evaluation.
- The evaluation results.
Setup instructions are provided for Linux systems, and the process has been successfully tested on various distributions, including Ubuntu 16.04, Debian Jessie, and CentOS 6. It should also work on macOS with minimal modifications.
abx-accent/
├── scripts/
│   ├── prepare/
│   │   ├── data_splits
│   │   ├── abkhazia (forced_alignment)
│   │   └── README.md
│   └── eval/
│       ├── generate_item_files
│       ├── generate_abx_score
│       ├── fastABX/
│       ├── environment.yml
│       └── README.md
├── data/
│   ├── prepare/
│   │   ├── data_splits
│   │   ├── abkhazia (forced_alignment)
│   │   ├── gender_speakers_list.txt
│   │   └── README.md
│   └── eval/
│       ├── item_files/
│       │   ├── dev_set
│       │   └── test_set
│       ├── fastABX/
│       │   ├── across_task
│       │   └── within_task
│       └── README.md
└── README.md
Copyright 2022 CoML team (ENS, CNRS, INRIA, EHESS)
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.
- [1] Xian Shi, Fan Yu, Yizhou Lu, Yuhao Liang, Qiangze Feng, Daliang Wang, Yanmin Qian, and Lei Xie, “The Accented English Speech Recognition Challenge 2020: open datasets, tracks, baselines, results and methods,” in ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021, pp. 6918–6922.
- [2] Maxime Poli, Emmanuel Chemla, and Emmanuel Dupoux, “fastabx: A library for efficient computation of ABX discriminability,” arXiv:2505.02692v1 [cs.CL], 5 May 2025.