RTP-LX

Dataset for the paper RTP-LX: Can LLMs Evaluate Toxicity in Multilingual Scenarios?, by de Wynter et al.

NOTE: This repo is actively updated!

WARNING: This repository contains and discusses content that is offensive or upsetting. All materials are intended to support research that improves toxicity-detection methods. The included examples of toxicity do not represent how the authors or sponsors feel about any identity groups. This corpus was made by a multi-national, multi-cultural team of various faiths, beliefs, and origins. Please note that toxicity is dynamic and evolves with societal perceptions; these labels may therefore change.

What is RTP-LX?

RTP-LX is a multilingual set of 1k+ toxic prompts and passages per locale, designed for toxicity evaluation. It was manually translated from a subset of the original RTP (RealToxicityPrompts) dataset and annotated by native speakers. It also includes:

  • Coverage of 38 languages (with more files than languages, because graphically distinct variants such as ZH-Hans vs. ZH-Hant and DE-DE vs. DE-CH are kept in separate files)
  • Manually designed prompts that are considered hard to translate into English and that could be considered offensive in the language's geolocale
  • Dialect-specific indications in the translations where applicable (e.g., Levantine Arabic, Brazilian Portuguese)

Languages covered

RTP-LX currently covers 38 languages:

  • Arabic (Egyptian, Levantine, Saudi)
  • BCMS (Bosnian, Croatian, Montenegrin, Serbian)
  • Bulgarian*
  • Catalan*
  • Chinese (Standard; Simplified and Traditional scripts)
  • Czech
  • Danish
  • Dutch
  • English
  • Estonian*
  • Finnish
  • French (France)
  • German (Standard; Germany and Switzerland*)
  • Greek
  • Hebrew
  • Hindi
  • Hungarian
  • Indonesian
  • Italian
  • Japanese
  • Korean
  • Latvian*
  • Lithuanian*
  • Norwegian (Bokmål)
  • Polish
  • Portuguese (Brazil, Portugal)
  • Romanian*
  • Russian (Russia, Ukraine)
  • Slovak*
  • Slovenian*
  • Spanish (Spain)
  • Swahili
  • Swedish
  • Thai
  • Turkish
  • Ukrainian
  • Vietnamese*
  • Welsh*

See Structure, below, for the languages marked with an asterisk (*).

Harm Categories

RTP-LX is annotated in the following categories:

  • Bias
  • Insult
  • Identity Attack
  • Microaggression
  • Violence
  • Self-harm
  • Sexual content
  • Overall toxicity

Structure

RTP-LX has two main components: prompts (human-transcreated, human-annotated) and completions (synthetically generated, human-annotated).

  • Prompts are meant to measure the effectiveness of your guardrails in multilingual scenarios, as well as automated-annotation capabilities.
  • Completions, on the other hand, are substantially more toxic and are designed for ablation analysis of the harm categories.
  • BenignCompletions are human-written, benign completions -- perfect as preferred responses for DPO (Direct Preference Optimization)!
  • PromptAnnotations and CompletionsAnnotations contain the aggregated (majority-vote) scores from the annotators; see the loading sketch after this list.
  • The languages marked with an asterisk (*) do not contain Completions or the culturally-specific prompts (for budgetary reasons).
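
To make the layout concrete, below is a minimal loading sketch in Python. The file and field names (e.g., RTP_LX_EN.json, "Toxicity") are illustrative assumptions rather than a documented schema; inspect the extracted archives for the actual names and keys.

import json

# NOTE: the paths and keys below are hypothetical placeholders;
# check the extracted files for the real names and schema.
def load_records(path):
    """Read a JSON-lines file into a list of dicts."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

prompts = load_records("RTP_LX_EN.json")                 # hypothetical path
annotations = load_records("PromptAnnotations_EN.json")  # hypothetical path

# Example: select records whose aggregated (majority-vote) toxicity
# score meets a threshold; the "Toxicity" key is an assumption.
toxic = [a for a in annotations if a.get("Toxicity", 0) >= 2]
print(f"{len(toxic)} of {len(annotations)} records rated toxic")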

Uncompressing

To deter crawlers, we have zipped and password-protected the entries. The password is the name of the repo in lowercase, followed by "-entries", followed by "-4/8/24". So if the repo were "ASDF-BLAH", the password would be asdf-blah-entries-4/8/24.
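
For convenience, here is a minimal extraction sketch in Python that builds the password from the rule above. The archive name is a hypothetical placeholder; also note that Python's zipfile module only supports the legacy ZipCrypto scheme, so if the archive uses AES encryption, use a tool such as 7-Zip instead.

import zipfile

# Build the password per the rule above: repo name in lowercase,
# then "-entries", then "-4/8/24".
repo = "RTP-LX"
password = f"{repo.lower()}-entries-4/8/24"

# Archive name is a placeholder; substitute the actual file name.
with zipfile.ZipFile("RTP_LX.zip") as zf:
    zf.extractall(pwd=password.encode("utf-8"))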

Updates

  • (December '24): The paper for RTP-LX got accepted to AAAI! We will post the camera-ready (CR) version soon.
  • (August '24): V1.5 released! Added 11 new languages: BG, CA, ET, HE, LV, LT, RO, SK, SL, VI, CY and one dialect (DE-CH)
  • (May '24): Benign set released, scoring updated to what we described in the paper.
  • (Apr '24): Paper released!
  • (Mar '24): V1.0 released! Passages annotated. This is the first full release of RTP-LX. We do have updates coming, so stay tuned.
  • (Jan '24): V0.3 released! Added SW and BCMS. Compressed the data into a password-protected archive. Passages to come soon.
  • (Dec '23): V0.2 released! Added 19 more languages, and included PT (pt) prompts. Note that BCMS/Swahili are projected for a later date.
  • (Sep '23): V0.1 released! Prompts for ES, FR, DE, IT, JA, PT (br), ZH (simplified), AR and CS.

Citation

If you use our work, please consider citing our paper. The canonical BibTeX is here, but the entry below is fixed to be less unwieldy:

@article{rtplx,
    title={RTP-LX: Can LLMs Evaluate Toxicity in Multilingual Scenarios?},
    volume={39},
    url={https://ojs.aaai.org/index.php/AAAI/article/view/35011},
    DOI={10.1609/aaai.v39i27.35011},
    number={27},
    journal={Proceedings of the AAAI Conference on Artificial Intelligence},
    author={de Wynter, Adrian and Watts, Ishaan and Wongsangaroonsri, Tua and Zhang, Minghui and Farra, Noura and Altıntoprak, Nektar Ege and Baur, Lena and Claudet, Samantha and Gajdušek, Pavel and Gu, Qilong and Kaminska, Anna and Kaminski, Tomasz and Kuo, Ruby and Kyuba, Akiko and Lee, Jongho and Mathur, Kartik and Merok, Petter and Milovanović, Ivana and Paananen, Nani and Paananen, Vesa-Matti and Pavlenko, Anna and Vidal, Bruno Pereira and Strika, Luciano Ivan and Tsao, Yueh and Turcato, Davide and Vakhno, Oleksandr and Velcsov, Judit and Vickers, Anna and Visser, Stéphanie F. and Widarmanto, Herdyan and Zaikin, Andrey and Chen, Si-Qing},
    year={2025},
    month={Apr.},
    pages={27940-27950}
}

along with the original RTP paper:

@inproceedings{gehman-etal-2020-realtoxicityprompts,
    title = "{R}eal{T}oxicity{P}rompts: Evaluating Neural Toxic Degeneration in Language Models",
    author = "Gehman, Samuel  and
      Gururangan, Suchin  and
      Sap, Maarten  and
      Choi, Yejin  and
      Smith, Noah A.",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.301",
    doi = "10.18653/v1/2020.findings-emnlp.301",
    pages = "3356--3369",
}

Some components for Hebrew, Danish, Korean and Brazilian Portuguese come from the Offensive Hebrew Corpus, DKHate, BEEP! and ToLD-BR corpora, respectively. Please consider citing their work as well:

@inproceedings{hamad-etal-2023-offensive,
  title = {Offensive {H}ebrew Corpus and Detection using {BERT}},
  author = {Nagham Hamad and Mustafa Jarrar and Mohammed Khalilia and Nadim Nashif},
  booktitle = {The 20th ACS/IEEE International Conference on Computer Systems and Applications (AICCSA)},
  year = {2023},
  publisher = {IEEE},
  address = {Egypt}
}

@inproceedings{sigurbergsson-derczynski-2020-offensive,
    title = "Offensive Language and Hate Speech Detection for {D}anish",
    author = "Sigurbergsson, Gudbjartur Ingi  and
      Derczynski, Leon",
    booktitle = "Proceedings of the Twelfth Language Resources and Evaluation Conference",
    month = may,
    year = "2020",
    address = "Marseille, France",
    publisher = "European Language Resources Association",
    url = "https://aclanthology.org/2020.lrec-1.430",
    pages = "3498--3508",
    language = "English",
    ISBN = "979-10-95546-34-4",
}

@inproceedings{moon-etal-2020-beep,
    title = "{BEEP}! {K}orean Corpus of Online News Comments for Toxic Speech Detection",
    author = "Moon, Jihyung  and
      Cho, Won Ik  and
      Lee, Junbum",
    booktitle = "Proceedings of the Eighth International Workshop on Natural Language Processing for Social Media",
    month = jul,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.socialnlp-1.4",
    pages = "25--31",
}

@inproceedings{ToLDBR,
  author = {Jo\~{a}o A. Leite and Diego F. Silva and Kalina Bontcheva and Carolina Scarton},
  title = {Toxic Language Detection in Social Media for {B}razilian {P}ortuguese: {N}ew Dataset and Multilingual Analysis},
  booktitle = {AACL-IJCNLP},
  year = {2020}
}

Contributing

See here.
