Subjective Text Complexity Corpus for German [Paper]

A corpus consisting of German sentences, annotated with subjective complexity ratings by two target groups.

The corpus contains 322 sentences, each annotated with complexity ratings from (1) experts and (2) non-experts on a 5-point Likert scale (1 = very easy, 5 = very complex).
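As a rough illustration of how ratings from the two groups could be compared for one sentence, here is a minimal sketch. It assumes each sentence has several ratings per group, which this README does not state. The function names and the toy values are hypothetical and are not taken from the corpus files.

```python
from statistics import mean

LIKERT_MIN, LIKERT_MAX = 1, 5  # 1 = very easy, 5 = very complex


def mean_rating(ratings):
    """Return the mean of Likert ratings, rejecting out-of-range values."""
    for r in ratings:
        if not LIKERT_MIN <= r <= LIKERT_MAX:
            raise ValueError(f"rating {r} outside {LIKERT_MIN}-{LIKERT_MAX}")
    return mean(ratings)


def group_gap(expert_ratings, non_expert_ratings):
    """Non-expert mean minus expert mean for a single sentence."""
    return mean_rating(non_expert_ratings) - mean_rating(expert_ratings)


# Invented toy ratings for one sentence (NOT taken from the corpus).
toy_expert = [2, 3, 2]
toy_non_expert = [4, 4, 5]
gap = group_gap(toy_expert, toy_non_expert)
print(f"gap: {gap:.2f}")
```

A positive gap would mean that non-experts rated the sentence as more complex than experts did.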

The data comes from DATEV, a German IT service provider for tax consultants, auditors, and lawyers. The sentences were extracted from 232 documents (instructions, commentaries, and descriptions) addressed to employees of the service provider as well as to external users of the system. They often describe technical solutions for the company's products or give detailed descriptions of legal regulations affecting the company's clients.

Citation

If you find the code or dataset helpful, please cite the following paper:

@inproceedings{seiffe-etal-2022-subjective,
    title = "Subjective Text Complexity Assessment for {G}erman",
    author = {Seiffe, Laura  and
      Kallel, Fares  and
      M{\"o}ller, Sebastian  and
      Naderi, Babak  and
      Roller, Roland},
    editor = "Calzolari, Nicoletta  and
      B{\'e}chet, Fr{\'e}d{\'e}ric  and
      Blache, Philippe  and
      Choukri, Khalid  and
      Cieri, Christopher  and
      Declerck, Thierry  and
      Goggi, Sara  and
      Isahara, Hitoshi  and
      Maegaard, Bente  and
      Mariani, Joseph  and
      Mazo, H{\'e}l{\`e}ne  and
      Odijk, Jan  and
      Piperidis, Stelios",
    booktitle = "Proceedings of the Thirteenth Language Resources and Evaluation Conference",
    month = jun,
    year = "2022",
    address = "Marseille, France",
    publisher = "European Language Resources Association",
    url = "https://aclanthology.org/2022.lrec-1.74/",
    pages = "707--714"
}

License

The code is released under the terms of the CC-BY-4.0 license.
