malikaltakrori/Topic-Confusion-for-authorship-attribution-EMNLP-2021-

The Topic Confusion Task - EMNLP 2021 (Findings)

  1. Cross Topic/commandline__.txt contains the commands to run the experiments from the command line.

  2. You may need to install additional packages, such as the StanfordNLP tokenizer and scikit-learn.
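Before running the experiments, it can help to check whether the packages mentioned above are importable. The following is a minimal sketch, not part of the repository: the import names `sklearn` and `stanfordnlp` are assumptions, and the code may rely on a different tokenizer package, so adjust the mapping to match what the scripts actually import. The helper `missing_packages` is a hypothetical name introduced only for this illustration.

```python
import importlib.util

# Assumed import names (not confirmed by the repository): "sklearn" for
# scikit-learn and "stanfordnlp" for the StanfordNLP tokenizer.
ASSUMED_PACKAGES = {"scikit-learn": "sklearn", "StanfordNLP": "stanfordnlp"}


def missing_packages(packages):
    """Return the display names of packages whose import name cannot be found."""
    return [
        name
        for name, module in packages.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages(ASSUMED_PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All assumed packages are installed.")
```

Any package reported as missing would need to be installed before running the commands listed in the command-line files.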

Cross-Topic Authorship Attribution

[Figures: Same-Topic, Cross-Topic, Topic Confusion]

The Topic Confusion Task

  1. Topic confusion/commandline.txt contains the commands to run the experiments from the command line.

  2. Topic confusion/main.py has all the baselines and methods from the paper needed to create Table 2. Comment out the sections you do not need, because running everything can take on the order of days.

  3. Topic confusion/new_model.py is used to create a new model.

Data

We do not have the rights to share the actual data, as per the Guardian API policy. Detailed instructions for collecting the data can be found here.

Don't forget to cite our paper!

@inproceedings{altakrori2021topic,
    title={The Topic Confusion Task: A Novel Evaluation Scenario for Authorship Attribution},
    author={Altakrori, Malik and Cheung, Jackie Chi Kit and Fung, Benjamin CM},
    booktitle={Findings of the Association for Computational Linguistics: EMNLP 2021},
    pages={4242--4256},
    year={2021}
}
