SapienzaNLP/bookcoref


Conference Paper Hugging Face Dataset License: CC BY-NC 4.0

Description

This repository contains the official code for the ACL 2025 main conference paper: BookCoref: Coreference Resolution at Book Scale by Giuliano Martinelli, Tommaso Bonomo, Pere-Lluís Huguet Cabot, and Roberto Navigli. We include the official outputs of the comparison systems outlined in the paper, which can be used to reproduce our results. Our silver training and gold evaluation data are available through this 🤗 Hugging Face dataset.

Setup

First of all, clone the repository:

git clone https://github.com/sapienzanlp/bookcoref.git
cd bookcoref

Then, create a Python virtual environment and install the requirements. We support Python 3.9 and above.
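For example, using the standard venv module (any equivalent environment manager works as well):

python -m venv .venv
source .venv/bin/activate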

pip install -r requirements.txt

BookCoref Data

Local Download

To download the BookCoref data for training and evaluation, run the download_data.py script:

python download_data.py

options:
  --format <"jsonl" or "conll">, default="jsonl" # Format of the dataset to download
  --configuration <"default" or "split">, default="default" # Configuration of the huggingface dataset, either 'default' or 'split'
  --output_dir <path>, default="data/" # Output directory for the dataset

This script downloads the data from 🤗 Hugging Face and saves it in the requested format (JSONL or CoNLL) to the output directory, data/ by default.
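If you prefer to load the data programmatically, the 🤗 datasets library can fetch it directly. The snippet below is only a minimal sketch: it assumes the dataset identifier sapienzanlp/bookcoref, so check the dataset card for the exact identifier and configuration names.

from datasets import load_dataset

# Hypothetical dataset identifier; see the dataset card for the exact value
# and for the available configurations ("default" or "split").
bookcoref = load_dataset("sapienzanlp/bookcoref")
print(bookcoref)  # prints the available splits and their sizes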

Data format

BookCoref is a collection of annotated books. Each item contains the annotations of one book following the structure of OntoNotes:

{
  doc_id: "pride_and_prejudice_1342", # (str) i.e., ID of the document 
  gutenberg_key: "1342", # (str) i.e., key of the book in Project Gutenberg
  sentences: [["CHAPTER", "I."], ["It", "is", "a", "truth", "universally", "acknowledged", ...], ...], # list[list[str]] i.e., list of word-tokenized sentences
  clusters: [[[79,80], [81,82], ...], [[2727,2728]...], ...], # list[list[list[int]]] i.e., list of clusters' mention offsets
  characters: [
    {
      name: "Mr Bennet", 
      cluster: [[79,80], ...],
    },
    {
      name: "Mr. Darcy",
      cluster: [[2727,2728], [2729,2730], ...],
    }
  ] # list[character], list of character objects, each consisting of a name and mention offsets, i.e., dict[name: str, cluster: list[list[int]]]
}

We also include information on character names, which is not exploited in traditional coreference settings but could be useful in future work.
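As an illustration, the snippet below reads the JSONL annotations and prints the first mention of each character in the first book. It is a minimal sketch that assumes mention offsets index the flattened token sequence of the book with inclusive boundaries (OntoNotes-style), and that data/test.jsonl was produced by download_data.py with the default options.

import json

# Minimal sketch of how the JSONL annotations can be inspected.
# Assumptions: mention offsets index the flattened token sequence of the whole book,
# span boundaries are inclusive (OntoNotes-style), and data/test.jsonl is the file
# produced by download_data.py with the default options.
with open("data/test.jsonl", encoding="utf-8") as f:
    for line in f:
        book = json.loads(line)
        tokens = [token for sentence in book["sentences"] for token in sentence]
        for character in book["characters"]:
            start, end = character["cluster"][0]  # first annotated mention of this character
            mention = " ".join(tokens[start : end + 1])
            print(f'{book["doc_id"]}: {character["name"]} -> "{mention}"')
        break  # inspect only the first book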

BookCoref Evaluation

To evaluate the outputs of a model on the BookCoref benchmark, run the evaluate.py script:

python evaluate.py

options:
  --predictions <path_to_predictions> # Path to the predictions file to evaluate.
  --mode <"full", "split", "gold_window">, default="full" # Evaluation mode.

We provide three evaluation modes:

full
  Evaluates model predictions on the full books of test.jsonl.
  Input: predictions on the full test set books.
  Output: scores on the full books of test.jsonl, referred to as BookCoref_gold results in our paper.

split
  Evaluates model predictions on test_split.jsonl.
  Input: predictions on the split version of our test set books.
  Output: scores on the split version (test_split.jsonl), referred to as Split-BookCoref_gold results in our paper.

gold_window
  Evaluates predictions produced on the full test.jsonl against test_split.jsonl, splitting the predicted clusters every 1500 tokens.
  Input: predictions on the full test set books.
  Output: scores on the split version (test_split.jsonl), referred to as BookCoref_gold+window results in our paper.
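For example, the same full-book predictions used in the next section can be scored under the windowed protocol as follows (command sketch; output omitted):

python evaluate.py --predictions predictions/finetuned_bookcoref/maverick_xl.jsonl --mode gold_window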

Replicate Paper Results

To replicate the results of our paper, run evaluate.py specifying the path to the predictions of the model you are interested in.

Example:

$ python evaluate.py --predictions predictions/finetuned_bookcoref/maverick_xl.jsonl
Evaluation Results:
muc:
  precision: 92.95
  recall: 95.70
  f1: 94.30
b_cubed:
  precision: 43.08
  recall: 77.19
  f1: 55.30
ceafe:
  precision: 37.10
  recall: 30.46
  f1: 33.45
conll2012:
  precision: 57.71
  recall: 67.78
  f1: 61.02
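For reference, each conll2012 value above is the average of the corresponding muc, b_cubed, and ceafe values, e.g., for F1: (94.30 + 55.30 + 33.45) / 3 ≈ 61.02.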

Citation

This work has been published at ACL 2025 (main conference). If you use any of our artifacts (code, data, or model outputs), please cite our paper as follows:

@inproceedings{martinelli-etal-2025-bookcoref,
    title = "{BOOKCOREF}: Coreference Resolution at Book Scale",
    author = "Martinelli, Giuliano  and
      Bonomo, Tommaso  and
      Huguet Cabot, Pere-Llu{\'i}s  and
      Navigli, Roberto",
    editor = "Che, Wanxiang  and
      Nabende, Joyce  and
      Shutova, Ekaterina  and
      Pilehvar, Mohammad Taher",
    booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.acl-long.1197/",
    pages = "24526--24544",
    ISBN = "979-8-89176-251-0",
}

License

The data and software are licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0.
