This repository contains the official code for the ACL 2025 main conference paper: BookCoref: Coreference Resolution at Book Scale by Giuliano Martinelli, Tommaso Bonomo, Pere-Lluís Huguet Cabot and Roberto Navigli. We include the official outputs of the comparison systems outlined in the paper, which can be used to reproduce our results. Our silver training and gold evaluation data are available through this 🤗 Hugging Face dataset.
First of all, clone the repository:
git clone https://github.com/sapienzanlp/bookcoref.git
Then, create a Python virtual environment and install the requirements. We support Python 3.9 and above.
pip install -r requirements.txt
To download the BookCoref data for training and evaluation, run the download_data.py script:
python download_data.py
options:
--format <"jsonl" or "conll">, default="jsonl" # Format of the dataset to download
--configuration <"default" or "split">, default="default" # Configuration of the huggingface dataset, either 'default' or 'split'
--output_dir <path>, default="data/" # Output directory for the dataset
This script downloads the data from 🤗 Hugging Face and saves it in either JSONL or CoNLL format to the default directory data/.
BookCoref is a collection of annotated books. Each item contains the annotations of one book following the structure of OntoNotes:
{
doc_id: "pride_and_prejudice_1342", # (str) i.e., ID of the document
gutenberg_key: "1342", # (str) i.e., key of the book in Project Gutenberg
sentences: [["CHAPTER", "I."], ["It", "is", "a", "truth", "universally", "acknowledged", ...], ...], # list[list[str]] i.e., list of word-tokenized sentences
clusters: [[[79,80], [81,82], ...], [[2727,2728]...], ...], # list[list[list[int]]] i.e., list of clusters' mention offsets
characters: [
{
name: "Mr Bennet",
cluster: [[79,80], ...],
},
{
name: "Mr. Darcy",
cluster: [[2727,2728], [2729,2730], ...],
}
] # list[character], list of character objects, each consisting of a name and mention offsets, i.e., dict[name: str, cluster: list[list[int]]]
}
We also include information on character names, which is not exploited in traditional coreference settings but could be useful in future work.
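To make the schema concrete, here is a minimal sketch of reading a record and resolving mention offsets back to text. The toy record below mirrors the fields shown above; treating each [start, end] pair as inclusive indices into the flattened token list is an assumption for illustration.

```python
# Toy record mirroring the BookCoref schema shown above (not real data).
record = {
    "doc_id": "toy_book_0",
    "sentences": [["Mr.", "Bennet", "smiled", "."], ["He", "said", "nothing", "."]],
    "clusters": [[[0, 1], [4, 4]]],
    "characters": [{"name": "Mr Bennet", "cluster": [[0, 1], [4, 4]]}],
}

# Flatten the word-tokenized sentences into one token list,
# so cluster offsets can index into it directly.
tokens = [tok for sentence in record["sentences"] for tok in sentence]

def mention_text(offset):
    """Recover the surface form of a mention, assuming inclusive offsets."""
    start, end = offset
    return " ".join(tokens[start : end + 1])

for character in record["characters"]:
    mentions = [mention_text(m) for m in character["cluster"]]
    print(f'{character["name"]}: {mentions}')
```

The same loop works on a real book once a line of the downloaded JSONL is parsed with `json.loads`.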
To evaluate the outputs of a model on the BookCoref benchmark, run the evaluate.py script:
python evaluate.py
options:
--predictions <path_to_predictions> # Path to the predictions file to evaluate.
--mode <"full", "split", "gold_window">, default="full" # Evaluation mode.
We provide three evaluation modes:
Mode | Description |
---|---|
full | Evaluates model predictions on the full books of test.jsonl. Input: predictions on the full test set books. Output: scores on the full books of test.jsonl, referred to as BookCoref<sub>gold</sub> results in our paper. |
split | Evaluates model predictions on test_split.jsonl. Input: predictions on the split version of our test set books. Output: scores on the split version (test_split.jsonl), referred to as Split-BookCoref<sub>gold</sub> results in our paper. |
gold_window | Evaluates predictions carried out on the full test.jsonl against test_split.jsonl, by splitting clusters every 1,500 tokens. Input: predictions on the full test set books. Output: scores on the split version (test_split.jsonl), referred to as BookCoref<sub>gold+window</sub> results in our paper. |
To replicate the results of our paper, run evaluate.py specifying the path to the predictions of the model you are interested in.
Example:
$ python evaluate.py --predictions predictions/finetuned_bookcoref/maverick_xl.jsonl
Evaluation Results:
muc:
precision: 92.95
recall: 95.70
f1: 94.30
b_cubed:
precision: 43.08
recall: 77.19
f1: 55.30
ceafe:
precision: 37.10
recall: 30.46
f1: 33.45
conll2012:
precision: 57.71
recall: 67.78
f1: 61.02
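The conll2012 score in the output above is the standard unweighted average of the MUC, B-cubed, and CEAF-e scores, which can be checked directly:

```python
# Verify the CoNLL-2012 F1 reported above as the mean of the three metric F1s.
muc_f1, b_cubed_f1, ceafe_f1 = 94.30, 55.30, 33.45

conll2012_f1 = round((muc_f1 + b_cubed_f1 + ceafe_f1) / 3, 2)
print(conll2012_f1)  # -> 61.02
```

The same averaging applied to the precision and recall columns reproduces 57.71 and 67.78.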
This work has been published at ACL 2025 (main conference). If you use any artifact, please cite our paper as follows:
@inproceedings{martinelli-etal-2025-bookcoref,
title = "{BOOKCOREF}: Coreference Resolution at Book Scale",
author = "Martinelli, Giuliano and
Bonomo, Tommaso and
Huguet Cabot, Pere-Llu{\'i}s and
Navigli, Roberto",
editor = "Che, Wanxiang and
Nabende, Joyce and
Shutova, Ekaterina and
Pilehvar, Mohammad Taher",
booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.acl-long.1197/",
pages = "24526--24544",
ISBN = "979-8-89176-251-0",
}
The data and software are licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0).