This work has been accepted to ACL 2025 Main Conference.
Related Material: Read the paper on arXiv
Retrieval-Augmented Generation (RAG) systems rely heavily on rerankers to identify relevant documents. However, fine-tuning rerankers is challenging due to the limited availability of annotated query-document pairs. Existing distillation-based methods often suffer from training-inference misalignment and overlook interdependencies among candidate documents.
To address these issues, we reformulate reranking as a stochastic attention-mask learning problem and propose Gumbel Reranking, an end-to-end differentiable training framework. This method leverages the Gumbel Trick and Relaxed Top-k Sampling to learn document-wise Top-k attention masks, allowing reranker optimization to be directly supervised by the language model loss.
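As a rough illustration of the sampling machinery, here is a minimal NumPy sketch of Gumbel-perturbed relaxed top-k selection in the style of successive-softmax subset relaxations. It is not the paper's actual implementation; `gumbel_topk_mask`, the default `tau`, and the soft-exclusion step are our own assumptions for illustration.

```python
import numpy as np

def gumbel_topk_mask(scores, k, tau=0.5, rng=None):
    """Draw a relaxed k-hot mask over candidate documents.

    Adds Gumbel noise to reranker scores, then runs k rounds of
    softmax selection, softly excluding mass that was already
    selected. For moderate score scales, the mask approaches a
    hard top-k indicator as tau shrinks.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    u = rng.uniform(low=1e-9, high=1.0, size=scores.shape)
    logits = (scores - np.log(-np.log(u))) / tau  # Gumbel trick
    mask = np.zeros_like(scores, dtype=float)
    for _ in range(k):
        p = np.exp(logits - logits.max())
        p = p / p.sum()                 # softmax over remaining mass
        mask += p                       # accumulate soft selection
        logits = logits + np.log1p(-p)  # softly exclude selected docs
    return mask

scores = np.array([2.0, -1.0, 0.5, 3.0, -2.0])
mask = gumbel_topk_mask(scores, k=2)
# mask is differentiable in `scores`, sums to k, and concentrates
# on the highest-scoring documents
```

Because every step is a softmax, gradients from the downstream language-model loss can flow back through `mask` into the reranker scores.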
Please refer to `environment.yaml` for dependency and environment configuration.
- For NQ and TQA, we use the same datasets as Fusion-in-Decoder. You can download the raw data via the script `data/get_data.sh`.
- For HotpotQA, MuSiQue, and 2WikiHop, please download the raw data directly from their respective official websites.
Please preprocess your dataset into the following JSON format:
```json
{
  "id": "0",
  "question": "What element did Marie Curie name after her native land?",
  "target": "Polonium",
  "answers": ["Polonium", "Po (chemical element)", "Po"],
  "ctxs": [
    {
      "title": "Marie Curie",
      "text": "them on visits to Poland. She named the first chemical element that she discovered in 1898 \"polonium\", after her native country..."
    },
    {
      "title": "Marie Curie",
      "text": "...they announced the existence of an element which they named \"polonium\", in honour of her native Poland..."
    }
  ]
}
```
We also provide preprocessed datasets ready for use: `syhuang/gumbel-reranking-data`.
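For reference, a record in this format can be assembled and checked with plain Python. This is a minimal sketch: the field names follow the example above, while `make_record` and the validation logic are our own illustration.

```python
import json

REQUIRED_KEYS = {"id", "question", "target", "answers", "ctxs"}

def make_record(idx, question, target, answers, passages):
    """Build one training example in the expected JSON format.

    `passages` is a list of (title, text) pairs that become `ctxs`.
    """
    return {
        "id": str(idx),
        "question": question,
        "target": target,
        "answers": answers,
        "ctxs": [{"title": t, "text": x} for t, x in passages],
    }

def validate_record(rec):
    """Check that a record has the expected keys and context fields."""
    assert REQUIRED_KEYS <= rec.keys(), f"missing: {REQUIRED_KEYS - rec.keys()}"
    assert all({"title", "text"} <= c.keys() for c in rec["ctxs"])

rec = make_record(
    0,
    "What element did Marie Curie name after her native land?",
    "Polonium",
    ["Polonium", "Po"],
    [("Marie Curie", "...named \"polonium\", after her native country...")],
)
validate_record(rec)
line = json.dumps(rec)  # one serialized example
```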
Gumbel Reranking is designed for fine-tuning existing rerankers. While it can be trained from scratch, using pretrained rerankers yields better results.
- For RankT5, use `Soyoung97/RankT5-base`.
- For BGE-Reranker, use `BAAI/bge-reranker-base`.
A strong reader is required to provide supervision signals for reranker training. We assume the reader has already been fine-tuned for the specific task.
- For NQ and TQA, you can directly use pretrained checkpoints provided in the Fusion-in-Decoder repo. See `readers/get_model.sh`.
- For HotpotQA, MuSiQue, and 2WikiHop, you need to fine-tune the FiD reader using the official repo and corresponding preprocessed data.
We also release fine-tuned FiD checkpoints on Hugging Face for convenience:
| Dataset | FiD-base Checkpoint | FiD-large Checkpoint |
|---|---|---|
| HotpotQA | syhuang/hopo_reader_base | syhuang/hopo_reader_large |
| MuSiQue | syhuang/musique_reader_base | syhuang/musique_reader_large |
| 2WikiHop | syhuang/2wiki_reader_base | syhuang/2wiki_reader_large |
In addition to using the FiD checkpoints, please make sure to load the original T5 configuration. Specifically, use `google-t5/t5-base` for FiD-base and `google-t5/t5-large` for FiD-large. See the `base_model_path` variable in `run.slurm` for reference.
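Putting the checkpoint table and the config note together, pairing a reader checkpoint with its matching T5 configuration might look like this. `READERS` and `reader_for` are hypothetical helpers; only the model identifiers come from the table and text above.

```python
# Maps each dataset to (FiD reader checkpoint, base T5 config) pairs,
# keyed by model size. All identifiers are from the table above.
READERS = {
    "hotpotqa": {"base": ("syhuang/hopo_reader_base", "google-t5/t5-base"),
                 "large": ("syhuang/hopo_reader_large", "google-t5/t5-large")},
    "musique":  {"base": ("syhuang/musique_reader_base", "google-t5/t5-base"),
                 "large": ("syhuang/musique_reader_large", "google-t5/t5-large")},
    "2wikihop": {"base": ("syhuang/2wiki_reader_base", "google-t5/t5-base"),
                 "large": ("syhuang/2wiki_reader_large", "google-t5/t5-large")},
}

def reader_for(dataset, size="base"):
    """Return (reader_checkpoint, base_model_path) for a dataset/size."""
    return READERS[dataset.lower()][size]

ckpt, base_model_path = reader_for("MuSiQue", "large")
```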
We provide a SLURM script `run.slurm` as a general-purpose training launcher. Please edit the script to configure your dataset paths, model names, and training hyperparameters before execution.

```shell
sbatch run.slurm
```
If you can SSH into the server and run bash scripts directly, you can simply execute:

```shell
bash run.slurm
```
Please refer to `run.slurm` for more details on training paths and hyperparameter settings.