
GINGER: Grounded Information Nugget-Based Generation of Responses

Code style: black

Summary

We present a modular pipeline for Grounded Information Nugget-Based GEneration of Responses (GINGER). The main novelty of our approach compared to existing RAG approaches is that it operates on information nuggets, which are atomic units of relevant information [1]. Given a set of passages retrieved in response to a user query, our approach identifies information nuggets in the top passages, clusters them by query facet, ranks the clusters by relevance, summarizes the top-ranked ones, and refines the response for fluency and coherence. GINGER uniquely models query facets to ensure that the maximum number of unique pieces of information answering the question is included. This approach can significantly improve the user experience by grounding the final response in the source passages and making source attribution easy to verify.

Response Generation Pipeline

In this paper, we propose a modular response generation pipeline that 1) grounds the response in specific facts from the retrieved sources, and 2) uniquely models query facets to ensure that the maximum number of unique pieces of information answering the question is included. This approach can significantly improve the user experience by grounding the final response in the source passages and making source attribution easy to verify.

Figure: Overview of the GINGER response generation pipeline.

Our method operates on information nuggets, defined as 'minimal, atomic units of relevant information' in retrieved documents, which were originally proposed for the automatic evaluation of passage relevance [1]. By operating on information nuggets in all intermediate components of the pipeline, we ensure that the final response is grounded in the source passages and that source attribution is easy to verify. The implementation of the specific pipeline components is described in detail here.
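
To make the flow through the pipeline concrete, the sketch below composes the five stages end to end. It is a minimal illustration only: the function names (detect_nuggets, cluster_by_facet, rank_clusters, summarize_cluster, refine) and the naive heuristics they use (sentence splitting, word-overlap clustering and ranking, plain concatenation) are hypothetical stand-ins and do not mirror the components of the actual implementation in this repository.

```python
# Minimal sketch of the five GINGER stages composed end to end. All function
# names and the naive heuristics used here are illustrative placeholders and
# do not reflect the components of the actual implementation.
import re
from collections import defaultdict


def detect_nuggets(passage: str) -> list[str]:
    # Placeholder: treat every sentence of a retrieved passage as a candidate
    # information nugget.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", passage) if s.strip()]


def cluster_by_facet(nuggets: list[str]) -> list[list[str]]:
    # Placeholder: group nuggets by their first content word (>= 4 characters)
    # as a crude proxy for the query facet they address.
    clusters: dict[str, list[str]] = defaultdict(list)
    for nugget in nuggets:
        words = re.findall(r"\w{4,}", nugget.lower())
        clusters[words[0] if words else ""].append(nugget)
    return list(clusters.values())


def rank_clusters(query: str, clusters: list[list[str]]) -> list[list[str]]:
    # Placeholder: rank clusters by how many query terms their nuggets contain.
    query_terms = set(re.findall(r"\w+", query.lower()))

    def overlap(cluster: list[str]) -> int:
        return sum(
            word in query_terms
            for nugget in cluster
            for word in re.findall(r"\w+", nugget.lower())
        )

    return sorted(clusters, key=overlap, reverse=True)


def summarize_cluster(cluster: list[str]) -> str:
    # Placeholder: keep the shortest nugget as the cluster summary.
    return min(cluster, key=len)


def refine(partial_answers: list[str]) -> str:
    # Placeholder: plain concatenation instead of rewriting for fluency.
    return " ".join(partial_answers)


def generate_response(query: str, passages: list[str], top_k: int = 3) -> str:
    nuggets = [n for p in passages for n in detect_nuggets(p)]
    clusters = rank_clusters(query, cluster_by_facet(nuggets))
    return refine([summarize_cluster(c) for c in clusters[:top_k]])


if __name__ == "__main__":
    passages = [
        "The northern lights are caused by charged particles from the sun "
        "colliding with gases in Earth's upper atmosphere. They are most "
        "visible near the poles.",
        "Auroras appear when solar wind particles are funnelled along the "
        "planet's magnetic field lines.",
    ]
    print(generate_response("What causes the northern lights?", passages))
```

Keeping each stage behind a narrow input/output contract is what makes the pipeline modular: any single component can be swapped out or evaluated in isolation without touching the rest of the pipeline.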

Data

We evaluate our system on the augmented generation task within the recently launched Retrieval-Augmented Generation track at the Text REtrieval Conference (TREC RAG'24) [2].

The input passages, generated data, and human scores collected for evaluation are covered in detail here.

Results

Details about the evaluation can be found here.

Footnotes

  1. Virgil Pavlu, Shahzad Rajput, Peter B. Golbus, and Javed A. Aslam. 2012. IR system evaluation using nugget-based test collections. In WSDM ’12.

  2. Ronak Pradeep, Nandan Thakur, Shivani Upadhyay, Daniel Campos, Nick Craswell, and Jimmy Lin. 2024. Initial Nugget Evaluation Results for the TREC 2024 RAG Track with the AutoNuggetizer Framework. arXiv:2411.09607 [cs.IR]
