How to create pipeline for Re2G Architecture? #4578

muazhari · 2023-04-02T21:47:07Z

Is there an off-the-shelf implementation of the Re2G architecture? If not, how can I create one? I think it would surpass the current SOTA if implemented with this ensemble retrieval method, a re-ranker, and an OpenAI prompt for the generator.

Sources:

https://arxiv.org/abs/2207.06300v1

ZanSara · 2023-04-04T11:04:41Z

Hey @muazhari, this is definitely possible with Haystack. Unfortunately, we don't have a tutorial on exactly this topic, but I can share the definition of a Haystack Pipeline that does something similar to what you describe:

version: 1.15.0

components:
  - name: DocumentStore
    type: ElasticsearchDocumentStore
  - name: BM25Retriever # The keyword-based retriever
    type: BM25Retriever
    params:
      document_store: DocumentStore
      top_k: 20
  - name: EmbeddingRetriever # The vector-based retriever
    type: EmbeddingRetriever
    params:
      document_store: DocumentStore
      embedding_model: sentence-transformers/multi-qa-mpnet-base-dot-v1 # Model optimized for semantic search
      model_format: sentence_transformers
      top_k: 20
  - name: JoinResults # Joins the results from both retrievers
    type: JoinDocuments
    params:
      join_mode: reciprocal_rank_fusion # Applies rank-based scoring to the results
  - name: AnswerGen # Generates candidate answers based on the documents it gets from the retriever
    type: OpenAIAnswerGenerator 
    params:
      api_key: <YOUR API KEY>
      model: text-davinci-003
      max_tokens: 200
      temperature: 0.8
      frequency_penalty: 0.1
      presence_penalty: 0.1
      top_k: 3

pipelines:
  - name: query
    nodes:
      - name: BM25Retriever
        inputs: [Query]
      - name: EmbeddingRetriever
        inputs: [Query]
      - name: JoinResults
        inputs: [BM25Retriever, EmbeddingRetriever]
      - name: AnswerGen
        inputs: [JoinResults]

You can run this pipeline like this:

from haystack import Pipeline

p = Pipeline.load_from_yaml("drafts/pipeline.yaml")
results = p.run(query="What's the capital of France?")
print(results["answers"][0])

See the docs on YAML pipeline definitions: https://docs.haystack.deepset.ai/docs/pipelines#yaml-file-definitions

The pipeline assumes you already have a document store with some documents written to it. If you don't know how to set that up, have a look at the docs (https://docs.haystack.deepset.ai/docs/pipelines#indexing-pipelines) or at the tutorials (for example, https://haystack.deepset.ai/tutorials/01_basic_qa_pipeline).
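To give a feel for what the `reciprocal_rank_fusion` join mode in the YAML does with the two retrievers' results, here is a minimal standalone sketch of rank-based fusion. It is not Haystack's own code: the function name, the document IDs, and the constant `k=60` (a commonly used default) are assumptions for illustration, and Haystack's exact constant and tie-breaking may differ.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed default, not Haystack's value.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical document IDs, as if returned by the two retrievers.
bm25_ids = ["doc_a", "doc_b", "doc_c"]
embedding_ids = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([bm25_ids, embedding_ids])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.4f}")
```

Documents that both retrievers rank highly (here `doc_b` and `doc_a`) end up ahead of documents found by only one retriever, which is why this mode needs only ranks and not comparable scores from BM25 and the embedding model.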

Hope this helps!
