Vlad-2299/InformationRetriever

InformationRetriever

Cosmology-Based Question & Answer System

Elasticsearch Local Server

Install Elasticsearch (https://www.elastic.co/pt/elasticsearch/) and run it as administrator before executing the system.
Version used: 7.11.2
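
Before running the notebooks, it can help to confirm that the local server is up and reports the expected version. The sketch below is not part of the repository: the function name `elasticsearch_version` is hypothetical, and it assumes the server listens on Elasticsearch's default HTTP port (9200) without authentication.

```python
import json
import urllib.error
import urllib.request


def elasticsearch_version(base_url="http://localhost:9200", timeout=2.0):
    """Return the server's version string, or None if it cannot be reached.

    Assumption: base_url points at an unsecured local Elasticsearch node.
    """
    try:
        # The root endpoint of an Elasticsearch node answers with a JSON
        # document that includes a "version" object.
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            info = json.load(response)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON reply.
        return None
    if not isinstance(info, dict):
        return None
    return info.get("version", {}).get("number")
```

Calling `elasticsearch_version()` should return `"7.11.2"` when the version listed above is running, and `None` when nothing answers on that address.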

PDF_Reader.ipynb

Reads and processes all the PDF files inside RawPapers. Outputs a TXT file into ProcessedData, with the text content of the PDF file.
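
The folder-to-folder conversion described above could look roughly like the following sketch. It is an illustration, not the notebook's code: `convert_folder` is a hypothetical helper, and `extract_text` is a placeholder for whichever PDF library call the notebook actually uses, since the README does not name one.

```python
from pathlib import Path


def convert_folder(source_dir, target_dir, extract_text, pattern="*.pdf"):
    """Write one .txt file into target_dir for every matching file in source_dir.

    extract_text is a caller-supplied function (hypothetical here) that
    takes a Path and returns the file's text content as a string.
    """
    source, target = Path(source_dir), Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(source.glob(pattern)):
        text = extract_text(path)
        # Keep the original file name, swapping the extension for .txt.
        out_path = target / (path.stem + ".txt")
        out_path.write_text(text, encoding="utf-8")
        written.append(out_path)
    return written
```

With a real extractor in place, `convert_folder("RawPapers", "ProcessedData", extract_text)` would mirror the behaviour described for this notebook.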

Web_Scraper.ipynb

Reads and processes all the HTML pages inside RawWebPages. Outputs a TXT file into ProcessedData, with the text content of the web page.
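
As an illustration of the HTML-to-text step, here is a minimal sketch that uses only Python's standard library. The notebook may well rely on a dedicated scraping library instead; the class and function names below are hypothetical.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the text content of an HTML string, one text chunk per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

This version puts every text node on its own line, so inline markup such as bold or links splits a sentence across lines. That is acceptable for a sketch, but a real scraper would likely join inline elements.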

DataStore.ipynb

Ensures that a connection with Elasticsearch is established. Reads the saved content in ProcessedData and, for each passage, creates a JSON document that is used to populate the DocumentStore.
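
The passage-to-document step might be sketched as follows. Two things here are assumptions rather than facts from the repository: passages are taken to be separated by blank lines, and the field names `content`, `meta`, `source`, and `passage_id` are illustrative, so they should be adapted to whatever schema the DocumentStore expects.

```python
import json


def split_into_passages(text):
    """Split text on blank lines and drop empty passages (assumed rule)."""
    blocks = text.replace("\r\n", "\n").split("\n\n")
    # Collapse internal line breaks and repeated spaces within each passage.
    return [" ".join(block.split()) for block in blocks if block.strip()]


def build_documents(text, source_name):
    """Turn one processed TXT file into a list of JSON-serialisable dicts."""
    return [
        {"content": passage, "meta": {"source": source_name, "passage_id": index}}
        for index, passage in enumerate(split_into_passages(text))
    ]


if __name__ == "__main__":
    sample = "First passage\nstill first.\n\nSecond passage."
    print(json.dumps(build_documents(sample, "example.txt"), indent=2))
```

Keeping the source file name and a passage index in the metadata makes it possible to trace an answer back to the paper or web page it came from.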

Retriever_Reader.ipynb

Deploys three off-the-shelf components: Pegasus (question reformulator), BM25Retriever (document retriever), and RoBERTa (document reader).
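
To give an idea of what the retrieval stage does, the sketch below ranks passages with a textbook Okapi BM25 score in plain Python. It is a stand-in for explanation only: the notebook uses an off-the-shelf BM25Retriever backed by Elasticsearch, not this code, and the Pegasus and RoBERTa stages are not shown. The function names, the whitespace tokenizer, and the parameter values `k1=1.2` and `b=0.75` are assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Very rough tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def bm25_rank(query, passages, k1=1.2, b=0.75):
    """Return (passage, score) pairs sorted from most to least relevant."""
    tokenized = [tokenize(p) for p in passages]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Number of passages that contain each term at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    query_terms = set(tokenize(query))

    scored = []
    for index, tokens in enumerate(tokenized):
        counts = Counter(tokens)
        score = 0.0
        for term in query_terms:
            tf = counts[term]
            if tf == 0:
                continue
            n_t = doc_freq[term]
            idf = math.log(1 + (n_docs - n_t + 0.5) / (n_t + 0.5))
            # Term frequency saturates with k1; b controls length normalisation.
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scored.append((score, index))

    # Highest score first; ties keep the original passage order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(passages[index], score) for score, index in scored]
```

In the full system, the top-ranked passages from the retriever would then be handed to the reader model, which extracts the answer span.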
