
Hacettepe-University-CMP681-2020-Spring/ir-project-ir-term-project-omer-sahin


Query Reformulation by Keyword Selection

A query reformulation model, trained with reinforcement learning, that selects the query keywords yielding higher precision when retrieving relevant documents.
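To make the idea concrete, here is a minimal sketch of how a keyword selection could be scored by retrieval precision. It is not the project's code: `toy_search`, `reward`, the sample documents and the use of precision@k as the reward are all assumptions made for illustration. The actual search engine and the trained CNN/LSTM/BiLSTM models are in the downloads listed under Files.

```python
from typing import Dict, List, Sequence, Set


def select_keywords(terms: Sequence[str], mask: Sequence[int]) -> List[str]:
    """Keep only the query terms whose mask entry is 1."""
    return [term for term, keep in zip(terms, mask) if keep]


def toy_search(query_terms: Sequence[str], docs: Dict[str, str]) -> List[str]:
    """Hypothetical stand-in for a search engine.

    Ranks documents by the number of distinct query terms they contain and
    drops documents with no overlap. Ties are broken by document id.
    """
    query = set(query_terms)
    scored = []
    for doc_id, text in docs.items():
        overlap = len(query & set(text.lower().split()))
        if overlap:
            scored.append((-overlap, doc_id))
    return [doc_id for _, doc_id in sorted(scored)]


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k positions filled by relevant documents."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in list(retrieved)[:k] if doc_id in relevant)
    return hits / k


def reward(terms, mask, docs, relevant, k):
    """Assumed reward signal: precision@k of the reformulated query."""
    return precision_at_k(toy_search(select_keywords(terms, mask), docs), relevant, k)


# Illustrative data only (not taken from Jeopardy! or TREC-CAR).
DOCS = {
    "d1": "paris is the capital of france",
    "d2": "the capital of the state is austin",
    "d3": "france borders spain",
}
TERMS = ["the", "capital", "of", "france"]
RELEVANT = {"d1", "d3"}

full_query = reward(TERMS, [1, 1, 1, 1], DOCS, RELEVANT, k=2)
keyword_only = reward(TERMS, [0, 0, 0, 1], DOCS, RELEVANT, k=2)
print(full_query, keyword_only)
```

In this toy setting, keeping only the term "france" scores higher than the full query because the common words pull in an irrelevant document. A reinforcement learning setup would use a signal of this kind to prefer selections that improve retrieval.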

Dataset

Jeopardy! TV Show

https://www.kaggle.com/tunguz/200000-jeopardy-questions

TREC - Complex Answer Retrieval (TREC-CAR)

http://trec-car.cs.unh.edu/

Files

Indexed Data

  • Indexed articles and queries
  • Word embedding matrix and word tokenizer
  • Search engine

https://drive.google.com/open?id=1xoquzwTFES00TFWYKkQ6KLhm7wlTGJtu

Trained Models

  • CNN, LSTM, BiLSTM and retrained CNN models

https://drive.google.com/open?id=1CT1HGvBhXMiTLeeZ6J6isxghHMylYBeM

Usage

  1. Index Dataset

  2. Train Model

    • Set the search engine path in search.py
    • Set the dataset path in train.py
    • Set the model output path in train.py, select the network type (CNN, LSTM or BiLSTM), and run train.py
  3. Evaluate

You can start from any step, provided you already have the files that step requires.
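The settings mentioned in the training step can be gathered and checked before a run, as in the sketch below. `TrainSettings`, its field names, `missing_inputs` and the example paths are hypothetical. The repository only states that these values are set inside search.py and train.py, so this shows one possible arrangement and not how those scripts are actually written.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List

# Network types named in the usage steps.
SUPPORTED_NETWORKS = ("CNN", "LSTM", "BiLSTM")


@dataclass(frozen=True)
class TrainSettings:
    """Hypothetical container for the values the training step asks for."""

    search_engine_path: Path
    dataset_path: Path
    model_output_path: Path
    network: str = "CNN"

    def __post_init__(self) -> None:
        # Reject network names other than the three listed above.
        if self.network not in SUPPORTED_NETWORKS:
            raise ValueError(
                f"network must be one of {SUPPORTED_NETWORKS}, got {self.network!r}"
            )


def missing_inputs(settings: TrainSettings) -> List[Path]:
    """Return the required input paths that do not exist yet."""
    required = (settings.search_engine_path, settings.dataset_path)
    return [path for path in required if not path.exists()]


# Placeholder paths; replace them with the locations of the downloaded files.
settings = TrainSettings(
    search_engine_path=Path("data/search_engine"),
    dataset_path=Path("data/dataset"),
    model_output_path=Path("models/bilstm"),
    network="BiLSTM",
)
for path in missing_inputs(settings):
    print(f"Missing required input: {path}")
```

A check like this shows whether training can start directly from the downloaded indexed data or whether the indexing step has to be run first.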

About

ir-project-ir-term-project-omer-sahin created by GitHub Classroom
