Developed and tested on Windows 10 inside a venv environment using Python 3.7.7 and pip 19.2.3.
Set up your environment by installing the requirements with pip:
pip install -r requirements.txt
- Copy the config.example file to config.py.
cp config.example config.py
- Run fetch_raw.py to retrieve the data for the PubMed articles defined in the qrel files from the 2017 CLEF eHealth Lab.
python fetch_raw.py
- Run insert_docs.py to insert the fetched article data from PubMed into a local database.
python insert_docs.py
- Run fetch_validity.py to check the database against the original qrel files.
python fetch_validity.py
- Run clean_docs.py to preprocess the articles and store them as a feature matrix.
python clean_docs.py
- Run run_experiments.py to measure performance on the baseline (using all data) and on the selected datasets.
python run_experiments.py
- Run result_analysis.py to create plots and significance tests.
python result_analysis.py
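
The steps above have to run in the listed order, since each script depends on the output of the previous one. As a convenience, they could be chained from a small helper script. The sketch below is not part of the repository: the helper names `build_commands` and `run_pipeline` are hypothetical, and it assumes that every script takes no command-line arguments, that config.py already exists, and that it is started from the repository root.

```python
import subprocess
import sys

# Script order as listed in the steps above.
PIPELINE = [
    "fetch_raw.py",
    "insert_docs.py",
    "fetch_validity.py",
    "clean_docs.py",
    "run_experiments.py",
    "result_analysis.py",
]


def build_commands(scripts=PIPELINE, python=sys.executable):
    """Return one argument list per script (hypothetical helper).

    Defaults to the interpreter running this code, so an active venv is reused.
    """
    return [[python, script] for script in scripts]


def run_pipeline(commands, runner=subprocess.run):
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        print("Running:", " ".join(cmd))
        # check=True raises subprocess.CalledProcessError on a non-zero exit
        # code, so later steps never run on incomplete data.
        runner(cmd, check=True)
```

Calling `run_pipeline(build_commands())` would then execute all six steps in sequence. Stopping at the first failing script is a deliberate choice here: continuing after a failed fetch or insert would most likely produce misleading experiment results.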