AMCeScience/feature-miner-pub

Feature miner

Developed and tested on Windows 10 inside a venv using Python 3.7.7 and pip 19.2.3.

Set up your environment by installing the requirements with pip:

pip install -r requirements.txt

Copy the config.example file to config.py. A database with dummy data is provided here; place the file at /Database/miner_database.db. The data used in the research is available upon request (contact: a.j.vanaltena@amsterdamumc.nl) or may be collected from PubMed using the qrel files from the 2017 CLEF eHealth Lab. Follow the steps below to perform the experiments.
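
The preparation described above can also be scripted. The following is a minimal sketch, not part of the repository: the helper name `prepare_workspace` is hypothetical, and it assumes that the leading slash in /Database/miner_database.db refers to the repository root rather than the filesystem root.

```python
import shutil
from pathlib import Path


def prepare_workspace(repo_root):
    """Copy config.example to config.py and report whether the database is in place."""
    root = Path(repo_root)
    config = root / "config.py"
    if not config.exists():
        # Leave an existing config.py untouched so local edits are not lost.
        shutil.copyfile(root / "config.example", config)
    # Assumption: /Database/... is relative to the repository root.
    database = root / "Database" / "miner_database.db"
    return database.is_file()
```

Calling `prepare_workspace(".")` from the repository root returns False as long as the database file has not been placed in the Database folder.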

Setup

  1. Clean the raw articles
python clean_articles.py
  2. Build the feature matrices
python create_feature_matrices.py
  3. Run the grid searches
python Grid_search/leaveoneout/rf_random_search.py
python Grid_search/onevsone/rf_random_search.py

Note: the results of the grid searches are written to a CSV file in the Grid_search/leaveoneout/ and Grid_search/onevsone/ directories, respectively.
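
To run the setup scripts in one go and then locate the search output, a small driver along these lines may help. It is a sketch under assumptions: the names `run_setup` and `find_result_csvs` are hypothetical, the scripts are assumed to be started from the repository root (as in the commands above), and the CSV file names are not documented here, so the sketch simply collects every .csv file in the two output directories.

```python
import subprocess
import sys
from pathlib import Path

# Script order taken from the setup steps above.
SETUP_SCRIPTS = [
    "clean_articles.py",
    "create_feature_matrices.py",
    "Grid_search/leaveoneout/rf_random_search.py",
    "Grid_search/onevsone/rf_random_search.py",
]
RESULT_DIRS = ["Grid_search/leaveoneout", "Grid_search/onevsone"]


def run_setup(repo_root, scripts=SETUP_SCRIPTS, runner=subprocess.run):
    """Run each script in order; check=True raises on the first failing script."""
    for script in scripts:
        # sys.executable keeps the run inside the active (virtual) environment.
        runner([sys.executable, script], cwd=str(repo_root), check=True)


def find_result_csvs(repo_root, result_dirs=RESULT_DIRS):
    """Return the CSV files found in the search output directories."""
    root = Path(repo_root)
    found = []
    for directory in result_dirs:
        found.extend(sorted((root / directory).glob("*.csv")))
    return found
```

Because `check=True` raises `subprocess.CalledProcessError` on a non-zero exit code, a failed cleaning step stops the run before the feature matrices are built from incomplete input.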

Experiments

Create a folder named after the experiment run and edit CLASSIFIER_LOCATION in config.py accordingly. The config.example file uses the folder name run1.
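
Creating the run folder can be sketched as follows. The helper `ensure_run_folder` is hypothetical, and it assumes the folder lives in the repository root. Whether CLASSIFIER_LOCATION expects the bare folder name, a relative path, or an absolute path is not stated here, so check config.example before editing config.py.

```python
from pathlib import Path


def ensure_run_folder(repo_root, run_name="run1"):
    """Create the folder for one experiment run if it does not exist yet."""
    # Assumption: run folders sit directly under the repository root.
    folder = Path(repo_root) / run_name
    folder.mkdir(parents=True, exist_ok=True)
    return folder
```

Using a new folder name for each run keeps the output of earlier experiments from being overwritten.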

  1. Run the classifiers
python run_leaveoneout.py
python run_onevsone.py
python run_nvsone.py
python run_nvsone_random.py

# Fetch timing difference results for two training set sizes
python run_nvsone_timing.py
  2. Interpret the outcomes

Note: calculating the correlations requires a metadata file. You may find this file for the fifty reviews used in our research here. For testing purposes we also provide a dummy set.

python make_plots.py
python calculate_correlations.py
  3. When writing the paper
python prepare_metadata.py
