Feature miner

Developed and tested on Windows 10 inside a venv virtual environment using Python 3.7.7 and pip 19.2.3.

Set up your environment by installing the requirements using pip.

pip install -r requirements.txt

Copy the config.example file to config.py. A database with dummy data is provided; place the file at /Database/miner_database.db. The data used in the research is available upon request (contact: a.j.vanaltena@amsterdamumc.nl) or may be collected from PubMed using the qrel files from the 2017 CLEF eHealth Lab. Follow the steps below to perform the experiments.
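The copy step and the database check can also be scripted. The following is a minimal sketch, not part of the repository: the helper name `prepare_config` is hypothetical, and it assumes that `/Database/miner_database.db` is meant relative to the repository root.

```python
import shutil
from pathlib import Path


def prepare_config(repo_root="."):
    """Copy config.example to config.py if needed and check for the database.

    Returns the path to config.py and a flag telling whether the database
    file is present. Hypothetical helper, not shipped with the repository.
    """
    root = Path(repo_root)
    example = root / "config.example"
    target = root / "config.py"

    # Never overwrite an existing config.py, since it may hold local edits.
    if not target.exists():
        shutil.copyfile(str(example), str(target))

    # Assumption: the database path is relative to the repository root.
    database = root / "Database" / "miner_database.db"
    return target, database.is_file()
```

Calling `prepare_config()` from the repository root returns the path to config.py and `True` once the database file is in place.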

Setup

Clean the raw articles

python clean_articles.py

Build the feature matrices

python create_feature_matrices.py

Run the grid searches

python Grid_search/leaveoneout/rf_random_search.py
python Grid_search/onevsone/rf_random_search.py

Note: each grid search writes its results to a CSV file in its own directory, Grid_search/leaveoneout/ and Grid_search/onevsone/ respectively.
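To confirm that both searches produced output, the result files can be listed from Python. This is a minimal sketch under stated assumptions: the helper `find_search_results` is hypothetical, and because the CSV file names are not documented here, it simply collects every `.csv` file in the two directories.

```python
from pathlib import Path

# Result directories named in the note above.
RESULT_DIRS = ("Grid_search/leaveoneout", "Grid_search/onevsone")


def find_search_results(repo_root="."):
    """Map each search directory to the sorted list of CSV files it contains."""
    root = Path(repo_root)
    results = {}
    for directory in RESULT_DIRS:
        folder = root / directory
        # A missing directory yields an empty list instead of an error.
        results[directory] = sorted(folder.glob("*.csv")) if folder.is_dir() else []
    return results
```

An empty list for either directory suggests that the corresponding search has not been run yet or did not finish.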

Experiments

Create a folder named after the experiment run and edit CLASSIFIER_LOCATION in config.py accordingly. The config.example file uses the folder name run1.
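Creating the run folder can be scripted as well. The sketch below is an illustration only: `create_run_folder` is a hypothetical helper, it assumes the folder lives in the repository root, and it refuses to reuse a non-empty folder as a precaution against mixing output from different runs. CLASSIFIER_LOCATION in config.py still has to be edited by hand.

```python
from pathlib import Path


def create_run_folder(name, repo_root="."):
    """Create the folder for an experiment run and return its path.

    Raises FileExistsError if the folder already holds files, so an
    earlier run is not mixed with a new one (a design choice of this sketch).
    """
    run_dir = Path(repo_root) / name
    run_dir.mkdir(parents=True, exist_ok=True)

    # An existing but empty folder is fine; a populated one is not.
    if any(run_dir.iterdir()):
        raise FileExistsError("Run folder is not empty: {}".format(run_dir))
    return run_dir
```

For example, `create_run_folder("run1")` matches the folder name used in config.example.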

Run the classifiers

python run_leaveoneout.py
python run_onevsone.py
python run_nvsone.py
python run_nvsone_random.py

# Fetch timing difference results for two training set sizes
python run_nvsone_timing.py
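The scripts above can be launched one after another from a small wrapper. This is a minimal sketch, not part of the repository: `run_scripts` is a hypothetical helper, and running the scripts in the listed order and stopping at the first failure are assumptions of the sketch, not documented requirements.

```python
import subprocess
import sys

# Script names as listed above.
CLASSIFIER_SCRIPTS = [
    "run_leaveoneout.py",
    "run_onevsone.py",
    "run_nvsone.py",
    "run_nvsone_random.py",
    "run_nvsone_timing.py",
]


def run_scripts(scripts, cwd="."):
    """Run each script with the current interpreter and return those that finished."""
    completed = []
    for script in scripts:
        # check=True raises CalledProcessError on a non-zero exit code,
        # which stops the loop at the first failing script.
        subprocess.run([sys.executable, script], cwd=str(cwd), check=True)
        completed.append(script)
    return completed
```

Using `sys.executable` keeps the child processes inside the active venv. Calling `run_scripts(CLASSIFIER_SCRIPTS)` from the repository root runs all five scripts.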

Interpret the outcomes

Note: calculating the correlations requires a metadata file. The file for the fifty reviews used in our research is provided, along with a dummy set for testing purposes.

python make_plots.py
python calculate_correlations.py

When writing the paper

python prepare_metadata.py

Repository: AMCeScience/feature-miner-pub
