CSE517 Course Project: PromptEHR

Installation

To separate the packages for this project from other Python environments on the system, create a new conda environment. Our scripts were all based on Python 3.9:

conda create --name cse517 python=3.9
conda activate cse517

Install the base package requirements:

pip install -r requirements.txt

Install PyTorch with pip by following the instructions on the PyTorch website. The command should look something like:

pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116

The PyTrial package should already have been installed from the requirements.txt file. PyTrial contains the most up-to-date code for PromptEHR. If the package was not installed properly, install it directly:

pip install pytrial
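
The installation steps above can be sanity-checked from inside the activated environment. The following is a minimal sketch, not part of the repository: the function name check_environment is hypothetical, and it assumes the packages are importable as torch and pytrial. It only checks that the packages can be found, not that they work with your GPU.

```python
import importlib.util
import sys

# Python version this README says the scripts were based on.
EXPECTED_PYTHON = (3, 9)
# Assumed import names of the packages installed above.
REQUIRED_PACKAGES = ("torch", "pytrial")


def check_environment(packages=REQUIRED_PACKAGES):
    """Return a dict describing whether the environment matches the README."""
    report = {"python_ok": sys.version_info[:2] == EXPECTED_PYTHON}
    for name in packages:
        # find_spec returns None when a top-level package cannot be found.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'missing or mismatched'}")
```

Any entry reported as missing suggests repeating the corresponding install step above.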

Full documentation of the group's setup process can be found here.

Preprocessing

More information about converting the CSV files from the MIMIC-III dataset into the files required for training can be found in this markdown file.

Training

To train the model using the hyperparameters from the PromptEHR paper, simply run the training script:

python train.py

Some parameters, such as the number of epochs, batch size, number of training samples, and evaluation frequency, can be updated by changing the constants defined in the train.py file.
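
Before changing these constants, it can help to estimate how many optimizer steps and evaluations a given setting implies. The sketch below is illustrative only: the constant names and values are hypothetical (check train.py for the real ones), and it assumes the evaluation frequency is measured in training steps.

```python
import math

# Hypothetical names and values, for illustration only;
# train.py defines the real constants and defaults.
NUM_EPOCHS = 5
BATCH_SIZE = 16
NUM_TRAIN_SAMPLES = 15_000
EVAL_EVERY_STEPS = 1_000


def training_schedule(num_samples, batch_size, num_epochs, eval_every):
    """Return (steps per epoch, total steps, number of evaluations)."""
    # The last batch of an epoch may be smaller, hence the ceiling.
    steps_per_epoch = math.ceil(num_samples / batch_size)
    total_steps = steps_per_epoch * num_epochs
    # Assumes one evaluation every `eval_every` completed steps.
    num_evals = total_steps // eval_every
    return steps_per_epoch, total_steps, num_evals


if __name__ == "__main__":
    print(training_schedule(NUM_TRAIN_SAMPLES, BATCH_SIZE,
                            NUM_EPOCHS, EVAL_EVERY_STEPS))
```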

Evaluation

The code to evaluate the perplexity, privacy, and utility of the models is in the evaluate.ipynb notebook file. This file assumes there is a fully trained model in the folder ./model_50_epochs_30k_samples and a partially trained model in the folder ./model_20_epochs_15k_samples. These folders are ignored by git as they are too large to push to the repository.
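
Because the model folders are not in the repository, it may be worth confirming they exist before opening the notebook. This is a small sketch rather than code from the project: the helper name missing_model_dirs is hypothetical, and the folder names are the two mentioned above.

```python
from pathlib import Path

# Folder names that evaluate.ipynb is described as expecting.
MODEL_DIRS = ("model_50_epochs_30k_samples", "model_20_epochs_15k_samples")


def missing_model_dirs(root="."):
    """Return the expected model folders that are absent under `root`."""
    root = Path(root)
    return [name for name in MODEL_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_model_dirs()
    if missing:
        print("Missing model folders:", ", ".join(missing))
    else:
        print("All model folders found.")
```

Run it from the repository root, since the notebook refers to the folders by relative path.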

Computational Requirements

We trained the model for 5 epochs on an NVIDIA V100 GPU on Google Cloud Platform. The original paper requires 251 GB of RAM to train on the whole dataset for 16 epochs, so plan for this beforehand and choose a platform that offers suitable GPUs and enough memory.
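
One way to check a machine against the memory figure quoted above is to read the total physical RAM before starting a full run. The sketch below is an assumption-laden helper, not part of the project: the function names are hypothetical, and it relies on os.sysconf, which is available on Linux and some other Unix systems but not on Windows.

```python
import os

# RAM figure quoted above for the whole dataset with 16 epochs.
REQUIRED_RAM_GB = 251


def total_ram_gb():
    """Return total physical RAM in GB, or None if it cannot be determined."""
    try:
        page_size = os.sysconf("SC_PAGE_SIZE")
        num_pages = os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        # os.sysconf or these names are unavailable on this platform.
        return None
    return page_size * num_pages / 1e9


def has_enough_ram(available_gb, required_gb=REQUIRED_RAM_GB):
    """Return True only when the RAM is known and meets the requirement."""
    return available_gb is not None and available_gb >= required_gb


if __name__ == "__main__":
    ram = total_ram_gb()
    print("Total RAM (GB):", ram)
    print("Enough for the full run:", has_enough_ram(ram))
```

A smaller machine can still be used by reducing the number of training samples, as we did.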

Useful Links

CSE 517 Project Instructions
PromptEHR Paper
PromptEHR GitHub
Reproducibility Report
