This repository contains the implementation of a class project for the 10-708 Probabilistic Graphical Models course at Carnegie Mellon University. The goal of this project is to enhance the posterior approximation of NVIDIA's NVAE using a custom three-layer Restricted Boltzmann Machine (RBM) trained on MIMIC-CXR data.
This project builds on NVIDIA's Nouveau Variational Autoencoder (NVAE) to generate synthetic chest X-rays with enhanced latent representation, integrating radiological findings and anatomy correlations. The pipeline involves:
- Parsing MIMIC-CXR reports to extract structured labels.
- Training a custom RBM for posterior sampling.
- Using the RBM-enhanced posterior for training NVAE.
- Comparing NVAE results with those of a standard Variational Autoencoder (VAE).
- The project uses MIMIC-CXR JPEG images and corresponding radiology reports.
- The data should be placed in the directory `./file-1024/path/to/study`.
- Each radiology report should be a text file located in a nested structure (`p*/p*/s*.txt`).
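To check that the reports are laid out as expected before parsing, the nested pattern can be enumerated directly. The following is a minimal sketch, not part of the repository; the helper name `find_reports` is hypothetical.

```python
from pathlib import Path


def find_reports(root):
    """Return sorted paths of report files matching p*/p*/s*.txt under root."""
    # Hypothetical helper: only the glob pattern comes from the layout described above.
    return sorted(Path(root).glob("p*/p*/s*.txt"))


if __name__ == "__main__":
    for path in find_reports("./file-1024/path/to/study"):
        print(path)
```

An empty result usually means the dataset was unpacked one level too high or too low relative to the expected directory.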
Use `script.sh` to extract structured fields (`INDICATION`, `TECHNIQUE`, `COMPARISON`, `FINDINGS`, `IMPRESSION`, and generic report text) from radiology reports and save them as a CSV.
bash script.sh [-v] ./file-1024/path/to/study
- Arguments:
  - `-v`: Enables verbose mode for detailed logs.
  - `<directory_path>`: Path to the directory containing radiology reports.
- Output:
  - `output.csv`: A CSV file containing structured data extracted from all reports. The fields include `INDICATION`, `TECHNIQUE`, `COMPARISON`, `FINDINGS`, `IMPRESSION`, and any generic report text.
  - `script_log.txt`: A log file with details about the processing, including any warnings for missing files or headers.
- Example Usage: To parse all reports in the directory and log detailed progress:
  bash script.sh -v ./file-1024/path/to/study
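For readers who want to see the kind of extraction this step performs, here is a rough Python sketch of splitting one report into the named sections. It is an illustration under assumptions, not the logic of `script.sh`: it assumes each header appears at the start of a line followed by a colon, and the `REPORT_TEXT` key for the generic text is a hypothetical name.

```python
import re

SECTIONS = ["INDICATION", "TECHNIQUE", "COMPARISON", "FINDINGS", "IMPRESSION"]
# Matches a known header at the start of a line, e.g. "FINDINGS:".
HEADER_RE = re.compile(r"^\s*(%s)\s*:" % "|".join(SECTIONS), re.MULTILINE)


def parse_report(text):
    """Split report text into the five named sections plus leftover text."""
    fields = {name: "" for name in SECTIONS}
    matches = list(HEADER_RE.finditer(text))
    # Text before the first header is kept as generic report text
    # ("REPORT_TEXT" is a hypothetical column name).
    generic_end = matches[0].start() if matches else len(text)
    fields["REPORT_TEXT"] = " ".join(text[:generic_end].split())
    for i, m in enumerate(matches):
        # A section runs from its header to the next header or end of text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        fields[m.group(1)] = " ".join(text[m.end():end].split())
    return fields
```

Sections missing from a report come back as empty strings, which mirrors the idea of logging a warning for a missing header rather than failing.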
The `threelayerRBM.py` script implements a custom three-layer Restricted Boltzmann Machine (RBM) for posterior sampling. It is trained using the structured data generated by `script.sh` and stored in `output.csv`.
- Run the script to preprocess the data and train the RBM:
python threelayerRBM.py
- Outputs:
  - `rbm_model.pkl`: Trained RBM model.
  - `vectorizer_v.pkl` and `hidden_vectorizers.pkl`: Vectorizers for encoding visible and hidden layer features.
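As background on what RBM training involves, the sketch below shows a standard binary RBM with a single visible and a single hidden layer, updated with one step of contrastive divergence (CD-1). This is deliberately simpler than the custom three-layer model in `threelayerRBM.py` and is not taken from it; the class name `TinyRBM` and all hyperparameters are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyRBM:
    """Binary RBM with one visible and one hidden layer (illustrative only)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.W = [[self.rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b = [0.0] * n_visible  # visible biases
        self.c = [0.0] * n_hidden   # hidden biases

    def hidden_probs(self, v):
        # p(h_j = 1 | v) = sigmoid(c_j + sum_i v_i * W_ij)
        return [sigmoid(self.c[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.c))]

    def visible_probs(self, h):
        # p(v_i = 1 | h) = sigmoid(b_i + sum_j W_ij * h_j)
        return [sigmoid(self.b[i] + sum(self.W[i][j] * h[j] for j in range(len(h))))
                for i in range(len(self.b))]

    def sample(self, probs):
        # Draw independent Bernoulli samples from the given probabilities.
        return [1.0 if self.rng.random() < p else 0.0 for p in probs]

    def cd1_step(self, v0, lr=0.1):
        """Apply one CD-1 update from a single binary visible vector v0."""
        ph0 = self.hidden_probs(v0)
        h0 = self.sample(ph0)
        v1 = self.sample(self.visible_probs(h0))
        ph1 = self.hidden_probs(v1)
        for i in range(len(self.b)):
            for j in range(len(self.c)):
                # Positive-phase statistics minus negative-phase statistics.
                self.W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
            self.b[i] += lr * (v0[i] - v1[i])
        for j in range(len(self.c)):
            self.c[j] += lr * (ph0[j] - ph1[j])
```

In this simplified picture, a binary visible vector would be built from the vectorized report fields, and `hidden_probs` gives the conditional distribution from which hidden units are sampled.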
Run the modified train.py script to train the NVAE using the RBM-enhanced posterior.
- Train NVAE with:
python train.py --root .
- Outputs:
  - Model checkpoint in `eval-exp` directory.
  - Logs for TensorBoard visualization.
- Evaluate NVAE using the `NVAE_evaluation.ipynb` notebook.
- For baseline comparisons, train and evaluate a normal VAE using `normalvae.ipynb`.
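For context on what "posterior approximation" refers to when comparing the two models, a standard VAE is trained by maximizing the evidence lower bound (ELBO) shown below. This is the textbook single-latent form, given here as background only; how the RBM-enhanced posterior enters the objective in the modified `train.py` is not described in this README.

$$
\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] \;-\; \mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big)
$$

Here $q_\phi(z \mid x)$ is the approximate posterior produced by the encoder, $p_\theta(x \mid z)$ is the decoder likelihood, and $p(z)$ is the prior over latents. NVAE uses a hierarchical version of the same bound, with one KL term per group of latent variables.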
.
├── file-1024/path/to/study/ # Directory containing MIMIC-CXR JPEG and reports
├── output.csv # Processed radiology reports
├── script.sh # Script to parse reports into CSV
├── threelayerRBM.py # Three-layer RBM implementation
├── train.py # NVAE training script
├── NVAE_evaluation.ipynb # Notebook for NVAE evaluation
├── normalvae.ipynb # Notebook for baseline VAE training and evaluation
├── eval-exp/ # Directory for NVAE training outputs
└── README.md # Project documentation
Clone this repository:
git clone https://github.com/your-repo/project-name.git
cd project-name
Install dependencies:
pip install -r requirements.txt
Download and organize the MIMIC-CXR dataset in the directory structure:
./file-1024/path/to/study
- MIMIC-CXR Dataset: Johnson AEW, Pollard TJ, Shen L, et al. MIMIC-CXR.
- NVAE Implementation: Based on NVAE by NVIDIA.
- CMU 10-708 Course: Probabilistic Graphical Models, Fall 2024.