Multi-View Learning for Addressing Unbalanced/Partial Data Modalities in Glioblastoma Survival Prediction
Glioblastoma (GBM) is an aggressive brain tumor with a poor prognosis. Accurate survival prediction is crucial for personalized treatment planning. Our study uses multi-view learning to handle missing or unbalanced data modalities in GBM survival prediction.
Survival prediction in Glioblastoma patients is challenging due to:
- Heterogeneous data: Imaging, pathology, molecular, and clinical features vary across patients.
- Missing modalities: Not all patients have complete data, and deep learning models that expect every modality as input become unreliable when one is absent.
- Interpretability issues: Black-box models lack transparency, reducing their adoption in clinical settings.
We propose a multi-view learning framework that integrates incomplete data modalities to enhance survival prediction. Our methodology includes:
- Data preprocessing and feature extraction across MRI, histopathology, and clinical data.
- Multi-modal deep learning models trained on incomplete data.
- Interpretability techniques for model transparency and clinical trust.
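One simple way to combine incomplete modalities is to fuse only the views that are actually present for a patient. The sketch below is illustrative and is not the architecture used in this project: the function name `fuse_available_views`, the modality keys, and the choice of a plain average are all assumptions made for the example.

```python
from typing import Dict, List, Optional

Embedding = List[float]


def fuse_available_views(views: Dict[str, Optional[Embedding]]) -> Embedding:
    """Average the embeddings of whichever modalities are present.

    A modality mapped to None is treated as missing and skipped, so the
    fused vector is not diluted by placeholder zeros.
    """
    present = [vec for vec in views.values() if vec is not None]
    if not present:
        raise ValueError("at least one modality is required")
    dim = len(present[0])
    if any(len(vec) != dim for vec in present):
        raise ValueError("all embeddings must share one dimension")
    # Element-wise mean over the available views only.
    return [sum(vec[i] for vec in present) / len(present) for i in range(dim)]


# Hypothetical patient with MRI and clinical embeddings but no histopathology.
patient = {
    "mri": [1.0, 2.0, 3.0],
    "histopathology": None,
    "clinical": [3.0, 0.0, 1.0],
}
fused = fuse_available_views(patient)
```

In a trained model the average would typically be replaced by a learned fusion layer, but the masking idea stays the same: missing views contribute nothing rather than a fabricated value.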
We utilized publicly available datasets from The Cancer Imaging Archive (TCIA):
- MRI & Radiomic Data: Download (69 GB, 630 participants, 10,645 files in NIfTI format)
  - Includes MRI scans, segmented tumor images, and radiomic features.
  - Additional details: DOI Link
- Histopathology Images: Download (34 patients, 71 images)
- Multi-parametric MRI scans from UPENN-GBM collection.
- More details: DOI Link
We preprocessed the data as follows:
- Converted MRI images from DICOM to NIfTI.
- Extracted tumor-specific features using radiomics principles.
- Tiled histopathology images into smaller patches to improve computational efficiency.
- Performed Exploratory Data Analysis (EDA) on clinical data, handling missing values and feature correlations.
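The tiling step above can be sketched as a generator of patch bounding boxes. This is a minimal illustration, not the project's actual code: the function name `tile_boxes`, the 512-pixel tile size, and the decision to clip edge patches rather than pad them are assumptions.

```python
from typing import Iterator, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def tile_boxes(width: int, height: int, tile: int, stride: int = 0) -> Iterator[Box]:
    """Yield patch bounding boxes that cover a width x height image.

    stride defaults to the tile size, which gives non-overlapping patches.
    Patches on the right and bottom edges are clipped to the image bounds.
    """
    step = stride or tile
    if tile <= 0 or step <= 0:
        raise ValueError("tile and stride must be positive")
    for top in range(0, height, step):
        for left in range(0, width, step):
            yield (left, top, min(left + tile, width), min(top + tile, height))


# Hypothetical 1000 x 600 pixel slide region split into 512-pixel tiles.
boxes = list(tile_boxes(width=1000, height=600, tile=512))
```

Each box uses the (left, upper, right, lower) order that Pillow's `Image.crop` expects, so the patches could be cut out one at a time without loading every tile into memory at once.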
Our main findings and next steps:
- The multi-view model outperformed single-modality models in survival prediction.
- The interpretability approach helped identify key biomarkers for prognosis.
- Future work includes expanding datasets and improving generalization.
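The comparison between multi-view and single-modality models needs a survival-aware metric. This README does not say which metric was used; the sketch below assumes Harrell's concordance index (C-index), a common choice for censored survival data, and the function and variable names are hypothetical. Pairs with tied survival times are ignored for simplicity.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risks: Sequence[float],
) -> float:
    """Fraction of comparable patient pairs that the risk scores order correctly.

    A pair (i, j) is comparable when patient i had an observed event
    (events[i] == 1) and a shorter survival time than patient j.
    The pair is concordant when patient i also has the higher risk score;
    tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: survival in months, event flag (1 = death observed), predicted risk.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]
c_index = concordance_index(times, events, risks)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so two models can be compared by computing this score on the same held-out patients.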
Our work is supported by recent research papers: