Information Maximization-Based Clustering of Histopathology Images Using Deep Learning

This is the repository that has the code and figures for the paper titled Information Maximization-Based Clustering of Histopathology Images Using Deep Learning.

128, which is a folder contains models (in folder models14) for 14-cluster set using 128x128 pixels patches, script for training this particular model (part1_kpc_cae_128_14.ipynb), clustering output using this model (part2_clustering.ipynb) and umap plotting (part3_umap.ipynb) using upper-dimensional latent features obtained from this model. Similarly, Folder 64 includes models (in folder models14) for 14-cluster set using 64x64 pixels patches, script for training this particular model (part1_kpc_cae_64_14.ipynb), clustering output using this model (part2_clustering.ipynb) and umap plotting (part3_umap.ipynb) using upper-dimensional latent features attained from this model.

We have only shown these two models because they have been designated as the optimal cluster sets by internal validation metrics. By changing in one place, the scripts can be run for cluster sets using 8, 9, 10, 11, 12, 13, 15, 16, 17 and 18 as well. We mentioned where to change in the training scipts (part1_kpc_cae_128_14.ipynb and part1_kpc_cae_64_14.ipynb). Check that out.

'Figures' (another folder) has every figure used in the manuscript (including Supporting information segment). The Supporting information figures shown in the paper can be found in the Additional Information subfolder here. Besides, some high resolution WSIs can also be found in the Figures folder.
'MODULE' folder has been used to calculate internal validation indices like Xie-Beni index, Calinski-Harabasz index, C index, Dunn index, Hartigan index and Mclain-Rao index for both patch sizes (128x128 and 64x64 pixels). Go through these two scripts: Internal_validation_128x128.ipynb and Internal_validation_64x64.ipynb to see index calculations.
internal_validation_indices_plot.ipynb has the code for creating Figure 5 as shown in the manuscript.

sequential_patch_creation_from_1_WSI.ipynb stores the code for creating 18252 patches sequentially (without overlap) of size 128x128 pixels from a randomly selected WSI with 5 stainings. We could not upload the dataset (seq_patches_18252_from_1wsi.npy) in GitHub because it has a size in Gigabytes but it can be found by accessing this link: seq_patches_18252_from_1wsi.npy. In this script, we have visualized the WSI that was used for sequential patch creation with 5 stainings. Additionally, we have shown a patch from this dataset with 5 stainings. A similar procedure can be followed to create 64x64 patches with minor changes in the script but we did not do that in this research.

In sequential_patch_creation_from_1wsi.ipynb, you will notice HE_2, HE_3 and HE_4 besides HE_1. These HE series were taken for alignment and registration of images. The original tissue was sliced into a series of thin slices. Then, they were stained with HE and other staining methods alternately, e.g., HE-MT-HE-CD31-HE-Ki67-HE-CK...

So, all stained images should be sandwiched by two contacts of HE images.

IM_output_64x64_100_1.ipynb and IM_output_of_unused_samples_during_training.ipynb possess the code to generate Figure 4 and Figure 5, respectively of the Supporting information portion in the manuscript.
original_transformed_128x128.ipynb has the code to visualize the original and transformed versions of a patch (128x128 pixels). Look over Figure 2 of the Supporting information division in the paper.
the_learning_curve.ipynb contains the code to create Figure 3 of the Supporting information segment in the paper.

patch_visualization_128x128_64x64.ipynb has the code to create Figure 1 of the manuscript. In addition, we have shown first 20 samples of both datasets (slice128_Block2_20K.npy ---> contains 128x128 pixels patches and 'slice64_Block2_20K.npy' ---> stores 64x64 pixels patches) with 5 stainings. The sizes of slice128_Block2_20K.npy and slice64_Block2_20K.npy are not suitable for being uploaded in GitHub (Gigabyte files).

But, these two datasets can be found in this link: Location of 128x128 and 64x64 patches

histogram_colormap_WSI.ipynb keeps the code to create Figure 10 of the manuscript.

`filename_tif8_block2.txt` contains the addresses of 191 WSIs.

By using random_patch_creation.ipynb script, randomly patches can be created from 191 WSIs. Here, we have presented the code for creating 128x128 patches but with minor changes, 64x64 patches can also be created. We indicated where to change in the script to create 64x64 patches.

You will observe some of the locations in different scipts of this repository that points to the placement of different types of files (either dataset or something else) as shown below:

/project/dsc-is/mahfujul-r/M/slice128_Block2.20K.ipynb

models14/model_en_202211280036_4000.ckpt

models14/model_cl_202211280036_4000.ckpt

models14/model_de_202211280036_4000.ckpt

project/dsc-is/mahfujul-r/M/128/14/HHH14 & C14 & D14.csv

/project/dsc-is/mahfujul-r/M/slice64_Block2.20K.ipynb

models14/model_encoder_3000

models14/model_classifier_3000

models14/model_decoder_3000

project/dsc-is/mahfujul-r/M/64/14/HHH14 & C14 & D14.csv

./64/models14/model_encoder_3000

./64/models14/model_classifier_3000

./64/models14/model_decoder_3000

./128/models14/model_en_202211280036_4000.ckpt

./128/models14/model_cl_202211280036_4000.ckpt

./128/models14/model_de_202211280036_4000.ckpt

project/dsc-is/mahfujul-r/M/128/08/HHH08 & C08 & D08.csv

project/dsc-is/mahfujul-r/M/128/09/HHH09 & C09 & D09.csv

project/dsc-is/mahfujul-r/M/128/10/HHH10 & C10 & D10.csv

project/dsc-is/mahfujul-r/M/128/11/HHH11 & C11 & D11.csv

project/dsc-is/mahfujul-r/M/128/12/HHH12 & C12 & D12.csv

project/dsc-is/mahfujul-r/M/128/13/HHH13 & C13 & D13.csv

project/dsc-is/mahfujul-r/M/128/15/HHH15 & C15 & D15.csv

project/dsc-is/mahfujul-r/M/128/16/HHH16 & C16 & D16.csv

project/dsc-is/mahfujul-r/M/128/17/HHH17 & C17 & D17.csv

project/dsc-is/mahfujul-r/M/128/18/HHH18 & C18 & D18.csv

project/dsc-is/mahfujul-r/M/64/08/HHH08 & C08 & D08.csv

project/dsc-is/mahfujul-r/M/64/09/HHH09 & C09 & D09.csv

project/dsc-is/mahfujul-r/M/64/10/HHH10 & C10 & D10.csv

project/dsc-is/mahfujul-r/M/64/11/HHH11 & C11 & D11.csv

project/dsc-is/mahfujul-r/M/64/12/HHH12 & C12 & D12.csv

project/dsc-is/mahfujul-r/M/64/13/HHH13 & C13 & D13.csv

project/dsc-is/mahfujul-r/M/64/15/HHH15 & C15 & D15.csv

project/dsc-is/mahfujul-r/M/64/16/HHH16 & C16 & D16.csv

project/dsc-is/mahfujul-r/M/64/17/HHH17 & C17 & D17.csv

project/dsc-is/mahfujul-r/M/64/18/HHH18 & C18 & D18.csv

./64/models14/hist_modelS_3000

You have to change these locations according to your working environment.

This repository uses `MIT License`. Read the terms and conditions from LICENSE text file.

Read the paper from here: KPC_IM

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Information Maximization-Based Clustering of Histopathology Images Using Deep Learning

`filename_tif8_block2.txt` contains the addresses of 191 WSIs.

You will observe some of the locations in different scipts of this repository that points to the placement of different types of files (either dataset or something else) as shown below:

You have to change these locations according to your working environment.

This repository uses `MIT License`. Read the terms and conditions from LICENSE text file.

Read the paper from here: KPC_IM

About

Uh oh!

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 161 Commits
128		128
64		64
Figures		Figures
MODULE		MODULE
IM_output_64x64_100_1.ipynb		IM_output_64x64_100_1.ipynb
IM_output_of_unused_samples_during_training.ipynb		IM_output_of_unused_samples_during_training.ipynb
Internal_validation_128x128.ipynb		Internal_validation_128x128.ipynb
Internal_validation_64x64.ipynb		Internal_validation_64x64.ipynb
LICENSE		LICENSE
README.md		README.md
filename_tif8_block2.txt		filename_tif8_block2.txt
histogram_colormap_WSI.ipynb		histogram_colormap_WSI.ipynb
internal_validation_indices_plot.ipynb		internal_validation_indices_plot.ipynb
original_transformed_128x128.ipynb		original_transformed_128x128.ipynb
patch_visualization_128x128_64x64.ipynb		patch_visualization_128x128_64x64.ipynb
random_patch_creation.ipynb		random_patch_creation.ipynb
sequential_patch_creation_from_1_WSI.ipynb		sequential_patch_creation_from_1_WSI.ipynb
the_learning_curve_64x64.ipynb		the_learning_curve_64x64.ipynb

License

randomaccess2023/KPC_IM_128_64

Folders and files

Latest commit

History

Repository files navigation

Information Maximization-Based Clustering of Histopathology Images Using Deep Learning

filename_tif8_block2.txt contains the addresses of 191 WSIs.

You will observe some of the locations in different scipts of this repository that points to the placement of different types of files (either dataset or something else) as shown below:

You have to change these locations according to your working environment.

This repository uses MIT License. Read the terms and conditions from LICENSE text file.

Read the paper from here: KPC_IM

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Uh oh!

Languages

`filename_tif8_block2.txt` contains the addresses of 191 WSIs.

This repository uses `MIT License`. Read the terms and conditions from LICENSE text file.