I hope this message finds you well. I have been studying your work on CellCLIP and find it very insightful. I have a question regarding the description of the CPJUMP1 dataset used in your experiments, and I was hoping you could provide some clarification.
In your paper, you mention that for the replicate detection & sister perturbation matching tasks, you employ the CPJUMP1 dataset. You describe it as follows:
"...which features 186,925 nine-channel microscopy images. These images include three bright-field channels in addition to six Cell Painting dye channels."
However, when I consulted the original source for CPJUMP1 (Chandrasekaran et al., 2024, Nature Methods) and its corresponding data repositories on GitHub and AWS, I noticed a few potential discrepancies:
Total Image Count: The original dataset is described as containing approximately 3 million images in total, with around 2.7 million single-channel images publicly available. Both figures differ from the 186,925 nine-channel images mentioned in your paper.
Channel Composition: The source paper details a composition of five fluorescent dye channels (ch01-ch05) and three bright-field channels (ch06-ch08), making it an eight-channel dataset, not nine-channel.
Data Verification: To confirm, I downloaded and processed the publicly available single-channel images. After merging them by site, I was able to generate approximately 340,000 eight-channel images. This aligns with the 8-channel description from the source but differs from the 9-channel description and the total image count in your paper.
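The merge-by-site step described above can be sketched roughly as follows. This is a minimal illustration, not the exact script used: the filename layout (`<site id>-ch<channel number>...`) and the helper names `group_by_site` and `complete_sites` are assumptions made for the example, and only the file grouping is shown, not the pixel stacking.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) filename layout: "<site id>-ch<channel number>...",
# for example "r01c01f01p01-ch1sk1fk1fl1.tiff". Adjust to the real layout.
PATTERN = re.compile(r"^(?P<site>.+)-ch0?(?P<channel>\d+)")

# Five fluorescent channels (ch01-ch05) plus three bright-field (ch06-ch08).
EXPECTED_CHANNELS = set(range(1, 9))


def group_by_site(filenames):
    """Map each site id to a {channel number: filename} dictionary."""
    sites = defaultdict(dict)
    for name in filenames:
        match = PATTERN.match(name)
        if match is None:
            continue  # skip files that do not follow the assumed layout
        sites[match.group("site")][int(match.group("channel"))] = name
    return dict(sites)


def complete_sites(sites):
    """Keep only sites with all eight channels, files ordered by channel."""
    return {
        site: [channels[c] for c in sorted(channels)]
        for site, channels in sites.items()
        if set(channels) == EXPECTED_CHANNELS
    }


if __name__ == "__main__":
    # Toy input: one complete site and one site missing channel 8.
    files = [f"r01c01f01p01-ch{c}sk1fk1fl1.tiff" for c in range(1, 9)]
    files += [f"r01c01f02p01-ch{c}sk1fk1fl1.tiff" for c in range(1, 8)]
    merged = complete_sites(group_by_site(files))
    print(len(merged), "complete eight-channel site(s)")
```

As a rough sanity check on the counts, 2.7 million single-channel images divided by 8 channels gives about 337,500 sites, which is consistent with the approximately 340,000 eight-channel images obtained.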
Given this, I was wondering if the description in your paper refers to a specific, curated subset of the CPJUMP1 dataset that you created for your experiments, or if there might be a small typo in the number of channels (nine vs. eight) and the total image count?
Any clarification you could provide would be greatly appreciated. Thank you for your time and for your valuable contribution to the field.
Best regards,
Anny
