This repository contains the data and code needed to reproduce the WGCNA analysis reported in the article "Identification of modules and key genes associated with Breast Cancer subtypes through network analysis."
The repository is structured into the following directories:
- Data: Contains the datasets used in this study. The primary dataset (CCLE_expression) was obtained from the DepMap database, version 21Q1 (https://depmap.org).
- Script: Contains the code used for the data analysis, covering data preparation, weighted gene co-expression network analysis (WGCNA), identification of central genes, functional annotation, and validation of the central genes in an external dataset.
The data analysis workflow is depicted in Figure 1. Detailed explanations of each analysis step can be found in the Materials and Methods section of the manuscript. The key steps comprise:
- Data preparation and processing: Selection of breast cancer (BC)-related cell lines and exclusion of the top 50% of genes with the highest variability.
- Weighted Gene Co-expression Network Analysis (WGCNA): Analysis of gene expression profiles across the BC cell lines to identify modules of genes with similar expression patterns.
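
The two steps above can be illustrated with a minimal sketch. It is not the code from this repository: it uses only the Python standard library on toy data, and the gene names, the `fraction` parameter, and the soft-threshold power `beta=6` are placeholders chosen for illustration. It ranks genes by variance across cell lines, splits them at the 50% mark (which half is retained should follow the manuscript), and computes the unsigned WGCNA-style adjacency a_ij = |cor(x_i, x_j)|^beta on which module detection is based.

```python
import math


def sample_variance(values):
    """Unbiased sample variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def rank_genes_by_variance(expr):
    """Return gene names sorted from most to least variable.

    expr maps a gene name to its expression values across cell lines.
    """
    return sorted(expr, key=lambda g: sample_variance(expr[g]), reverse=True)


def split_by_variance(expr, fraction=0.5):
    """Split genes into the most variable `fraction` and the remainder."""
    ranked = rank_genes_by_variance(expr)
    cut = int(len(ranked) * fraction)
    return ranked[:cut], ranked[cut:]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def unsigned_adjacency(expr, genes, beta=6):
    """Unsigned soft-threshold adjacency: |cor(x_i, x_j)| ** beta.

    beta=6 is only a placeholder; in practice the power is chosen
    from the data (scale-free topology criterion).
    """
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for g in genes
        for h in genes
    }


# Toy expression matrix (hypothetical genes, four cell lines each).
toy_expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],
    "GENE_C": [5.0, 5.1, 4.9, 5.0],
    "GENE_D": [6.0, 4.5, 3.0, 1.5],
}

top_half, rest = split_by_variance(toy_expr, fraction=0.5)
adj = unsigned_adjacency(toy_expr, list(toy_expr))
print("most variable half:", top_half)
print("remaining genes:", rest)
print("adjacency GENE_A/GENE_B:", adj[("GENE_A", "GENE_B")])
```

Because the adjacency is unsigned, strongly anti-correlated genes (GENE_A and GENE_D in the toy data) are treated as tightly connected, just like strongly correlated ones. Raising the correlation to a power suppresses weak correlations instead of cutting them off at a hard threshold.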
For further details on the methods, please refer to the manuscript and the scripts in the Scripts (code) directory.
If you wish to contribute to this project, kindly fork the repository and propose your changes via a pull request. All contributions are welcome and highly appreciated.
If you have any inquiries or comments regarding this project, please feel free to reach out.