@@ -169,10 +169,14 @@ DESeq2 Analysis Workflow
 |read_distribution|5.0.4|[http://rseqc.sourceforge.net/#read-distribution-py](http://rseqc.sourceforge.net/#read-distribution-py)|
 |R|4.4.2|[https://www.r-project.org/](https://www.r-project.org/)|
 |Bioconductor|3.20|[https://bioconductor.org](https://bioconductor.org)|
+|BiocParallel|1.40.0|[https://bioconductor.org/packages/release/bioc/html/BiocParallel.html](https://bioconductor.org/packages/release/bioc/html/BiocParallel.html)|
 |DESeq2|1.46.0|[https://bioconductor.org/packages/release/bioc/html/DESeq2.html](https://bioconductor.org/packages/release/bioc/html/DESeq2.html)|
 |tximport|1.34.0|[https://github.com/mikelove/tximport](https://github.com/mikelove/tximport)|
 |tidyverse|2.0.0|[https://www.tidyverse.org](https://www.tidyverse.org)|
+|dplyr|1.1.4|[https://dplyr.tidyverse.org/](https://dplyr.tidyverse.org/)|
+|knitr|1.49|[https://yihui.org/knitr/](https://yihui.org/knitr/)|
 |stringr|1.5.1|[https://github.com/tidyverse/stringr](https://github.com/tidyverse/stringr)|
+|yaml|2.3.10|[https://github.com/yaml/yaml](https://github.com/yaml/yaml)|
 |dp_tools|1.3.5|[https://github.com/J-81/dp_tools](https://github.com/J-81/dp_tools)|
 |pandas|2.2.3|[https://github.com/pandas-dev/pandas](https://github.com/pandas-dev/pandas)|
 |seaborn|0.13.2|[https://seaborn.pydata.org/](https://seaborn.pydata.org/)|
@@ -1320,7 +1324,7 @@ txi.rsem$length[txi.rsem$length == 0] <- 1
 ```
 
 **Input Data:**
-
+* *genes.results (RSEM counts per gene, output from [Step 8a](#8a-count-aligned-reads-with-rsem) or from [Step 8dii](#8dii-filter-rrna-genes-from-rsem-genes-results) when using rRNA-removed count data)
 * `study` (data frame containing sample condition values, output from [Step 9c](#9c-create-study-group-and-contrasts))
 
 **Output Data:**
@@ -1440,7 +1444,7 @@ res_lrt <- results(dds_lrt)
 **Output Data:**
 
 * `sampleTable` (data frame mapping samples to groups)
-* `dds` (DESeq2 data object containing normalized counts and experimental design)
+* `dds` (DESeq2 data object containing normalized counts, experimental design, and differential expression results)
 * `normCounts` (data frame of normalized count values + 1)
 * `VSTCounts` (data frame of variance stabilized transformed counts)
 * `dds_lrt` (DESeq2 data object from likelihood ratio test)
@@ -1455,24 +1459,21 @@ res_lrt <- results(dds_lrt)
 ### Initialize output table with normalized counts ###
 output_table <- tibble::rownames_to_column(normCounts, var = "ENSEMBL")
 
-### Add LRT p-values ###
-output_table$LRT.p.value <- res_lrt@listData$padj
-
 ### Iterate through Wald Tests to generate pairwise comparisons of all groups ###
 compute_contrast <- function(i) {
-  res_1 <- results(
-    dds_1,
+  res <- results(
+    dds,
     contrast = c("condition", contrasts[1, i], contrasts[2, i]),
     parallel = FALSE # Disable internal parallelization
   )
-  res_1_df <- as.data.frame(res_1@listData)[, c(2, 4, 5, 6)]
-  colnames(res_1_df) <- c(
+  res_df <- as.data.frame(res@listData)[, c(2, 4, 5, 6)]
+  colnames(res_df) <- c(
     paste0("Log2fc_", colnames(contrasts)[i]),
     paste0("Stat_", colnames(contrasts)[i]),
     paste0("P.value_", colnames(contrasts)[i]),
     paste0("Adj.p.value_", colnames(contrasts)[i])
   )
-  return(res_1_df)
+  return(res_df)
 }
 
 ### Use bplapply to compute results in parallel ###
@@ -1487,7 +1488,7 @@ output_table <- cbind(output_table, res_df)
 ### Add summary statistics ###
 output_table$All.mean <- rowMeans(normCounts, na.rm = TRUE)
 output_table$All.stdev <- rowSds(as.matrix(normCounts), na.rm = TRUE)
-output_table$LRT.p.value <- res_1_lrt@listData$padj
+output_table$LRT.p.value <- res_lrt@listData$padj
 
 ### Add group-wise statistics ###
 tcounts <- as.data.frame(t(normCounts))
@@ -1532,6 +1533,7 @@ output_table <- output_table %>%
 * `normCounts` (data frame of normalized counts, output from [Step 9e](#9e-perform-dge-analysis))
 * `res_lrt` (results object from likelihood ratio test, output from [Step 9e](#9e-perform-dge-analysis))
 * `contrasts` (matrix defining pairwise comparisons, output from [Step 9c](#9c-configure-metadata-sample-grouping-and-group-comparisons))
+* `dds` (DESeq2 data object containing normalized counts, experimental design, and differential expression results, output from [Step 9e](#9e-perform-dge-analysis))
 * `annotations_link` (variable containing URL to GeneLab annotation table, output from [Step 9b](#9b-environment-set-up))
 
 **Output Data:**
@@ -1604,6 +1606,7 @@ sessionInfo()
 * **contrasts_GLbulkRNAseq.csv** (table listing all pairwise group comparisons)
 * **differential_expression_GLbulkRNAseq.csv** (DGE results table containing the following columns:
   - Gene identifier column (ENSEMBL or TAIR for plant studies)
+  - Additional organism-specific gene annotations columns
   - Normalized counts
   - For each pairwise group comparison:
     - Log2 fold change
@@ -1615,8 +1618,7 @@ sessionInfo()
   - LRT.p.value (likelihood ratio test adjusted p-value)
   - For each group:
     - Group.Mean_(group) (mean within group)
-    - Group.Stdev_(group) (standard deviation within group)
-  - Additional organism-specific gene annotations columns)
+    - Group.Stdev_(group) (standard deviation within group))
 
 > Note: Datasets with technical replicates are handled by collapsing them such that the minimum number of equal technical replicates is retained across all samples. Before normalization, the counts of technical replicates are summed to combine them into a single sample representing the biological replicate.
 
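The note on technical replicates in the last hunk can be illustrated with a minimal base R sketch. It is not taken from the pipeline code. The toy matrix `counts`, the mapping vector `sample_of`, and the choice to keep the *first* replicates of each sample are all assumptions made for illustration; the note itself does not say which replicates are retained.

```r
# Hypothetical toy data: raw counts with one column per technical replicate.
counts <- matrix(
  c(10, 12, 11, 20, 22,
     5,  6,  4,  8,  9),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    c("geneA", "geneB"),
    c("S1_T1", "S1_T2", "S1_T3", "S2_T1", "S2_T2")
  )
)

# Assumed mapping from each technical replicate (column) to its biological sample.
sample_of <- c("S1", "S1", "S1", "S2", "S2")

# Smallest number of technical replicates available for any sample.
n_keep <- min(table(sample_of))

# Number the replicates within each sample, then keep only the first n_keep
# (assumption: which replicates to retain is not specified in the note).
rep_index <- ave(seq_along(sample_of), sample_of, FUN = seq_along)
keep <- rep_index <= n_keep

# Sum the retained technical replicates into one column per biological sample.
collapsed <- t(rowsum(t(counts[, keep, drop = FALSE]), group = sample_of[keep]))
```

Here `S1` has three technical replicates and `S2` has two, so two are retained per sample and `collapsed` holds one summed column for each of `S1` and `S2`, ready to be passed on as raw counts before normalization.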