Commit ec51f72

Update my-vignette.Rmd
1 parent 186003b commit ec51f72

1 file changed: +3 -4 lines changed

R/vignettes/my-vignette.Rmd

Lines changed: 3 additions & 4 deletions
@@ -28,7 +28,7 @@ library(SCRABBLE)
# Simulation Data Study
## Data Generation

-The data set was generated by down-sampling bulk RNA-Seq data. We used the bulk RNA-Seq data set of mouse hair follicles (GSE85039). In total, the dataset contains 20 different combinations of anatomic sites and developmental time points, thus constituting a high-dimensional measurement space. We used the following procedure to generate the drop-out datasets. 1) We selected 732 genes that are differentially expressed across the 20 conditions based on ANOVA. 2) We randomly selected 10 of the 20 conditions. 3) For each condition, we generated 100 resampled datasets. The means and standard deviations of genes were calculated for each condition based on the 100 resampled datasets. 4) 100 new datasets were generated based on the mean and the standard deviation of each gene. 5) The final data set was obtained by combining 1000 samples representing the 10 conditions. This 1000x732 matrix represents 1000 cells and 732 genes. 6) We set the drop-out rate of each gene in each cell to follow a double exponential function. Zero values were introduced into the simulated data for each gene in each cell based on the Bernoulli distribution defined by the corresponding drop-out rate.
+We simulated scRNA-Seq data consisting of three cell types using the Splat method in the Bioconductor package Splatter [8]. The dataset consists of 800 genes and 1000 cells. The simulation parameters are listed in Supplementary Table 2. The dropout midpoint (parameter dropout_mid in Splatter) controls the dropout rate in the simulated data. For a given dropout midpoint, Splatter generates the true data matrix and its corresponding dropout data matrix. The corresponding bulk RNA-Seq data are the mean values of genes in the true scRNA-Seq data. The dropout scRNA-Seq and bulk RNA-Seq data matrices are the inputs to the imputation methods. To assess the stability of each method's performance, we generated 100 datasets for each dropout midpoint.
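
The relationship between a dropout midpoint, the dropout data, and the derived bulk data can be illustrated with a minimal base-R sketch. This is a simplified stand-in and not Splatter's implementation: it assumes a logistic dropout probability on the log of each gene's mean, uses a single cell type, and every name in it (`dropout_prob`, `apply_dropout`, `dropout_mids`) and every numeric setting is hypothetical rather than taken from Supplementary Table 2.

```r
set.seed(42)
n_genes <- 800
n_cells <- 1000

# Stand-in for the "true" data: gamma-distributed gene means, Poisson counts.
# (Assumption for illustration; Splatter's Splat model is more elaborate.)
gene_means <- rgamma(n_genes, shape = 2, rate = 0.5)
true_counts <- matrix(
  rpois(n_genes * n_cells, lambda = rep(gene_means, times = n_cells)),
  nrow = n_genes, ncol = n_cells
)

# Assumed logistic dropout probability: with a negative shape it falls as
# expression rises, and a larger midpoint pushes it up for every gene.
dropout_prob <- function(log_mean, mid, shape = -1) {
  1 / (1 + exp(-shape * (log_mean - mid)))
}

# Zero out entries with a Bernoulli draw per gene and cell.
apply_dropout <- function(counts, gene_means, mid, shape = -1) {
  p <- dropout_prob(log(gene_means), mid, shape)  # one probability per gene
  keep <- matrix(
    rbinom(length(counts), size = 1, prob = rep(1 - p, times = ncol(counts))),
    nrow = nrow(counts)
  )
  counts * keep
}

# Bulk data as described in the text: per-gene means of the true data.
bulk <- rowMeans(true_counts)

# Hypothetical midpoints; the fraction of zeros grows with the midpoint.
dropout_mids <- c(2, 4, 6)
zero_fraction <- sapply(dropout_mids, function(mid) {
  mean(apply_dropout(true_counts, gene_means, mid) == 0)
})
```

Repeating the `apply_dropout` call with different seeds would give replicate datasets for one midpoint, mirroring the 100 datasets per midpoint mentioned above.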

### Load the data

@@ -52,13 +52,12 @@ SCRABBLE imputes drop-out data by optimizing an objective function that consists

### Set up the parameters used in SCRABBLE
```{r}
-parameter <- c(10,1e-5,1e-4)
-nIter <- 30
+parameter <- c(1,1e-6,1e-4)
```

### Run SCRABBLE
```{r}
-result <- scrabble(data, parameter = parameter, nIter = nIter, error_out_threshold = 1e-05, nIter_inner = 30, error_inner_threshold = 1e-05)
+result <- scrabble(data, parameter = parameter)
```

### Plot the data
