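Update my-vignette.Rmd

Replace the down-sampling description in "Data Generation" with the Splatter-based simulation, change `parameter` to c(1,1e-6,1e-4), and call scrabble() with only `data` and `parameter`.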

tanlabcode · web-flow · commit ec51f72788b5 · 2019-01-11T12:36:15.000-05:00
diff --git a/R/vignettes/my-vignette.Rmd b/R/vignettes/my-vignette.Rmd
@@ -28,7 +28,7 @@ library(SCRABBLE)
 # Simulation Data Study
 ## Data Generation
 
-The data set was generated from down sampling from bulk RNAseq data. We used the bulk RNA-Seq data set of mouse hair follicles (GSE85039). In total, the dataset contains 20 different combinations of anatomic sites and developmental time points, thus constituting a high dimensional measurement space. We used the following procedures to generate the drop-out datasets. 1) We selected 732 genes that are differentially expressed in the 20 conditions based on ANOVA analysis. 2) We randomly selected 10 out of the 20 conditions. 3) For each condition, we generated 100 resampled datasets. The means and standard deviations of genes were calculated for each condition based on the 100 resampled datasets. 4) 100 new datasets were generated based on the mean and the standard deviation of each gene. 5) The final data set was obtained by combining 1000 samples representing the 10 conditions. This 1000x732 matrix now represents 1000 cells and 732 genes. 6) we make the drop-out rate of each gene in each cell following a double exponential function . Zero values are introduced into the simulated data for each gene in each cell based on the Bernoulli distribution defined by the corresponding drop-out rate.
+We simulated scRNA-Seq data consisting of three cell types using the Splat method in the Bioconductor package Splatter [8]. Each dataset consists of 800 genes and 1000 cells. The parameters used to generate the simulated data are listed in Supplementary Table 2. The dropout midpoint (parameter `dropout.mid` in Splatter) controls the dropout rate in the simulated data. For a given dropout midpoint, Splatter generates the true data matrix and its corresponding dropout data matrix. The corresponding bulk RNA-Seq data are the per-gene mean values of the true scRNA-Seq data. The dropout scRNA-Seq and bulk RNA-Seq data matrices are the inputs to the imputation methods. To assess the stability of each method's performance, we generated 100 datasets for each dropout midpoint.
 
 ### Load the data
 
@@ -52,13 +52,12 @@ SCRABBLE imputes drop-out data by optimizing an objective function that consists
 
 ### Set up the parameter used in SCRABBLE
 ```{r}
-parameter <- c(10,1e-5,1e-4)
-nIter <- 30
+parameter <- c(1,1e-6,1e-4)
 ```
 
 ### Run SCRABLE
 ```{r}
-result <- scrabble(data, parameter = parameter, nIter = nIter, error_out_threshold = 1e-05, nIter_inner = 30, error_inner_threshold = 1e-05)
+result <- scrabble(data, parameter = parameter)
 ```
 
 ### Plot the data
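
The updated Data Generation paragraph describes three inputs: a true count matrix, a dropout version controlled by a dropout midpoint, and bulk data taken as per-gene means of the true matrix. The base-R sketch below illustrates that relationship without depending on Splatter. It is not the authors' simulation: the gamma/Poisson stand-in for true expression, the single cell population (the vignette uses three cell types), the helper `simulate_dropout`, and the logistic form of the dropout probability are all assumptions made for illustration. The two-element list passed as `data` is likewise an assumed layout; check `?scrabble` before relying on it.

```r
# Hypothetical sketch: all names here are illustrative, not part of SCRABBLE or Splatter.
set.seed(1)

n_genes <- 800
n_cells <- 1000

# Stand-in for the "true" scRNA-Seq data: gamma-distributed gene means, Poisson counts.
gene_means <- rgamma(n_genes, shape = 0.6, rate = 0.3)
true_counts <- matrix(
  rpois(n_genes * n_cells, lambda = rep(gene_means, times = n_cells)),
  nrow = n_genes, ncol = n_cells
)

# Assumed dropout model: the probability that a count is zeroed is a logistic
# function of log mean expression, centred at `dropout_mid`.
# A larger midpoint zeroes out more entries.
simulate_dropout <- function(counts, means, dropout_mid, dropout_shape = -1) {
  drop_prob <- 1 / (1 + exp(-dropout_shape * (log(means) - dropout_mid)))
  keep <- matrix(
    rbinom(length(counts), size = 1,
           prob = rep(1 - drop_prob, times = ncol(counts))),
    nrow = nrow(counts)
  )
  counts * keep
}

dropout_counts <- simulate_dropout(true_counts, gene_means, dropout_mid = 3)

# Bulk data: per-gene mean of the true (pre-dropout) matrix.
bulk <- rowMeans(true_counts)

# Assumed input layout: scRNA-Seq matrix first, bulk vector second.
data <- list(dropout_counts, bulk)
```

Repeating the call to `simulate_dropout` with different seeds and midpoints would give the kind of replicate datasets the paragraph mentions; the resulting `data` object could then be passed to `scrabble(data, parameter = parameter)` as in the vignette, provided the list layout matches what the package expects.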