Module : DGE with SCDE

This module calculates, for each gene, the probability of differential expression.

Internal name : scde
Available : local mode
Input Ports :
- matrix : filtered count matrix (tsv)
- cells : filtered cells metadata (tsv)
- genes : genes metadata (tsv)
Output Ports :
- dgeoutput : differential expression result (tsv)
Optional parameters :
- Main Parameters

Parameter	Type	Description	Default Value
n.cores	int	Number of cores to use	1
model.group.col	string	Name of the column if cells must be grouped for model fitting (NULL if no grouping is necessary)	NULL
prior.length	int	Number of points for prior calculation	400
batch.col	string	Name of the column indicating batches, if batch correction is required (else NULL)	NULL
n.randomizations	int	Number of randomizations for testing	150

Parameters for model fitting

Parameter	Type	Description	Default Value
min.observation	int	Minimal number of observations for a gene to be used for model fitting	3
min.genes	int	Minimum number of genes for model fitting	2000
threshold.segmentation	boolean	Whether to use threshold segmentation to accelerate failure estimation	TRUE
failure.threshold	int	Number of reads indicating a gene failed amplification	4
max.pairs	int	Maximum number of comparisons that should be performed per group for estimation of dropout rate	5000
min.pairs	int	Minimum number of comparisons that should be performed per group for estimation of dropout rate	10
poisson.param	float	Parameter of the Poisson distribution used to model failures	0.1
linear.fit	boolean	Whether to use a linear fit for model fitting (highly recommended)	TRUE
min.theta	float	Minimum for the dispersion parameter of the negative binomial	0.01
max.theta	float	Maximum for the dispersion parameter of the negative binomial	100

Parameters for prior calculation

Parameter	Type	Description	Default Value
save.prior.plot	boolean	Whether to save the prior plot	TRUE
pseudocount	int	Pseudocount added to observations before log-transforming them	1
quantile	float	Quantile used to set maximum expression value	0.999
max.value	float	Alternative to quantile, maximum expression value	NULL

Parameters for test

Parameter	Type	Description	Default Value
return.posteriors	boolean	Whether to return the posteriors	TRUE

Configuration example

<step id="DGE" skip="false">
	<module>scde</module>
	<parameters>
		<parameter>
			<name>prior.length</name>
			<value>400</value>	
		</parameter>
		<parameter>
			<name>n.randomizations</name>
			<value>200</value>	
		</parameter>
		<parameter>
			<name>n.cores</name>
			<value>12</value>	
		</parameter>
	</parameters>
</step>
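
When many runs with different settings are needed, a step like the one above can be generated rather than written by hand. The following is a minimal Python sketch, not part of the module: the helper name `build_scde_step` is hypothetical, and it only assumes the element layout shown in the example (`step`, `module`, `parameters`, `parameter`, `name`, `value`). Values are written with `str()`, so boolean parameters should be passed as the literal strings expected by the workflow file.

```python
import xml.etree.ElementTree as ET


def build_scde_step(step_id, parameters, skip=False):
    """Return an XML <step> for the scde module as a string.

    Hypothetical helper: it mirrors the element layout of the
    configuration example and does not validate parameter names.
    """
    step = ET.Element("step", {"id": step_id, "skip": "true" if skip else "false"})
    ET.SubElement(step, "module").text = "scde"
    params_el = ET.SubElement(step, "parameters")
    for name, value in parameters.items():
        param_el = ET.SubElement(params_el, "parameter")
        ET.SubElement(param_el, "name").text = name
        ET.SubElement(param_el, "value").text = str(value)
    # Pretty-print when available (ET.indent was added in Python 3.9).
    if hasattr(ET, "indent"):
        ET.indent(step, space="\t")
    return ET.tostring(step, encoding="unicode")
```

For instance, `build_scde_step("DGE", {"prior.length": 400, "n.randomizations": 200, "n.cores": 12})` produces a step equivalent to the example above.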

Interpreting output files

Introduction to scde

Evaluation of model fitting and misfit removal

To evaluate the goodness of fit of the model, the module calculates the proportion of variance in the measured values that is explained by the model (i.e. the R-squared of a linear model in which the measured value is the predictor and the model value is the predicted variable). The model is plotted on a goodness-of-fit plot:

FittingPlot
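
As an illustration of the R-squared described above, here is a minimal Python sketch. It is not the module's code: the function name `r_squared` is hypothetical, and it relies on the fact that, for a linear model with a single predictor, R-squared equals the squared Pearson correlation between the two series.

```python
def r_squared(measured, modeled):
    """R-squared of a simple linear regression of `modeled` on `measured`.

    With one predictor, this equals the squared Pearson correlation.
    """
    n = len(measured)
    if n != len(modeled) or n < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    mean_x = sum(measured) / n
    mean_y = sum(modeled) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in measured)
    syy = sum((y - mean_y) ** 2 for y in modeled)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(measured, modeled))
    if sxx == 0 or syy == 0:
        raise ValueError("R-squared is undefined for a constant series")
    return (sxy * sxy) / (sxx * syy)
```

A value close to 1 means the model values track the measured values almost linearly. Lower values mark cells for which the model fits poorly.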

Assuming that the model fits correctly for the majority of cells, we expect the distribution of these values to be "nearly Gaussian":

densityFiltered

Misfits are expected to be outliers with lower values (see the distribution plot below), so the cells with the lowest values are removed one by one until the Shapiro-Wilk test returns a sufficiently high p-value under the Gaussian assumption.

densityAll
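
The removal loop described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the module's implementation: the names `remove_misfits`, `alpha` and `min_cells` are hypothetical, the stopping threshold is not given in this page, and the normality test is passed in as a function so that the sketch stays free of external dependencies.

```python
from typing import Callable, List, Sequence, Tuple


def remove_misfits(
    scores: Sequence[float],
    pvalue_fn: Callable[[List[float]], float],
    alpha: float = 0.05,   # assumed threshold, not taken from the module
    min_cells: int = 3,    # assumed floor so the loop always terminates
) -> Tuple[List[float], List[float]]:
    """Drop the lowest goodness-of-fit score until the normality test passes.

    Returns (kept, removed), both sorted in ascending order.
    """
    kept = sorted(scores)          # worst fits come first
    removed: List[float] = []
    while len(kept) > min_cells and pvalue_fn(kept) < alpha:
        removed.append(kept.pop(0))  # remove the current lowest score
    return kept, removed
```

If SciPy is installed, `lambda v: scipy.stats.shapiro(v).pvalue` is one way to supply `pvalue_fn`.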

The Shapiro-Wilk p-values increase by several orders of magnitude after some removals (see the p-value as a function of the number of removals below). This increase indicates a "nearly Gaussian" distribution.

probaPlot
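
To locate this increase numerically rather than by eye, one option is to look for the largest step in log10(p-value) between consecutive removals. The Python sketch below is only an illustration: the function name `largest_pvalue_jump` is hypothetical, and it assumes `p_values[i]` holds the p-value obtained after `i` removals.

```python
import math


def largest_pvalue_jump(p_values):
    """Return (n_removals, jump) for the largest log10 increase.

    `n_removals` is the number of removals after which the jump is seen,
    and `jump` is the increase expressed in orders of magnitude.
    """
    if len(p_values) < 2:
        raise ValueError("need at least two p-values")
    # Clamp to a tiny positive value so that log10 is always defined.
    logs = [math.log10(max(p, 1e-300)) for p in p_values]
    jumps = [logs[i + 1] - logs[i] for i in range(len(logs) - 1)]
    best = max(range(len(jumps)), key=jumps.__getitem__)
    return best + 1, jumps[best]
```

A jump of several units at a small number of removals is consistent with the behaviour described above. A flat curve suggests that no clear set of misfits was identified.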

The module also plots goodness of fit as a function of the Michaelis-Menten model maximum, before and after outlier removal. This allows visual inspection of the process.

qualityAll qualityFiltered

Scatter Plot

After cleaning the data, the module produces two scatter plots showing cells in terms of number of features (y-axis) and number of reads (x-axis).

Raw_Cellplot

The first one shows all cells; those in red are the cells being eliminated.

Filtered_cellplot

The second one shows the cells remaining after filtering. At the end of the filtering, cells should behave like a mixture of Gaussians, i.e. you should be able to enclose them in a given number of ellipses.
