SCALPEL: A Nextflow-based Pipeline for Isoform Quantification at Single-Cell Resolution

About the Project

SCALPEL is a robust pipeline designed for transcript isoform quantification and alternative polyadenylation (APA) characterization using 3'-tagged single-cell RNA-seq (scRNA-seq) data. Built with Nextflow, it integrates multiple processing steps including read quantification, APA annotation, and isoform usage analysis.

Installation Options

Prerequisite:

Nextflow >= v24.10.6, installed either from the official Nextflow page or via Conda

SCALPEL can be installed and run using one of the following options:

Option 1: Using Conda environment

Clone the repository

git clone https://github.com/p-CMRC-LAB/SCALPEL.git

Run SCALPEL using the Conda YAML file

nextflow run -resume SCALPEL/main.nf \
  --sequencing chromium \
  --samplesheet path/to/samplesheet.csv \
  --transcriptome path/to/gencode.transcripts.fa \
  --gtf path/to/gencode.annotation.gtf \
  --ipdb path/to/mm10.polyA.track \
  --cpus 40 \
  -with-conda SCALPEL/requirements.yml

Or create the Conda environment and activate it before running

conda env create --file SCALPEL/requirements.yml
conda activate scalpelEnv
nextflow run -resume SCALPEL/main.nf \
  --sequencing chromium \
  --samplesheet path/to/samplesheet.csv \
  --transcriptome path/to/gencode.transcripts.fa \
  --gtf path/to/gencode.annotation.gtf \
  --cpus 40 \
  --ipdb path/to/mm10.polyA.track

Option 2: Using Apptainer container

A prebuilt Apptainer container with all SCALPEL dependencies is available for download; the URL is given in the wget command below.

Download the container and clone the repository

wget https://data.cyverse.org/dav-anon/iplant/home/franzx5/SCALPEL.container.sif
git clone https://github.com/p-CMRC-LAB/SCALPEL.git

Run SCALPEL using the container

nextflow run SCALPEL/main.nf \
  -with-apptainer SCALPEL.container.sif \
  --sequencing chromium \
  --samplesheet path/to/samplesheet.csv \
  --transcriptome path/to/gencode.transcripts.fa \
  --gtf path/to/gencode.annotation.gtf \
  --ipdb path/to/mm10.polyA.track

Required Input Files

| Parameter | Description |
| --- | --- |
| `--samplesheet` | CSV with sample names and paths to FASTQ/BAM/CellRanger output |
| `--transcriptome` | FASTA of reference transcriptome |
| `--gtf` | GTF annotation file |
| `--ipdb` | Internal priming annotation file |
| `--barcodes` | (Optional) Barcode whitelist per sample |
| `--clusters` | (Optional) Tab-delimited file with cell-to-cluster mappings |
| `--sequencing` | Must be `chromium` or `dropseq` |
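
Because a missing reference file typically only surfaces once the pipeline is already running, it can help to validate the inputs first. The following is a minimal sketch, not part of SCALPEL: `check_scalpel_inputs` is a hypothetical helper that checks the `--sequencing` value against the two options listed in the table and confirms that each file path is readable.

```bash
# Hypothetical pre-flight helper (not shipped with SCALPEL).
# Usage: check_scalpel_inputs SEQUENCING FILE [FILE ...]
# Prints one line per problem and returns non-zero if any problem was found.
check_scalpel_inputs() {
    local sequencing="$1"
    local status=0
    local f
    shift

    # --sequencing must be one of the two values listed in the table.
    case "$sequencing" in
        chromium|dropseq) ;;
        *) echo "invalid --sequencing value: $sequencing"; status=1 ;;
    esac

    # Every remaining argument is a path that must exist and be readable.
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "missing or unreadable file: $f"
            status=1
        fi
    done

    return "$status"
}
```

For example, `check_scalpel_inputs chromium path/to/samplesheet.csv path/to/gencode.transcripts.fa path/to/gencode.annotation.gtf path/to/mm10.polyA.track && nextflow run ...` would launch the pipeline only when all inputs are present.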

Reference files:

Output Files and Execution Notes

After execution, SCALPEL generates a results/ directory containing key outputs for downstream analysis.

Key Output Files

| File / Pattern | Description |
| --- | --- |
| `*_filtered.bam` | BAM files with deduplicated reads excluding internal priming artifacts. |
| `*_filtered.bam.bai` | BAM index files. |
| `*_APADGE.txt` | APA-aware isoform-level expression matrix per sample. |
| `*_seurat.RDS` | Seurat object per sample. |
| `iDGE_seurat.RDS` | Merged Seurat object across all samples. |
| `DIU_table.csv` | Differential isoform usage table. |
| `Runfiles/` | Execution logs and process metadata. |

Notes

Output filenames are prefixed by the sample name.
Seurat .RDS files are ready for downstream visualization and clustering in R.
*_APADGE.txt matrices are compatible with other statistical environments.
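
After a run finishes, a quick way to confirm that the key outputs were produced is to search the results directory for the file patterns listed above. The sketch below is a hypothetical helper, not part of SCALPEL, and it assumes the files sit somewhere under the results directory (possibly in subdirectories). Note that `iDGE_seurat.RDS` also matches the `*_seurat.RDS` pattern.

```bash
# Hypothetical post-run check (not shipped with SCALPEL).
# Usage: report_scalpel_outputs [RESULTS_DIR]   (defaults to ./results)
# Prints "found" or "missing" for each key output pattern and returns
# non-zero if at least one pattern has no match.
report_scalpel_outputs() {
    local results_dir="${1:-results}"
    local missing=0
    local pattern

    for pattern in '*_filtered.bam' '*_filtered.bam.bai' '*_APADGE.txt' \
                   '*_seurat.RDS' 'iDGE_seurat.RDS' 'DIU_table.csv'; do
        # find searches recursively; keeping one hit is enough to prove presence.
        if [ -n "$(find "$results_dir" -name "$pattern" 2>/dev/null | head -n 1)" ]; then
            echo "found    $pattern"
        else
            echo "missing  $pattern"
            missing=1
        fi
    done

    return "$missing"
}
```

Calling `report_scalpel_outputs results` from the launch directory prints one line per pattern, which makes an incomplete run easy to spot.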

For downstream analysis tutorials, visit:

Customizing Execution with `nextflow.config`

To adjust resource usage and process settings, you can optionally edit the nextflow.config file. For example:

/* Define Nextflow configuration settings for SCALPEL pipeline execution */
/* ===================================================================== */

/* Enter here the desired Nextflow parameters for execution (see https://www.nextflow.io/docs/latest/index.html) */

/* -> General Nextflow settings (config options, not pipeline parameters) */
workDir = './work'              // Directory for Nextflow work files

/* -> Processes */
process {
    executor = 'local'          // Default executor
    queue = 'default'           // Default queue
}

params {
    //=====1. Specific parameters for SCALPEL pipeline====
    outputDir = "./results"     // Directory for output files

    // 1a. Reads:
    samplesheet = null
    sequencing = null
    barcodes = null
    clusters = null

    // 1b. Annotation:
    transcriptome = null
    gtf = null
    ipdb = null

    // 1c. Thresholds:
    dt_threshold = 600
    de_threshold = 30
    ip_threshold = 60
    gene_fraction = "98%"
    binsize = 20
    subsample = 1
    help = null

    //=====2. Running parameters=====
    cpus = 16                   /* Adjust as needed */
    memory = 40.GB              /* Adjust as needed */
}
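
As an alternative to editing the repository's nextflow.config directly, overrides can be kept in a separate file and passed to Nextflow with its `-c` option, which takes precedence over the project configuration. The sketch below is an assumption-laden example rather than a documented SCALPEL workflow: `write_scalpel_config` and the file name `my_scalpel.config` are hypothetical, and it assumes the pipeline reads `params.cpus` and `params.memory` as the configuration above suggests.

```bash
# Hypothetical helper (not shipped with SCALPEL).
# Usage: write_scalpel_config CONFIG_PATH CPUS MEMORY_GB
# Writes a small Nextflow config file that overrides the resource parameters.
write_scalpel_config() {
    local config_path="$1"
    local cpus="$2"
    local memory_gb="$3"

    # The heredoc body is Nextflow config syntax; shell variables are expanded.
    cat > "$config_path" <<EOF
// Resource overrides for SCALPEL; pass this file to Nextflow with -c
params {
    cpus = ${cpus}
    memory = ${memory_gb}.GB
}
EOF
}

# Example (not executed here):
#   write_scalpel_config my_scalpel.config 8 32
#   nextflow run SCALPEL/main.nf -c my_scalpel.config \
#     --sequencing chromium --samplesheet path/to/samplesheet.csv \
#     --transcriptome path/to/gencode.transcripts.fa \
#     --gtf path/to/gencode.annotation.gtf --ipdb path/to/mm10.polyA.track
```

Keeping overrides in a separate file leaves the cloned repository unmodified, so pulling pipeline updates does not conflict with local settings. Values given on the command line, such as `--cpus 40`, still take precedence over both files.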

Apptainer Users: Configure `runOptions`

If using Apptainer, make sure to bind your local SCALPEL repository path inside the container by editing the following block in nextflow.config:

apptainer {
    enabled = true
    autoMounts = true
    runOptions = "--bind /path/to/SCALPEL:/path/to/SCALPEL"
}

Adjust /path/to/SCALPEL to the full absolute path where the SCALPEL repository is located on your system.
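
Since the bind path must be absolute, it can be derived from wherever the repository was cloned instead of being typed by hand. This is a minimal sketch with a hypothetical helper name, `scalpel_bind_option`, that resolves a directory to its absolute path and prints the matching `--bind` value for `runOptions`.

```bash
# Hypothetical helper (not shipped with SCALPEL).
# Usage: scalpel_bind_option PATH_TO_SCALPEL_REPO
# Prints the --bind option with the repository's absolute, symlink-free path
# on both sides, so the path inside the container matches the host path.
scalpel_bind_option() {
    local repo_dir
    # cd runs in a subshell, so the caller's working directory is unchanged.
    repo_dir="$(cd "$1" && pwd -P)" || return 1
    printf '%s %s:%s\n' '--bind' "$repo_dir" "$repo_dir"
}
```

Running `scalpel_bind_option SCALPEL` from the directory containing the clone prints a string that can be pasted into the `runOptions` setting shown above.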

Reference

Franz Ake, Marcel Schilling, Sandra M. Fernández-Moya, Akshay Jaya Ganesh, Ana Gutiérrez-Franco, Lei Li, Mireya Plass
Quantification of transcript isoforms at the single-cell level using SCALPEL
Nat Commun 16, 6402 (2025). https://doi.org/10.1038/s41467-025-61118-0

Contact

Franz AKE – @aerodx5 – franzarnold.ake@gmail.com
GitHub: https://github.com/p-CMRC-LAB/SCALPEL
