Quantification of isoform usage and alternative polyadenylation (APA) from single-cell RNA-seq using a Nextflow-based pipeline.

SCALPEL: A Nextflow-based Pipeline for Isoform Quantification at Single-Cell Resolution

About the Project

SCALPEL is a robust pipeline designed for transcript isoform quantification and alternative polyadenylation (APA) characterization using 3'-tagged single-cell RNA-seq (scRNA-seq) data. Built with Nextflow, it integrates multiple processing steps including read quantification, APA annotation, and isoform usage analysis.

Installation Options

Prerequisite:

Nextflow v24.10.6 (available from the official Nextflow page or via Conda)
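
One way to confirm the prerequisite before running the pipeline is to compare the installed Nextflow version against 24.10.6. The helper below is a minimal sketch and not part of SCALPEL: `version_ge` is a hypothetical name, it assumes purely numeric dot-separated versions, and it treats the listed version as a minimum, which this README does not state explicitly.

```shell
# version_ge A B: succeed when dotted numeric version A >= B.
# Hypothetical helper; assumes every field is a plain integer (e.g. 24.10.6).
version_ge() {
    local a="$1" b="$2" x y
    while [ -n "$a" ] || [ -n "$b" ]; do
        x="${a%%.*}"; y="${b%%.*}"      # leading field of each version
        # `[ -gt ]` compares decimal integers, so a field like 04 is fine.
        [ "${x:-0}" -gt "${y:-0}" ] && return 0
        [ "${x:-0}" -lt "${y:-0}" ] && return 1
        # Drop the leading field; an exhausted version counts as 0 from here on.
        case "$a" in *.*) a="${a#*.}" ;; *) a="" ;; esac
        case "$b" in *.*) b="${b#*.}" ;; *) b="" ;; esac
    done
    return 0
}

# Usage (requires Nextflow on PATH; assumes the banner printed by
# `nextflow -version` contains a line of the form "version X.Y.Z build N"):
#   installed=$(nextflow -version | grep -Eo 'version [0-9]+(\.[0-9]+)+' | awk '{print $2}')
#   version_ge "$installed" 24.10.6 || echo "Nextflow $installed is older than 24.10.6" >&2
```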

SCALPEL can be installed and run using one of the following options:

Option 1: Using Conda environment

  1. Clone the repository
git clone https://github.com/p-CMRC-LAB/SCALPEL.git
cd SCALPEL
  2. Create the Conda environment
conda env create -f requirements.yml
conda activate scalpel_conda
  3. Run SCALPEL within the environment
nextflow run -resume main.nf \
  --sequencing chromium \
  --samplesheet path/to/samplesheet.csv \
  --transcriptome path/to/gencode.transcripts.fa \
  --gtf path/to/gencode.annotation.gtf \
  --ipdb path/to/mm10.polyA.track \
  --barcodes path/to/barcodes.csv \
  --clusters path/to/clusters.txt

Option 2: Using Apptainer container

A prebuilt Apptainer container with all SCALPEL dependencies is available for download; its URL appears in the wget command in step 1 below.

  1. Download the container and clone the repository
wget https://data.cyverse.org/dav-anon/iplant/home/franzx5/SCALPEL.container.sif
git clone https://github.com/p-CMRC-LAB/SCALPEL.git
cd SCALPEL
  2. Run SCALPEL using the container
nextflow run /path/to/SCALPEL/main.nf \
  -with-apptainer /path/to/SCALPEL.container.sif \
  --sequencing chromium \
  --samplesheet path/to/samplesheet.csv \
  --transcriptome path/to/gencode.transcripts.fa \
  --gtf path/to/gencode.annotation.gtf \
  --ipdb path/to/mm10.polyA.track

Required Input Files

| Parameter | Description |
| --- | --- |
| --samplesheet | CSV with sample names and paths to FASTQ/BAM/CellRanger output |
| --transcriptome | FASTA of reference transcriptome |
| --gtf | GTF annotation file |
| --ipdb | Internal priming annotation file |
| --barcodes | (Optional) Barcode whitelist per sample |
| --clusters | (Optional) Tab-delimited file with cell-to-cluster mappings |
| --sequencing | Must be chromium or dropseq |
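
Since a long pipeline run can fail late on a mistyped path, a pre-flight check of the inputs listed above may be useful. The following is a sketch under stated assumptions: `valid_sequencing` and `check_readable` are hypothetical helper names, not SCALPEL commands, and the only facts taken from this README are the two accepted `--sequencing` values and the placeholder paths.

```shell
# valid_sequencing VALUE: succeed only for the two values the README lists.
valid_sequencing() {
    case "$1" in
        chromium|dropseq) return 0 ;;
        *) return 1 ;;
    esac
}

# check_readable FILE...: report each missing or unreadable path on stderr
# and fail if there is at least one. Hypothetical helper, not part of SCALPEL.
check_readable() {
    local status=0 f
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "missing or unreadable: $f" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage with the placeholder paths from the README:
#   valid_sequencing chromium &&
#   check_readable path/to/samplesheet.csv path/to/gencode.transcripts.fa \
#                  path/to/gencode.annotation.gtf path/to/mm10.polyA.track
```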

Reference files:

Output Files and Execution Notes

After execution, SCALPEL generates a results/ directory containing key outputs for downstream analysis.

Key Output Files

| File / Pattern | Description |
| --- | --- |
| *_filtered.bam | BAM files with deduplicated reads excluding internal priming artifacts. |
| *_filtered.bam.bai | BAM index files. |
| *_APADGE.txt | APA-aware isoform-level expression matrix per sample. |
| *_seurat.RDS | Seurat object per sample. |
| iDGE_seurat.RDS | Merged Seurat object across all samples. |
| DIU_table.csv | Differential isoform usage table. |
| Runfiles/ | Execution logs and process metadata. |

Notes

  • Per-sample output filenames are prefixed with the sample name.
  • Seurat .RDS files are ready for downstream visualization and clustering in R.
  • *_APADGE.txt matrices are compatible with other statistical environments.
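
To see at a glance which of the outputs listed above a run produced, the file patterns can be counted with `find`. This is a rough sketch: `count_outputs` is a hypothetical helper, and it searches recursively because the layout inside results/ is not described here. Note that the `*_seurat.RDS` pattern also matches the merged `iDGE_seurat.RDS` object.

```shell
# count_outputs DIR: print "<pattern><TAB><count>" for each output file
# pattern named in the README. Hypothetical helper, not part of SCALPEL.
count_outputs() {
    local dir="$1" pattern n
    for pattern in '*_filtered.bam' '*_filtered.bam.bai' '*_APADGE.txt' \
                   '*_seurat.RDS' 'DIU_table.csv'; do
        # -name matches the whole file name, so '*_filtered.bam' skips .bai files.
        n=$(find "$dir" -name "$pattern" | wc -l | tr -d '[:space:]')
        printf '%s\t%s\n' "$pattern" "$n"
    done
}

# Usage, from the directory where the pipeline was launched:
#   count_outputs results
```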

For downstream analysis tutorials, visit:


Customizing Execution with nextflow.config

To modify resource usage and process settings, edit the nextflow.config file. For example:

executor {
    name = 'slurm'               // Use 'local', 'slurm', etc.
    cpus = 64
}

process {
    withLabel: big_mem {
        cpus = 4
        memory = '8 GB'
    }
    withLabel: small_mem {
        cpus = 2
        memory = '2 GB'
    }
    // Additional process-specific settings...
}
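
As an alternative to editing the repository's nextflow.config in place, Nextflow accepts an additional configuration file through its `-c` option, and settings in that file take precedence over nextflow.config. The sketch below writes such a file for the `big_mem` label shown above. `write_override_config` and `custom.config` are hypothetical names, and the resource values are placeholders rather than recommendations.

```shell
# write_override_config FILE [MEMORY] [CPUS]: write a small override config
# for the big_mem label. Hypothetical helper; default values are placeholders.
write_override_config() {
    local out="$1" mem="${2:-16 GB}" cpus="${3:-8}"
    cat > "$out" <<EOF
process {
    withLabel: big_mem {
        cpus = ${cpus}
        memory = '${mem}'
    }
}
EOF
}

# Usage (requires Nextflow; append the SCALPEL options shown earlier):
#   write_override_config custom.config '32 GB' 8
#   nextflow run main.nf -c custom.config --sequencing chromium
```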

Apptainer Users: Configure runOptions

If using Apptainer, make sure to bind your local SCALPEL repository path inside the container by editing the following block in nextflow.config:

apptainer {
    enabled = true
    autoMounts = true
    runOptions = "--bind /path/to/SCALPEL:/path/to/SCALPEL"
}

Adjust /path/to/SCALPEL to the full absolute path where the SCALPEL repository is located on your system.
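
Because the bind path must be absolute, it can help to generate the runOptions value from the repository directory instead of typing it. A minimal sketch, assuming the same path is used on both sides of the bind as in the block above; `bind_option` is a hypothetical helper name.

```shell
# bind_option DIR: print a "--bind ABS:ABS" string for DIR, where ABS is the
# absolute, symlink-resolved path. Fails if DIR does not exist.
# Hypothetical helper, not part of SCALPEL.
bind_option() {
    local abs
    abs=$(CDPATH= cd -- "$1" 2>/dev/null && pwd -P) || return 1
    printf '%s %s:%s\n' --bind "$abs" "$abs"
}

# Usage, from inside the cloned repository; paste the printed value into
# runOptions in nextflow.config:
#   bind_option .
```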


Reference

Franz Ake, Sandra M. Fernández-Moya, Marcel Schilling, Akshay Jaya Ganesh, Ana Gutiérrez-Franco, Lei Li, Mireya Plass
Quantification of transcript isoforms at the single-cell level using SCALPEL
bioRxiv 2024.06.21.600022; https://doi.org/10.1101/2024.06.21.600022

Contact

Franz AKE – @aerodx5 – fake@idibell.cat
GitHub: https://github.com/p-CMRC-LAB/SCALPEL
