Description
Hi, I ran ScanNeo2 with only dna_normal and dna_tumor data (no RNA-seq data), and I changed the config like this:
# Reference
# General settings
reference:
  release: 111
  nonchr: false
threads: 30
mapq: 30 # overall required mapping quality
basequal: 20 # overall required base quality
data:
  name: D1
  dnaseq:
    dna_normal: /hsfscqjf3/DIPSEQ/zfssz8/CNGB_DATA/BGISEQ01/DIPSEQ/DIPSEQT20/P24Z10200N0995_Temp/DT2411609481-1/250210_SEQ081_FP500002421_L01_SP2501130808/FP500002421_L01_375_1.fq.gz /hsfscqjf3/DIPSEQ/zfssz8/CNGB_DATA/BGISEQ01/DIPSEQ/DIPSEQT20/P24Z10200N0995_Temp/DT2411609481-1/250210_SEQ081_FP500002421_L01_SP2501130808/FP500002421_L01_375_2.fq.gz
    dna_tumor1: /hsfscqjf3/DIPSEQ/zfssz8/CNGB_DATA/BGISEQ01/DIPSEQ/DIPSEQT20/P24Z10200N0995_Temp/D2411320266/250210_SEQ082_FP500002422_L01_SP2501130799/FP500002422_L01_492_1.fq.gz /hsfscqjf3/DIPSEQ/zfssz8/CNGB_DATA/BGISEQ01/DIPSEQ/DIPSEQT20/P24Z10200N0995_Temp/D2411320266/250210_SEQ082_FP500002422_L01_SP2501130799/FP500002422_L01_492_2.fq.gz
    dna_tumor2: /hsfscqjf3/DIPSEQ/zfssz8/CNGB_DATA/BGISEQ01/DIPSEQ/DIPSEQT20/P24Z10200N0995_Temp/D2411320262/250210_SEQ082_FP500002422_L01_SP2501130795/FP500002422_L01_488_1.fq.gz /hsfscqjf3/DIPSEQ/zfssz8/CNGB_DATA/BGISEQ01/DIPSEQ/DIPSEQT20/P24Z10200N0995_Temp/D2411320262/250210_SEQ082_FP500002422_L01_SP2501130795/FP500002422_L01_488_2.fq.gz
  rnaseq:
    rna_tumor:
  normal: dna_normal
  custom:
    variants:
    hlatyping:
      MHC-I:
      MHC-II:
# pre-processing (only applied on fastq reads)
preproc:
  activate: true # whether (=true) or not (=false) to include pre-processing
  minlen: 10
  slidingwindow:
    activate: true
    wsize: 3
    wqual: 20
# alignment
align:
  chimSegmentMin: 20
  chimScoreMin: 10
  chimJunctionOverhangMin: 10
  chimScoreDropMax: 30
  chimScoreSeparation: 10
# variant calling
# alternative splicing
altsplicing:
  activate: true # whether (=true) or not (=false) to include alternative splicing events
  confidence: 3 # confidence level (1,2 or 3) - filtering of input alignments
  iterations: 5 # number of iterations (when adding intro edges) - increases sensitivity
  edgelimit: 250 # limit max number of edges in graph - affects the runtime
# exitron splicing
exitronsplicing:
  activate: true # whether (=true) or not (=false) to include exitron-splicing events
  ao: 3 # allele observation
  pso: 0.05 # percent spliced out
  #strand: 1 # strand specificity of library (0=unstranded, 1=forward, 2=reverse)
  strand: XS # strand specificity of library (0=XS, 1=RF, 2=FR)
# gene fusion
genefusion:
  activate: true # whether (=true) or not (=false) to include gene fusion events
  maxevalue: 0.3
  suppreads: 2 # all fusions with less than suppreads are discarded
  maxsuppreads: 1000
  maxidentity: 0.3 # genes with fraction of identity are discarded (homologs)
  hpolymerlen: 6 # removes breakpoints adjacent to homopolymers of length
  readthroughdist: 10000 # distance between breakpoints with less than distance
  minanchorlen: 20 # removes fusions whose segments are less than minchimlen
  splicedevents: 4 # fusions between genes need at least this many spliced breakpoints
  maxkmer: 0.6 # remove reads with repetitive 3-mer that make up more than maxkmer
  fraglen: 200 # mean fragment length
  maxmismatch: 0.01
# indel
indel:
  activate: true # whether (=true) or not (=false) to include indels
  type: all # long, short, all
  mode: DNA # DNA, RNA or BOTH
  # strategy for optimizing posterior probability threshold
  strategy: OPTIMAL_F_SCORE # OPTIMAL_F_SCORE, FALSE_DISCOVERY_RATE, CONSTANT
  fscorebeta: 1.0 # rel. weight of recall to precision (when OPTIMAL_F_SCORE is selected)
  fdr: 0.05 # false discovery rate (when FALSE_DISCOVERY_RATE is selected)
  sliplen: 8 # min number of reference bases to suspect slippage event
  sliprate: 0.1 # frequency of slippage when it is suspected
quantification:
  mode: DNA # DNA, RNA or BOTH
hlatyping:
  class: BOTH # I, II or BOTH
  # specific path for class II hlatyping (only required when class: II, or BOTH)
  MHC-I_mode: DNA # DNA, RNA, or custom (if empty alleles have to be specified in custom)
  MHC-II_mode: DNA # DNA, RNA, or custom (if empty alleles have to be specified in custom)
  # specific path for class II hlatyping (only required when class: II, or BOTH)
  freqdata: /hsfscqjf1/ST_CQ/P23Z32300N0005/lvmeiqi/software/miniconda3/envs/scanneo2/soft/hlahd.1.7.0/freq_data/
  split: /hsfscqjf1/ST_CQ/P23Z32300N0005/lvmeiqi/software/miniconda3/envs/scanneo2/soft/hlahd.1.7.0/HLA_gene.split.txt
  dict: /hsfscqjf1/ST_CQ/P23Z32300N0005/lvmeiqi/software/miniconda3/envs/scanneo2/soft/hlahd.1.7.0/dictionary/
prioritization:
  class: I # I, II or BOTH
  lengths:
    MHC-I: 8,9,10,11
    MHC-II: 13,14,15
And I got this error:
Config file /hsfscqjf1/ST_CQ/P24Z32300N0028/lvmeiqi/Project/1.ESCA/1.scanneo2/D1/config.yaml is extended by additional config specified via the command line.
Traceback (most recent call last):
File "/hsfscqjf1/ST_CQ/P23Z32300N0005/lvmeiqi/software/miniconda3/envs/scanneo2/lib/python3.12/site-packages/snakemake/cli.py", line 1898, in args_to_api
dag_api = workflow_api.dag(
^^^^^^^^^^^^^^^^^
File "/hsfscqjf1/ST_CQ/P23Z32300N0005/lvmeiqi/software/miniconda3/envs/scanneo2/lib/python3.12/site-packages/snakemake/api.py", line 326, in dag
return DAGApi(
^^^^^^^
File "<string>", line 6, in __init__
File "/hsfscqjf1/ST_CQ/P23Z32300N0005/lvmeiqi/software/miniconda3/envs/scanneo2/lib/python3.12/site-packages/snakemake/api.py", line 436, in __post_init__
self.workflow_api._workflow.dag_settings = self.dag_settings
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/hsfscqjf1/ST_CQ/P23Z32300N0005/lvmeiqi/software/miniconda3/envs/scanneo2/lib/python3.12/site-packages/snakemake/api.py", line 383, in _workflow
workflow.include(
File "/hsfscqjf1/ST_CQ/P23Z32300N0005/lvmeiqi/software/miniconda3/envs/scanneo2/lib/python3.12/site-packages/snakemake/workflow.py", line 1382, in include
exec(compile(code, snakefile.get_path_or_uri(), "exec"), self.globals)
File "/hsfscqjf1/ST_CQ/P24Z32300N0028/lvmeiqi/Project/1.ESCA/1.scanneo2/D1/workflow/Snakefile", line 27, in <module>
include: "rules/custom.smk"
File "/hsfscqjf1/ST_CQ/P23Z32300N0005/lvmeiqi/software/miniconda3/envs/scanneo2/lib/python3.12/site-packages/snakemake/workflow.py", line 1382, in include
exec(compile(code, snakefile.get_path_or_uri(), "exec"), self.globals)
File "/hsfscqjf1/ST_CQ/P24Z32300N0028/lvmeiqi/Project/1.ESCA/1.scanneo2/D1/workflow/rules/common.smk", line 126, in <module>
config['data'] = data_structure(config['data'])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/hsfscqjf1/ST_CQ/P24Z32300N0028/lvmeiqi/Project/1.ESCA/1.scanneo2/D1/workflow/rules/common.smk", line 11, in data_structure
config['data']['rnaseq'], filetype, readtype = handle_seqfiles(config['data']['rnaseq'])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/hsfscqjf1/ST_CQ/P24Z32300N0028/lvmeiqi/Project/1.ESCA/1.scanneo2/D1/workflow/rules/common.smk", line 64, in handle_seqfiles
return mod_seqdata, filetype[0], readtype[0]
^^^^^^^^^^^^^^
IndexError: list index out of range
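A plausible reading of the traceback is that `handle_seqfiles` collects one file type and one read type per configured read group, and that the `rnaseq` section, whose only key `rna_tumor` has no value, yields empty lists, so `filetype[0]` fails. The actual body of `handle_seqfiles` is not shown here, so the Python sketch below is a hypothetical stand-in (the function name `handle_seqfiles_sketch`, the suffix checks, and the `PE`/`SE` labels are all assumptions). It only illustrates the suspected failure mode and one possible guard; it is not ScanNeo2's implementation.

```python
def handle_seqfiles_sketch(seqdata):
    """Hypothetical stand-in for handle_seqfiles in workflow/rules/common.smk."""
    mod_seqdata = {}
    filetype, readtype = [], []

    for group, files in (seqdata or {}).items():
        # A key such as `rna_tumor:` with no value is loaded from YAML as None.
        if not files:
            continue
        paths = files.split()
        mod_seqdata[group] = paths
        # Assumed classification by file suffix and number of files.
        is_fastq = paths[0].endswith((".fq.gz", ".fastq.gz"))
        filetype.append(".fq.gz" if is_fastq else ".bam")
        readtype.append("PE" if len(paths) == 2 else "SE")

    # Guard: without this check, indexing the empty lists raises
    # "IndexError: list index out of range", as in the traceback.
    if not filetype:
        return mod_seqdata, None, None
    return mod_seqdata, filetype[0], readtype[0]


# The rnaseq section from the config above, as a YAML loader would return it.
rnaseq = {"rna_tumor": None}
result = handle_seqfiles_sketch(rnaseq)
```

With the guard in place, an empty `rnaseq` section returns an empty mapping and `None` for both types instead of raising. Whether the rest of the workflow accepts `None` there is a separate question that this sketch does not answer.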
What should I do when I don't have RNA data?