How can I fix the config file when I only have DNA data? #42

Hi, I am running ScanNeo2 with only dna_normal and dna_tumor data. I changed the config like this:
### Reference

# General settings
reference:
  release: 111
  nonchr: false
threads: 30
mapq: 30  # overall required mapping quality
basequal: 20  # overall required base quality

data:
  name:  D1
  dnaseq:
    dna_normal: /hsfscqjf3/DIPSEQ/zfssz8/CNGB_DATA/BGISEQ01/DIPSEQ/DIPSEQT20/P24Z10200N0995_Temp/DT2411609481-1/250210_SEQ081_FP500002421_L01_SP2501130808/FP500002421_L01_375_1.fq.gz /hsfscqjf3/DIPSEQ/zfssz8/CNGB_DATA/BGISEQ01/DIPSEQ/DIPSEQT20/P24Z10200N0995_Temp/DT2411609481-1/250210_SEQ081_FP500002421_L01_SP2501130808/FP500002421_L01_375_2.fq.gz
    dna_tumor1: /hsfscqjf3/DIPSEQ/zfssz8/CNGB_DATA/BGISEQ01/DIPSEQ/DIPSEQT20/P24Z10200N0995_Temp/D2411320266/250210_SEQ082_FP500002422_L01_SP2501130799/FP500002422_L01_492_1.fq.gz /hsfscqjf3/DIPSEQ/zfssz8/CNGB_DATA/BGISEQ01/DIPSEQ/DIPSEQT20/P24Z10200N0995_Temp/D2411320266/250210_SEQ082_FP500002422_L01_SP2501130799/FP500002422_L01_492_2.fq.gz
    dna_tumor2: /hsfscqjf3/DIPSEQ/zfssz8/CNGB_DATA/BGISEQ01/DIPSEQ/DIPSEQT20/P24Z10200N0995_Temp/D2411320262/250210_SEQ082_FP500002422_L01_SP2501130795/FP500002422_L01_488_1.fq.gz /hsfscqjf3/DIPSEQ/zfssz8/CNGB_DATA/BGISEQ01/DIPSEQ/DIPSEQT20/P24Z10200N0995_Temp/D2411320262/250210_SEQ082_FP500002422_L01_SP2501130795/FP500002422_L01_488_2.fq.gz
  rnaseq:
    rna_tumor:
  normal: dna_normal

  custom:
    variants:
    hlatyping:
      MHC-I:
      MHC-II:

### pre-processing (only applied on fastq reads)
preproc:
  activate: true  # whether (=true) or not (=false) to include pre-processing
  minlen: 10
  slidingwindow:
    activate: true
    wsize: 3
    wqual: 20

### alignment
align:
  chimSegmentMin: 20
  chimScoreMin: 10
  chimJunctionOverhangMin: 10
  chimScoreDropMax: 30
  chimScoreSeparation: 10

### variant calling
# alternative splicing
altsplicing:
  activate: true # whether (=true) or not (=false) to include alternative splicing events
  confidence: 3  # confidence level (1,2 or 3) - filtering of input alignments
  iterations: 5 # number of iterations (when adding intron edges) - increases sensitivity
  edgelimit: 250  # limit max number of edges in graph - affects the runtime

# exitron splicing
exitronsplicing:
  activate: true # whether (=true) or not (=false) to include exitron-splicing events
  ao: 3  # allele observation
  pso: 0.05  # percent spliced out
  #strand: 1 # strand specificity of library (0=unstranded, 1=forward, 2=reverse)
  strand: XS # strand specificity of library (0=XS, 1=RF, 2=FR)

# gene fusion
genefusion:
  activate: true # whether (=true) or not (=false) to include gene fusion events
  maxevalue: 0.3
  suppreads: 2  # all fusions with less than suppreads are discarded
  maxsuppreads: 1000
  maxidentity: 0.3  # genes with fraction of identity are discarded (homologs)
  hpolymerlen: 6  # removes breakpoints adjacent to homopolymers of length
  readthroughdist: 10000  # distance between breakpoints with less than distance
  minanchorlen: 20  # removes fusions whose segments are less than minchimlen
  splicedevents: 4  # fusions between genes need at least this many spliced breakpoints
  maxkmer: 0.6  # remove reads with repetitive 3-mer that make up more than maxkmer
  fraglen: 200 # mean fragment length
  maxmismatch: 0.01

### indel
indel:
  activate: true # whether (=true) or not (=false) to include indels
  type: all # long, short, all
  mode: DNA  # DNA, RNA or BOTH -
  # strategy for optimizing posterior probability threshold
  strategy: OPTIMAL_F_SCORE # OPTIMAL_F_SCORE, FALSE_DISCOVERY_RATE, CONSTANT
  fscorebeta: 1.0  # rel. weight of recall to precision (when OPTIMAL_F_SCORE is selected)
  fdr: 0.05  # false discovery rate (when FALSE_DISCOVERY_RATE is selected)
  sliplen: 8  # min number of reference bases to suspect slippage event
  sliprate: 0.1  # frequency of slippage when it is suspected

quantification:
  mode: DNA # DNA, RNA or BOTH

hlatyping:
  class: BOTH # I, II or BOTH
  # specific path for class II hlatyping (only required when class: II, or BOTH)
  MHC-I_mode: DNA # DNA, RNA, or custom (if empty alleles have to be specified in custom)
  MHC-II_mode: DNA # DNA, RNA, or custom (if empty alleles have to be specified in custom)

  # specific path for class II hlatyping (only required when class: II, or BOTH)
  freqdata: /hsfscqjf1/ST_CQ/P23Z32300N0005/lvmeiqi/software/miniconda3/envs/scanneo2/soft/hlahd.1.7.0/freq_data/
  split: /hsfscqjf1/ST_CQ/P23Z32300N0005/lvmeiqi/software/miniconda3/envs/scanneo2/soft/hlahd.1.7.0/HLA_gene.split.txt
  dict: /hsfscqjf1/ST_CQ/P23Z32300N0005/lvmeiqi/software/miniconda3/envs/scanneo2/soft/hlahd.1.7.0/dictionary/

prioritization:
  class: I # I, II or BOTH
  lengths:
    MHC-I: 8,9,10,11
    MHC-II: 13,14,15


And I got this error:
Config file /hsfscqjf1/ST_CQ/P24Z32300N0028/lvmeiqi/Project/1.ESCA/1.scanneo2/D1/config.yaml is extended by additional config specified via the command line.
Traceback (most recent call last):
  File "/hsfscqjf1/ST_CQ/P23Z32300N0005/lvmeiqi/software/miniconda3/envs/scanneo2/lib/python3.12/site-packages/snakemake/cli.py", line 1898, in args_to_api
    dag_api = workflow_api.dag(
              ^^^^^^^^^^^^^^^^^
  File "/hsfscqjf1/ST_CQ/P23Z32300N0005/lvmeiqi/software/miniconda3/envs/scanneo2/lib/python3.12/site-packages/snakemake/api.py", line 326, in dag
    return DAGApi(
           ^^^^^^^
  File "<string>", line 6, in __init__
  File "/hsfscqjf1/ST_CQ/P23Z32300N0005/lvmeiqi/software/miniconda3/envs/scanneo2/lib/python3.12/site-packages/snakemake/api.py", line 436, in __post_init__
    self.workflow_api._workflow.dag_settings = self.dag_settings
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/hsfscqjf1/ST_CQ/P23Z32300N0005/lvmeiqi/software/miniconda3/envs/scanneo2/lib/python3.12/site-packages/snakemake/api.py", line 383, in _workflow
    workflow.include(
  File "/hsfscqjf1/ST_CQ/P23Z32300N0005/lvmeiqi/software/miniconda3/envs/scanneo2/lib/python3.12/site-packages/snakemake/workflow.py", line 1382, in include
    exec(compile(code, snakefile.get_path_or_uri(), "exec"), self.globals)
  File "/hsfscqjf1/ST_CQ/P24Z32300N0028/lvmeiqi/Project/1.ESCA/1.scanneo2/D1/workflow/Snakefile", line 27, in <module>
    include: "rules/custom.smk"
  File "/hsfscqjf1/ST_CQ/P23Z32300N0005/lvmeiqi/software/miniconda3/envs/scanneo2/lib/python3.12/site-packages/snakemake/workflow.py", line 1382, in include
    exec(compile(code, snakefile.get_path_or_uri(), "exec"), self.globals)
  File "/hsfscqjf1/ST_CQ/P24Z32300N0028/lvmeiqi/Project/1.ESCA/1.scanneo2/D1/workflow/rules/common.smk", line 126, in <module>
    config['data'] = data_structure(config['data'])
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/hsfscqjf1/ST_CQ/P24Z32300N0028/lvmeiqi/Project/1.ESCA/1.scanneo2/D1/workflow/rules/common.smk", line 11, in data_structure
    config['data']['rnaseq'], filetype, readtype  = handle_seqfiles(config['data']['rnaseq'])
                                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/hsfscqjf1/ST_CQ/P24Z32300N0028/lvmeiqi/Project/1.ESCA/1.scanneo2/D1/workflow/rules/common.smk", line 64, in handle_seqfiles
    return mod_seqdata, filetype[0], readtype[0]
                         ^^^^^^^^^^^^^^
IndexError: list index out of range
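
The last frames of the traceback suggest that `handle_seqfiles` collects per-file lists and then takes their first element, so an `rnaseq` section whose only entry (`rna_tumor:`) is empty would leave those lists with nothing in them. The sketch below is a hypothetical reconstruction, not the actual code in `workflow/rules/common.smk`: only the final `return` line is taken from the traceback, while the loop, the file-type check, and the `handle_seqfiles_guarded` variant are assumptions made for illustration.

```python
def handle_seqfiles_sketch(seqdata):
    """Hypothetical stand-in for handle_seqfiles (assumed logic)."""
    mod_seqdata, filetype, readtype = {}, [], []
    for group, files in seqdata.items():
        if not files:
            # An empty YAML value such as `rna_tumor:` loads as None.
            continue
        paths = files.split()
        mod_seqdata[group] = paths
        is_fastq = paths[0].endswith((".fq", ".fq.gz", ".fastq", ".fastq.gz"))
        filetype.append("fastq" if is_fastq else "bam")
        readtype.append("PE" if len(paths) == 2 else "SE")
    # Same return statement as in the traceback: fails if both lists are empty.
    return mod_seqdata, filetype[0], readtype[0]


def handle_seqfiles_guarded(seqdata):
    """Assumed variant that tolerates a section with no files at all."""
    try:
        return handle_seqfiles_sketch(seqdata)
    except IndexError:
        return {}, None, None


if __name__ == "__main__":
    rnaseq = {"rna_tumor": None}  # mirrors the empty `rna_tumor:` entry
    try:
        handle_seqfiles_sketch(rnaseq)
    except IndexError as err:
        print("reproduced:", err)
    print(handle_seqfiles_guarded(rnaseq))
```

If this reading is right, the crash comes from the empty `rnaseq` section itself rather than from any of the DNA paths.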

What should I do when I don't have RNA data?
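
A possible pre-flight check for this situation is sketched below. It assumes, without confirmation from the ScanNeo2 documentation, that the `altsplicing`, `exitronsplicing`, and `genefusion` modules need RNA-seq input; the function name `rna_modules_without_rna` and the trimmed-down `config` dictionary are hypothetical and only mirror the keys shown in the config above.

```python
# Assumption: these modules work on RNA-seq data and cannot run on DNA alone.
RNA_MODULES = ("altsplicing", "exitronsplicing", "genefusion")


def rna_modules_without_rna(config):
    """Return activated RNA-dependent modules when no RNA files are configured."""
    rnaseq = (config.get("data") or {}).get("rnaseq") or {}
    has_rna = any(rnaseq.values())  # empty YAML values load as None
    if has_rna:
        return []
    return [m for m in RNA_MODULES if (config.get(m) or {}).get("activate")]


if __name__ == "__main__":
    # Trimmed-down version of the config above, as a plain dictionary.
    config = {
        "data": {
            "dnaseq": {"dna_normal": "n_1.fq.gz n_2.fq.gz"},
            "rnaseq": {"rna_tumor": None},
        },
        "altsplicing": {"activate": True},
        "exitronsplicing": {"activate": True},
        "genefusion": {"activate": True},
        "indel": {"activate": True, "mode": "DNA"},
    }
    for module in rna_modules_without_rna(config):
        print(f"{module} is activated but no RNA data is configured")
```

Under that assumption, all three modules would be flagged for the config above, since `rna_tumor:` is empty while each module still has `activate: true`.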
