Classification of ecDNAs #61

@silkrp

Hi, following up on our discussion in #59, I’ve been reviewing the CoRAL output for one of our samples (E25) and was wondering which files and fields should be used to identify the ecDNA predicted by CoRAL.

In the *.bed file output by CoRAL, it appears that CoRAL detected two ecDNAs (cycle_id 1 and cycle_id 2), both marked as iscyclic = True. This assumes that iscyclic = True denotes an ecDNA amplicon and that each cycle_id refers to a different ecDNA amplicon. The file looks as follows:

#chr    start   end     orientation     cycle_id        iscyclic        weight
chr12   68289411        68296626        +       1       True    280.03069256391854
chr12   68376688        68418711        -       1       True    280.03069256391854
chr12   68683243        69112846        -       1       True    280.03069256391854
chr4    53542873        53546479        +       2       True    35.20001540165041
chr12   69116164        69210212        +       2       True    35.20001540165041
chr12   57732353        57759576        +       2       True    35.20001540165041
chr12   67440960        67653722        +       2       True    35.20001540165041
chr12   68360060        68376613        +       2       True    35.20001540165041
chr12   69112911        69210212        +       2       True    35.20001540165041
chr12   57732353        58010216        +       2       True    35.20001540165041
chr4    55226054        55237799        +       2       True    35.20001540165041
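
For reference, the rows above can be grouped by cycle_id with a few lines of Python. This is only a minimal sketch: it assumes the whitespace-separated column order shown in the header line, and `summarize_cycles` is a hypothetical helper name, not part of CoRAL.

```python
from collections import defaultdict


def summarize_cycles(bed_text):
    """Group BED rows by cycle_id.

    Assumes the whitespace-separated column order shown in the header line:
    chr, start, end, orientation, cycle_id, iscyclic, weight.
    """
    cycles = defaultdict(lambda: {"segments": [], "iscyclic": None, "weight": None})
    for line in bed_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header
        chrom, start, end, orientation, cycle_id, iscyclic, weight = line.split()
        entry = cycles[int(cycle_id)]
        entry["segments"].append((chrom, int(start), int(end), orientation))
        entry["iscyclic"] = iscyclic == "True"
        entry["weight"] = float(weight)
    return dict(cycles)


# A few rows copied from the BED output above, used as sample input.
example = """#chr start end orientation cycle_id iscyclic weight
chr12 68289411 68296626 + 1 True 280.03069256391854
chr12 68376688 68418711 - 1 True 280.03069256391854
chr4 53542873 53546479 + 2 True 35.20001540165041
"""

for cycle_id, info in sorted(summarize_cycles(example).items()):
    chroms = sorted({seg[0] for seg in info["segments"]})
    print(cycle_id, info["iscyclic"], info["weight"], len(info["segments"]), chroms)
```

Grouping this way lists, for each cycle_id, the chromosomes it touches, its weight, and its iscyclic flag, which are the fields the question below is about.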

However, the corresponding amplicon_summary.txt for this sample only reports details for a single amplicon (AmpliconID = 1), which looks to be cycle_id 2 in the BED file above. This file looks as follows:

CoRAL v2.2.0
1/1 amplicons solved.
Runtime Limit: 21600 s
Profiling Enabled: False
-----------------------------------------------
AmpliconID = 1
#Intervals = 5
AmpliconIntervals:
        Amplicon0>chr4:53,438,912-54,728,886
        Amplicon0>chr4:54,923,878-55,499,030
        Amplicon0>chr12:57,630,264-58,121,487
        Amplicon0>chr12:64,120,264-64,851,857
        Amplicon0>chr12:67,340,269-69,310,269
Total Amplicon Size: 5,057,947
# Chromosomes: 2
# Sequence Edges: 57
# Concordant Edges: 52
# Discordant Edges: 26
# Non-Source Edges: 135
# Source Edges: 0
Cycle Decomposition Status: SUCCESS
        ModelMetadata: GREEDY, k=9, alpha=0.01, total_weights=195158103.68227065, resolution=0.1, num_path_constraints=21
        Heaviest graph walk solved was a cycle.
        Cycle=1;Copy_count=280.03069256391854;Segments=37+,41-,53-,52-,51-,50-,49-,48-,47-,46-,45-,44-,43-
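
One possible way to cross-check a summary cycle against the BED rows is to parse the `Cycle=...;Copy_count=...;Segments=...` line and compare its Copy_count with the BED weight column. The sketch below assumes the `key=value;...` layout seen in the line above; `parse_cycle_line` is a hypothetical helper, not a CoRAL function.

```python
def parse_cycle_line(line):
    """Parse a 'Cycle=..;Copy_count=..;Segments=..' line from the summary file.

    Assumes semicolon-separated key=value pairs and comma-separated segments
    that each end in an orientation sign (+ or -), as in the line above.
    """
    fields = dict(item.split("=", 1) for item in line.strip().split(";"))
    segments = [(int(seg[:-1]), seg[-1]) for seg in fields["Segments"].split(",")]
    return {
        "cycle_id": int(fields["Cycle"]),
        "copy_count": float(fields["Copy_count"]),
        "segments": segments,
    }


line = ("Cycle=1;Copy_count=280.03069256391854;"
        "Segments=37+,41-,53-,52-,51-,50-,49-,48-,47-,46-,45-,44-,43-")
cycle = parse_cycle_line(line)
print(cycle["cycle_id"], cycle["copy_count"], len(cycle["segments"]))
```

For the line above, the parsed Copy_count (280.03069256391854) is the same number as the weight listed for cycle_id 1 in the BED rows. Whether that is the intended mapping between the two files is part of what this question asks.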

Does the *.bed file indicate two distinct ecDNAs, or is one of the cycles (cycle_id 1) excluded from the ecDNA calls at a later stage?

Any clarification on how to interpret multiple cycle_id values in the BED file versus the summary output would be much appreciated. Ideally, which CoRAL files contain the final set of ecDNA predictions?
