Hi, following up on our discussion in #59, I’ve been reviewing the CoRAL output for one of our samples (E25) and was wondering which files and fields should be used to identify the ecDNA predicted by CoRAL.
In the `*.bed` file output by CoRAL, it appears that CoRAL detected two ecDNAs (`cycle_id` 1 and `cycle_id` 2), both marked as `iscyclic` = True. This assumes that `iscyclic` = True denotes an ecDNA amplicon, and that each `cycle_id` refers to a different ecDNA amplicon. The file looks as follows:
#chr start end orientation cycle_id iscyclic weight
chr12 68289411 68296626 + 1 True 280.03069256391854
chr12 68376688 68418711 - 1 True 280.03069256391854
chr12 68683243 69112846 - 1 True 280.03069256391854
chr4 53542873 53546479 + 2 True 35.20001540165041
chr12 69116164 69210212 + 2 True 35.20001540165041
chr12 57732353 57759576 + 2 True 35.20001540165041
chr12 67440960 67653722 + 2 True 35.20001540165041
chr12 68360060 68376613 + 2 True 35.20001540165041
chr12 69112911 69210212 + 2 True 35.20001540165041
chr12 57732353 58010216 + 2 True 35.20001540165041
chr4 55226054 55237799 + 2 True 35.20001540165041
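For reference, the per-cycle grouping I am describing can be reproduced with a short script. This is only a minimal sketch: it assumes the columns are whitespace-delimited in the order given by the header line above, and `group_cycles` is a hypothetical helper of my own, not part of CoRAL.

```python
def group_cycles(bed_text):
    """Group BED rows by cycle_id.

    Assumes columns: chr, start, end, orientation, cycle_id, iscyclic, weight.
    Returns {cycle_id: {"iscyclic": bool, "weight": float, "segments": [...]}}.
    """
    cycles = {}
    for line in bed_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header
        chrom, start, end, orient, cycle_id, iscyclic, weight = line.split()
        entry = cycles.setdefault(
            int(cycle_id),
            {"iscyclic": iscyclic == "True", "weight": float(weight), "segments": []},
        )
        entry["segments"].append((chrom, int(start), int(end), orient))
    return cycles


# Rows copied from the BED output shown above.
bed_text = """\
#chr start end orientation cycle_id iscyclic weight
chr12 68289411 68296626 + 1 True 280.03069256391854
chr12 68376688 68418711 - 1 True 280.03069256391854
chr12 68683243 69112846 - 1 True 280.03069256391854
chr4 53542873 53546479 + 2 True 35.20001540165041
chr12 69116164 69210212 + 2 True 35.20001540165041
chr12 57732353 57759576 + 2 True 35.20001540165041
chr12 67440960 67653722 + 2 True 35.20001540165041
chr12 68360060 68376613 + 2 True 35.20001540165041
chr12 69112911 69210212 + 2 True 35.20001540165041
chr12 57732353 58010216 + 2 True 35.20001540165041
chr4 55226054 55237799 + 2 True 35.20001540165041
"""

cycles = group_cycles(bed_text)
for cid, info in sorted(cycles.items()):
    chroms = sorted({seg[0] for seg in info["segments"]})
    print(cid, info["iscyclic"], len(info["segments"]), chroms)
```

Grouping this way shows which chromosomes each `cycle_id` touches and how many segments it contains, which is what I used to compare against the summary file.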
However, the corresponding `amplicon_summary.txt` for this sample only reports details for a single amplicon (AmpliconID = 1), which looks to correspond to `cycle_id` 2 in the BED file above. This file looks as follows:
CoRAL v2.2.0
1/1 amplicons solved.
Runtime Limit: 21600 s
Profiling Enabled: False
-----------------------------------------------
AmpliconID = 1
#Intervals = 5
AmpliconIntervals:
Amplicon0>chr4:53,438,912-54,728,886
Amplicon0>chr4:54,923,878-55,499,030
Amplicon0>chr12:57,630,264-58,121,487
Amplicon0>chr12:64,120,264-64,851,857
Amplicon0>chr12:67,340,269-69,310,269
Total Amplicon Size: 5,057,947
# Chromosomes: 2
# Sequence Edges: 57
# Concordant Edges: 52
# Discordant Edges: 26
# Non-Source Edges: 135
# Source Edges: 0
Cycle Decomposition Status: SUCCESS
ModelMetadata: GREEDY, k=9, alpha=0.01, total_weights=195158103.68227065, resolution=0.1, num_path_constraints=21
Heaviest graph walk solved was a cycle.
Cycle=1;Copy_count=280.03069256391854;Segments=37+,41-,53-,52-,51-,50-,49-,48-,47-,46-,45-,44-,43-
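One way to cross-check the two files is to match the `Copy_count` on the summary's cycle line against the `weight` column of the BED file. The sketch below does this under two assumptions of mine that I have not confirmed with the CoRAL documentation: that the cycle line is always `;`-separated `key=value` pairs as shown above, and that `Copy_count` and `weight` are the same quantity. `parse_cycle_line` is a hypothetical helper, not a CoRAL function.

```python
import math


def parse_cycle_line(line):
    """Parse a 'Cycle=...;Copy_count=...;Segments=...' line into a dict."""
    fields = dict(part.split("=", 1) for part in line.strip().split(";"))
    return {
        "cycle": int(fields["Cycle"]),
        "copy_count": float(fields["Copy_count"]),
        "segments": fields["Segments"].split(","),
    }


# Cycle line copied from amplicon_summary.txt above.
summary_line = (
    "Cycle=1;Copy_count=280.03069256391854;"
    "Segments=37+,41-,53-,52-,51-,50-,49-,48-,47-,46-,45-,44-,43-"
)
parsed = parse_cycle_line(summary_line)

# Weights per cycle_id, copied from the BED file above.
bed_weights = {1: 280.03069256391854, 2: 35.20001540165041}

# BED cycle_ids whose weight equals the summary's Copy_count (assumed comparable).
matches = [
    cid for cid, weight in bed_weights.items()
    if math.isclose(weight, parsed["copy_count"])
]
print(parsed["cycle"], len(parsed["segments"]), matches)
```

If the two values are in fact comparable, this gives a quick way to see which BED `cycle_id` a given summary cycle line refers to.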
Does the `*.bed` file indicate two distinct ecDNAs, or is one of the cycles (`cycle_id` 1) filtered out at a later stage as not being an ecDNA?
Any clarification on how to interpret multiple `cycle_id` values in the BED file versus the summary output would be much appreciated. Ideally, I would also like to know which CoRAL files contain the final set of ecDNA predictions.