Difference monomers count between all.repeats.csv and fasta_peaks.csv #34

@GLagunas-Robles

Hello,

Thanks again for this awesome tool! I have a question about correctly calculating the number of monomers present in the output.

My question is about a difference between the number of repeats reported in plot/*fasta_peaks.csv and the number I calculated from the all repeats file using the R script below. I think the repeats are being counted using the summary of repetitive regions, but the circos plot seems to use the all repeats file, because some of the plotted regions are not present in the summary file. Any advice on which file to use would be great!

German

library(dplyr)

# Count monomers per width in the all repeats table, then total their length
df_summary <- all.repeats.fasta.csv %>%
  group_by(width) %>%
  summarise(repeat_count = n(), .groups = "drop") %>%
  mutate(total_length = repeat_count * width)

These are the results of my calculation from the all repeats file:

width | repeat_count | total_length
199 | 3584 | 713216
148 | 2391 | 353868

These are the results from the CSV output in the plot directory:

repeat_length | no_repeats | total_length
199 | 7166 | 1426678
148 | 3589 | 531108
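
To make the gap easier to see, here is a minimal sketch in base R that puts the two tables side by side. The values are copied from the rows shown above; the data frame names `mine` and `plot_csv` and the `ratio` column are only illustrative and are not part of the tool's output.

```r
# Values copied from the two tables above; object names are illustrative only.
mine <- data.frame(
  width = c(199, 148),
  repeat_count = c(3584, 2391)
)
plot_csv <- data.frame(
  repeat_length = c(199, 148, 230),
  no_repeats = c(7166, 3589, 1695)
)

# Left join on monomer width, keeping widths that appear only in the plot CSV.
cmp <- merge(plot_csv, mine,
             by.x = "repeat_length", by.y = "width", all.x = TRUE)

# Ratio of the plot CSV count to my count; NA where my table has no such width.
cmp$ratio <- cmp$no_repeats / cmp$repeat_count
print(cmp)
```

With these numbers the plot CSV count is roughly 2 times my count for the 199 bp monomer and roughly 1.5 times for the 148 bp monomer, and the 230 bp row has no match in the rows I listed from my own summary.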
230 | 1695 | 389863
