
Introduce LAI to LTR_retriever for measurement of genome assembly continuity

Pre-release
@oushujun oushujun released this 07 Jan 18:03
· 229 commits to master since this release

New feature: The LTR-RT Assembly Index (LAI) for evaluation of genome assembly continuity
Description: LTR retrotransposons (LTR-RTs) are very difficult to assemble because they are highly repetitive (making up as much as 75% of a genome, e.g., in maize) and long (up to 20 kb). LAI rests on a simple idea: a more continuous genome assembly should contain more intact LTR-RTs. This module uses the list of intact LTR-RTs and the whole-genome LTR-RT annotation produced by LTR_retriever (*.pass.list and *.out, respectively) to calculate LAI. A window-based calculation is implemented to estimate regional continuity. A manuscript describing this feature is in preparation.
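The window-based idea can be sketched roughly as follows. This is an illustration only, not the LAI script shipped with LTR_retriever: it assumes the index is the percentage of LTR-RT sequence in a window that belongs to intact elements, and the function names, the half-open `(start, end)` interval format, and the `window`/`step` parameters are all hypothetical. Parsing of *.pass.list and *.out is left out.

```python
def merge_intervals(intervals):
    """Merge overlapping half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def covered_in_window(intervals, win_start, win_end):
    """Count bases of the window covered by at least one interval."""
    total = 0
    for start, end in merge_intervals(intervals):
        total += max(0, min(end, win_end) - max(start, win_start))
    return total


def windowed_lai(intact, all_ltr, seq_len, window, step):
    """Return (win_start, win_end, index) per window.

    Assumed definition: index = 100 * intact LTR-RT bp / total LTR-RT bp.
    Windows without any LTR-RT sequence are reported as 0.0.
    """
    results = []
    for win_start in range(0, seq_len, step):
        win_end = min(win_start + window, seq_len)
        total_bp = covered_in_window(all_ltr, win_start, win_end)
        intact_bp = covered_in_window(intact, win_start, win_end)
        index = 100.0 * intact_bp / total_bp if total_bp else 0.0
        results.append((win_start, win_end, index))
    return results
```

With non-overlapping windows (`step == window`) each base is counted once; a smaller `step` would give sliding windows that smooth the regional estimate.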

Other features:

  1. Improved purging criteria: introduced an identity cutoff for alignment hits (>=30%) and replaced the alignment-length criterion with an identity-length criterion, where identity-length = alignment length - mismatches; a real hit requires identity-length >=90.
  2. Added scripts to identify solo LTRs and complete LTRs, estimate the solo-to-complete ratio for each family, and count family sizes in the genome. This code was initially developed for this study: https://www.nature.com/articles/s41467-017-02546-5
  3. Enforced a minimum internal-region length (>=100 bp) for LTR candidates.
  4. Updated the manual.
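The thresholds in items 1 and 3 can be read as simple predicates. The sketch below is a minimal illustration of the stated cutoffs, not the code used in LTR_retriever; the function and constant names are hypothetical, and it assumes identity is given as a percentage and lengths as base counts.

```python
MIN_IDENTITY = 30.0          # percent identity cutoff for an alignment hit
MIN_IDENTITY_LENGTH = 90     # alignment length minus mismatches
MIN_INTERNAL_LENGTH = 100    # minimum internal-region length in bp


def is_real_hit(identity, alignment_length, mismatches):
    """Apply the identity cutoff and the identity-length criterion."""
    identity_length = alignment_length - mismatches
    return identity >= MIN_IDENTITY and identity_length >= MIN_IDENTITY_LENGTH


def internal_region_ok(internal_length):
    """Check the minimum internal-region length of an LTR candidate."""
    return internal_length >= MIN_INTERNAL_LENGTH
```

For example, a hit with alignment length 100 and 10 mismatches has an identity-length of 90 and passes, while the same mismatches over an alignment length of 95 give 85 and fail.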