Releases · oushujun/LTR_retriever

25 Jun 19:20

oushujun

v3.0.4

afe4f59

v3.0.4 Latest


This update further enhances robustness for large genomes, streamlines overlap computations, and lays the groundwork for more scalable LTR discovery.

Major update

Reduced RepeatMasker memory footprint with run_RM_split.pl:

Splits large FASTA inputs into manageable chunks for RepeatMasker, masking them in parallel.
Skips already-masked chunks, then merges results into a single .masked file.
Runs RepeatMasker on the full dataset first, then automatically falls back to the chunked masking strategy when the primary call fails or yields no repeats; parallel jobs are capped to avoid OOM kills.
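
The chunking idea can be sketched in Python as follows. This is a minimal illustration, not the code in run_RM_split.pl: the function names, the `max_bp` parameter, and the choice to split on whole sequences by total length are all assumptions made for the example.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)

def parse_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, seq = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:], []
        else:
            seq.append(line)
    if header is not None:
        yield header, "".join(seq)

def chunk_records(records: Iterable[Record], max_bp: int) -> Iterator[List[Record]]:
    """Group whole records into chunks of at most max_bp bases in total.

    A single record longer than max_bp becomes a chunk of its own
    (hypothetical policy chosen for this sketch).
    """
    chunk, size = [], 0
    for header, seq in records:
        if chunk and size + len(seq) > max_bp:
            yield chunk
            chunk, size = [], 0
        chunk.append((header, seq))
        size += len(seq)
    if chunk:
        yield chunk

fasta = """>a
ACGTACGT
>b
ACGT
>c
ACGTACGTACGT
>d
AC
""".splitlines()

chunks = list(chunk_records(parse_fasta(fasta), max_bp=12))
chunk_ids = [[header for header, _ in chunk] for chunk in chunks]
print(chunk_ids)
```

Each chunk could then be written to its own file and masked independently, with a chunk skipped when its masked output already exists.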

Rewrote bed_intersect_wao.pl

Simplified the buffering logic: only “active” B intervals are kept in memory, and they are purged by chromosome and start/end comparisons.
Eliminated the circular-lookback logic in favor of a single pass with an in-memory buffer, supporting arbitrary chromosome orders.
Dynamically detects the number of columns in B to generate the correct “wao” dummy lines when no overlaps are found.
Speed is comparable to the original bedtools intersect -wao.
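
A rough Python sketch of a "wao"-style sweep with an active buffer is shown below. It is not a port of bed_intersect_wao.pl. It assumes three-column, half-open BED intervals, assumes A is sorted by start within each chromosome, and for simplicity indexes all of B by chromosome up front instead of streaming it. The tuple layout and the function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Interval = Tuple[str, int, int]  # (chrom, start, end), half-open like BED

def intersect_wao(a: Sequence[Interval], b: Sequence[Interval]) -> List[tuple]:
    """Pair every A interval with each overlapping B interval and the overlap
    length. A intervals without any overlap get a dummy B record and 0."""
    b_by_chrom: Dict[str, List[Interval]] = defaultdict(list)
    for iv in b:
        b_by_chrom[iv[0]].append(iv)
    for ivs in b_by_chrom.values():
        ivs.sort(key=lambda iv: iv[1])

    next_b: Dict[str, int] = defaultdict(int)               # next unread B per chromosome
    active: Dict[str, List[Interval]] = defaultdict(list)   # B intervals that may still overlap
    dummy = (".", -1, -1)  # a real tool would pad this to B's column count
    out = []
    for chrom, a_start, a_end in a:
        pool = b_by_chrom.get(chrom, [])
        # Pull in B intervals that start before this A interval ends.
        while next_b[chrom] < len(pool) and pool[next_b[chrom]][1] < a_end:
            active[chrom].append(pool[next_b[chrom]])
            next_b[chrom] += 1
        # Purge B intervals ending at or before a_start; later A intervals
        # on this chromosome start no earlier, so they cannot overlap either.
        active[chrom] = [iv for iv in active[chrom] if iv[2] > a_start]
        hits = [iv for iv in active[chrom] if iv[1] < a_end]
        if not hits:
            out.append((chrom, a_start, a_end) + dummy + (0,))
        for b_chrom, b_start, b_end in hits:
            overlap = min(a_end, b_end) - max(a_start, b_start)
            out.append((chrom, a_start, a_end, b_chrom, b_start, b_end, overlap))
    return out

a = [("chr2", 5, 15), ("chr1", 0, 10), ("chr1", 8, 20), ("chr1", 30, 40)]
b = [("chr1", 5, 9), ("chr1", 9, 12), ("chr2", 100, 200)]
result = intersect_wao(a, b)
for row in result:
    print("\t".join(map(str, row)))
```

Keeping per-chromosome state in dictionaries is what lets chromosomes appear in any order in A in this sketch.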

Minor update

Refactored LTR.identifier.pl
Fixed a stray commented-out guard so that undefined scan entries are now properly skipped (#193).


23 Jun 05:05

oushujun

v3.0.3

33d8970

v3.0.3

Major change

Introduce the -salvage [0|1] flag (default: 0) to recycle intermediate files and skip reruns when -salvage 1 is specified. This is particularly useful when processing large genomes (> 10 Gb) with limited walltime.

Reuse existing results in the Init, Major, and Trunc steps to skip reprocessing.
In the Major step, reuse TEsorter, HMM classification files, and processed candidates in the .defalse file.
Add new utility scripts under bin/:
- bed_intersect_wao.pl: bedtools-like intersect with ‘wao’ behavior and buffer
- filter_extend.fa_by_defalse.pl: filters extended FASTA by existing entries in the .defalse file
- filter_scn_by_defalse.pl: filters scn entries present in .defalse file
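
The salvage behaviour described above amounts to "skip a step when its output already exists". A minimal Python sketch of that pattern follows. It is not LTR_retriever's implementation: `run_step`, `fake_major_step`, and the file name `genome.fa.defalse` are placeholders, and treating any non-empty file as reusable is an assumption of the example.

```python
import os
import tempfile
from typing import Callable

def run_step(output_path: str, step: Callable[[str], None], salvage: bool) -> str:
    """Run `step` to create output_path, unless salvage is enabled and a
    non-empty result from a previous run is already present."""
    if salvage and os.path.isfile(output_path) and os.path.getsize(output_path) > 0:
        return "reused"
    step(output_path)
    return "ran"

def fake_major_step(path: str) -> None:
    # Stand-in for an expensive pipeline step (hypothetical).
    with open(path, "w") as fh:
        fh.write("candidate_1\ttrue\n")

with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "genome.fa.defalse")  # placeholder file name
    first = run_step(target, fake_major_step, salvage=True)    # nothing to reuse yet
    second = run_step(target, fake_major_step, salvage=True)   # previous output is reused
    third = run_step(target, fake_major_step, salvage=False)   # salvage off: always rerun

print(first, second, third)
```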

Minor change

Modify LTR.identifier.pl:
- Make necessary changes to implement the salvage mode.
- Add a fallback for zero-length boundaries ($tot_len = $seq_len) to fix oushujun/EDTA#564

Full Changelog: v3.0.2...v3.0.3


24 Mar 04:00

oushujun

v3.0.2

7236376

v3.0.2

New features

Added the K2P and p-distance models for divergence and age estimation. K2P is now the default model (#170, #184).
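
For reference, the two distance models can be sketched in Python as below. This illustrates the standard formulas, not the project's code. With P and Q the proportions of transitions and transversions between the two aligned LTRs, the p-distance is P + Q and the Kimura two-parameter (K2P) distance is -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)). The age conversion T = d / (2 * rate) and the rate value used here are assumptions of the example; use the rate appropriate for your species.

```python
import math

PURINES = {"A", "G"}

def count_differences(seq1: str, seq2: str):
    """Return (sites, transitions, transversions) over aligned columns where
    both sequences carry A, C, G, or T (gaps and ambiguity codes are skipped)."""
    sites = transitions = transversions = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        sites += 1
        if x == y:
            continue
        if (x in PURINES) == (y in PURINES):
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    return sites, transitions, transversions

def p_distance(sites: int, ts: int, tv: int) -> float:
    return (ts + tv) / sites

def k2p_distance(sites: int, ts: int, tv: int) -> float:
    p, q = ts / sites, tv / sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

def insertion_age(distance: float, rate: float) -> float:
    """Age in years, assuming both LTRs accumulate substitutions at `rate`
    per site per year since insertion (T = d / (2 * rate))."""
    return distance / (2 * rate)

ltr5 = "ACGTACGTACGTACGTACGT"
ltr3 = "GAGTACGTACGTACGTACGT"  # one transition (A>G) and one transversion (C>A)
counts = count_differences(ltr5, ltr3)
p_dist = p_distance(*counts)
k2p = k2p_distance(*counts)
age = insertion_age(k2p, rate=1.3e-8)  # illustrative rate, an assumption
print(counts, round(p_dist, 4), round(k2p, 4), round(age))
```

K2P corrects for multiple substitutions at the same site, so it is never smaller than the p-distance for the same alignment.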

Enhancements

Improved parameters for timeout blastn to avoid stalling (#167)
Added new code to search for solo LTRs and roughly intact LTR-RTs
Added a wrapper to calculate solo:intact LTR ratios from both LTR_retriever and EDTA results (oushujun/EDTA#279)

Bug fix

Added a script to recreate the retriever.scn.adj from the .defalse file to avoid inconsistencies.

Full Changelog: v3.0.1...v3.0.2


16 Aug 20:58

oushujun

v3.0.1

0451a90

v3.0.1 release

New feature

Add the -stop parameter to stop the program after a user-specified step. For example, if you only want to obtain the .defalse and .pass.list files, you can stop the program after the Major filtering step (i.e., -stop major). By default, it will finish the full pipeline.


13 Aug 16:27

oushujun

v3.0.0

a24b84b

v3.0.0 update

Bug fix

Update get_range.pl: fix the sequence ID recognition issue for LTRharvest inputs (#177)
Make sure candidates have sufficient flanking sequence to extend (50 bp)


08 Jan 04:40

oushujun

v2.9.9

4039eb7

v2.9.9 update

New feature

Enable strand-aware outputs

For LTR candidates on the negative strand, the locus is now presented 5' -> 3', as it is for candidates on the positive strand. For example, Chr1:7890..3456 indicates that the candidate is on the - strand. This information is shown in the first column of the pass.list, the last column of the gff3 file, and the sequence names of the intact.fa file. If the element is on the - strand, its sequence in the intact.fa file is shown 5' -> 3' on the negative strand. For example, the sequence of Chr1:7890..3456 is the reverse complement of the sequence of Chr1:3456..7890. For candidates without strand information (i.e., lacking coding sequence), the strand is assumed to be positive for convenience.
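
Downstream scripts can recover the strand from such a locus string by comparing the two coordinates. The Python sketch below illustrates this. The helper names are hypothetical, and the example treats coordinates as 1-based and inclusive, which is an assumption and not something these notes state.

```python
import re

LOCUS = re.compile(r"^(?P<chrom>.+):(?P<first>\d+)\.\.(?P<second>\d+)$")
COMPLEMENT = str.maketrans("ACGTacgtNn", "TGCAtgcaNn")

def parse_locus(locus: str):
    """Split 'Chr1:7890..3456' into (chrom, start, end, strand), with start <= end.
    A first coordinate larger than the second marks the - strand."""
    match = LOCUS.match(locus)
    if match is None:
        raise ValueError(f"unrecognized locus: {locus!r}")
    first, second = int(match["first"]), int(match["second"])
    strand = "-" if first > second else "+"
    return match["chrom"], min(first, second), max(first, second), strand

def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]

def extract(chrom_seq: str, locus: str) -> str:
    """Return the locus sequence 5' -> 3' on its own strand
    (coordinates assumed 1-based and inclusive)."""
    _, start, end, strand = parse_locus(locus)
    seq = chrom_seq[start - 1:end]
    return reverse_complement(seq) if strand == "-" else seq

parsed = parse_locus("Chr1:7890..3456")
toy = "AACCGGTTAC"                  # toy chromosome for illustration
plus_seq = extract(toy, "toy:2..5")
minus_seq = extract(toy, "toy:5..2")
print(parsed, plus_seq, minus_seq)
```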

Bug fix

Ensure candidates have sufficient flanking sequence to extend (default 50 bp), which is necessary for LTR_retriever to determine whether a candidate is true or false. Candidates that cannot satisfy this criterion are skipped. Such a scenario is most likely to occur in fragmented genomes. Bug report: oushujun/EDTA#263
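
The flanking requirement can be illustrated with a short Python check. This is only a sketch of the criterion as described above. The function name is hypothetical, and 1-based inclusive coordinates are assumed.

```python
def has_sufficient_flank(start: int, end: int, seq_len: int, flank: int = 50) -> bool:
    """True when at least `flank` bases exist on both sides of the candidate
    (coordinates assumed 1-based and inclusive)."""
    return start - flank >= 1 and end + flank <= seq_len

contig_len = 10_000
checks = [
    has_sufficient_flank(51, 9_950, contig_len),    # exactly 50 bp on each side
    has_sufficient_flank(30, 5_000, contig_len),    # too close to the contig start
    has_sufficient_flank(5_000, 9_960, contig_len), # too close to the contig end
]
print(checks)
```

Short contigs leave little room on either side of a candidate, which is why fragmented assemblies lose more candidates to this filter.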


28 Dec 03:06

oushujun

v2.9.8

b746912

v2.9.8 update

New features

Use the same LTR name for the INT and LTR parts of the same element, in preparation for solving oushujun/EDTA#251
Add the yml file for conda installation

Bug fix

Update get_range.pl

Fixed a bug introduced in Aug 2023 (#a375c5e) that output all candidates (both LTR retrotransposons and non-LTR repeats) when generating the library file. Because of this bug, non-LTR sequences could appear in the library (e.g., LTR/EnSpm-CACTA).
Fixed a bug introduced in May 2023 (#058ce29) that failed to remove masked sequences from the final library.
Removed RepeatMasker support to simplify the code, since this functionality was never used in the official release.



11 Jul 05:27

oushujun

v2.9.5

84ca5fc

Bug fix

Fix bug #153, which was introduced in v2.9.4 when TEsorter was added to classify LTR candidates.


08 May 20:03

oushujun

v2.9.4

058ce29

It just gets better with community efforts!

Major Updates

Add TEsorter to help identify non-LTR sequences. A candidate LTR is classified as "false" if it contains non-LTR HMM profile matches, even if the candidate contains LTR/TSD and the TGCA motif. This purging removes a small number of structurally intact LTR candidates (5/2304 in rice). This implementation offers a slight improvement over older versions, and the improvement should be more significant for larger genomes.

LTR_retriever-harvest_FINDER  sens   spec   accu   prec   FDR    F1
retriever_v2.5                0.967  0.920  0.931  0.789  0.211  0.869
retriever_v2.6                0.963  0.931  0.939  0.811  0.189  0.881
retriever_v2.9.2              0.966  0.926  0.935  0.802  0.198  0.876
retriever_v2.9.4              0.967  0.928  0.937  0.804  0.196  0.878
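
As a reading aid for the table, the last two columns follow from the others under the standard definitions, which are assumed here: FDR = 1 - precision, and F1 is the harmonic mean of sensitivity and precision. The Python sketch below recomputes both from the reported sens and prec values. Because the inputs are rounded to three decimals, agreement is only expected to within rounding.

```python
# (sens, prec, FDR, F1) copied from the table above
reported = {
    "retriever_v2.5":   (0.967, 0.789, 0.211, 0.869),
    "retriever_v2.6":   (0.963, 0.811, 0.189, 0.881),
    "retriever_v2.9.2": (0.966, 0.802, 0.198, 0.876),
    "retriever_v2.9.4": (0.967, 0.804, 0.196, 0.878),
}

def false_discovery_rate(prec: float) -> float:
    return 1.0 - prec

def f1_score(sens: float, prec: float) -> float:
    return 2 * sens * prec / (sens + prec)

recomputed = {
    name: (false_discovery_rate(prec), f1_score(sens, prec))
    for name, (sens, prec, _, _) in reported.items()
}
for name, (fdr, f1) in recomputed.items():
    print(f"{name}\tFDR={fdr:.3f}\tF1={f1:.4f}")
```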

Add more filtering parameters to identify solo LTRs, improve the solo-intact ratio calculation (#111, #110).
Resolve RMblast errors that occur when it attempts to overutilize CPUs (#137).

Other improvements

Sequence IDs must now be 13 characters or fewer, to accommodate huge chromosomes up to 999 Mb in length.
Add missing TRF parameter (#133)
Add check to ensure the input genome is writable (LTR_retriever won't overwrite your genome) (#125).
Remove gap length for genome size calculation.
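
To check a genome against the sequence ID length limit before a run, something like the following Python sketch could be used. The function name is hypothetical, and taking the ID to be the first whitespace-delimited token of each FASTA header is an assumption of the example.

```python
MAX_ID_LEN = 13  # limit stated in these release notes

def overlong_ids(fasta_lines, max_len=MAX_ID_LEN):
    """Return sequence IDs (first token of each FASTA header) longer than max_len."""
    bad = []
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            if len(seq_id) > max_len:
                bad.append(seq_id)
    return bad

example = [
    ">Chr1 assembled chromosome",
    "ACGT",
    ">scaffold_000123456",
    "ACGT",
    ">Chr10",
    "ACGT",
]
too_long = overlong_ids(example)
print(too_long)
```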

Acknowledgements

Andreas Wallberg, @Shokusei, Evan Ernst, @xie-wei-hh, @with9, and users like YOU!

Contributors

with9, xie-wei-hh, and Shokusei


28 Jul 18:18

oushujun

v2.9.0

0c4d1fa

Version 2.9.0: Polishing outputs

Major updates

This version has many improvements in the downstream outputs including:

standardized the GFF3 output following these criteria and used the updated TE-related sequence ontologies
combined structural and homological LTR annotations. Homology-based LTR fragments will be replaced by structural-based LTR annotations wherever applicable.

Other improvements

allow users to provide paths to dependencies in the command-line.
updated readme
fixed a number of minor bugs.

