aDNA-trim produces invalid merged FASTQ files in some cases #3

@shyama-mama

Hi Guys,

I am using aDNA-trim as follows:
seqtk mergepe R1.fastq R2.fastq | adna-trim -p aDNA_trim_pe - > aDNA_trim_merged.fastq
The data is a NovaSeq sample pre-processed with fastp to trim polyG tails.

This is the original read pair:
R1

@A00488:28:HJ3THDSXX:2:1146:24189:25441 1:N:0:ACCAACT
TCCAGAGTTATTGCTGTGATACAGGCAGAGATGCTATAACTGAGTTTGTATTCTAGGGGGGGGGGGCCGATGTTAACGGGGAAAAGATAAAAATTTAACTTAATTGATACAGTGATATTAAATACGGACGAGCACACGACTAAC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFF:FFFFFFFFFFFF:,,,,:F,F,:,FF,:FF,,:,F,,FF:,,F,,FF,,:F,:,,,,F,,::,,,F:,,:FF,F,,F,F,F,F,F,::F,

R2

@A00488:28:HJ3THDSXX:2:1146:24189:25441 2:N:0:ACCAACT
CCTGTATCACTAAAGTTACATTATTATCTTTTCCCTGTTAACGTCGGGGGGGGCGGGGGGGGTGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGAGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
+
FFFFFFFFFFFFFFFF,,FFFFFFF,FFFFFFFFF,F,,,,:,,,,F,F,:,,,,,,::FFF,::,,,:FF::,:F:FFF:,:FF:FF,FFF:,F:F,F,,::,,,FFF,FF,::,:,:,F:,FFFF,FF,,:F:FFF:F:FF:

Running aDNA-trim on the FASTQ files directly does not merge the reads, so I ran fastp to trim the polyG tail. This is the modified read pair:
R1

@A00488:28:HJ3THDSXX:2:1146:24189:25441 1:N:0:ACCAACT
TCCAGAGTTATTGCTGTGATACAGGCAGAGATGCTATAACTGAGTTTGTATTCTAGGGGGGGGGGGCCGATGTTAACGGGGAAAAGATAAAAATTTAACTTAATTGATACAGTGATATTAAATACGGACGAGCACACGACTAAC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFF:FFFFFFFFFFFF:,,,,:F,F,:,FF,:FF,,:,F,,FF:,,F,,FF,,:F,:,,,,F,,::,,,F:,,:FF,F,,F,F,F,F,F,::F,

R2

@A00488:28:HJ3THDSXX:2:1146:24189:25441 2:N:0:ACCAACT
CCTGTATCACTAAAGTTACATTATTATCTTTTCCCTGTTAAC
+
FFFFFFFFFFFFFFFF,,FFFFFFF,FFFFFFFFF,F,,,,:

Running aDNA-trim on the trimmed pair produces the following invalid record, in which the sequence line contains quality characters rather than bases:

@A00488:28:HJ3THDSXX:2:1146:24189:25441_22:*
,F,::F,
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFF:FFFFFFFFFFFF:,,,,:F,F,F!FFFFFFFFF,FF;FFF;,FFFFF;F-FFFFFFFF;FFFFFFFFFFF:FFFFFFFFFFFF:,,,,:F,F,:,FF,:FF,,:,F,,FF:,,F,,FF,,:F,:,,,,F,,::,,,F:,,:FF,F,,F,F,F,F,F,::F,
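
For anyone who wants to scan their own output for records like this, here is a minimal Python sketch of a FASTQ sanity check. It is not part of adna-trim or seqtk; the function names `read_fastq` and `check_record` are made up for illustration, and it assumes plain four-line records (no wrapped sequences).

```python
from typing import Iterable, Iterator, List, Tuple

# Characters accepted on the sequence line (an assumption: IUPAC
# ambiguity codes other than N are treated as invalid here).
VALID_BASES = set("ACGTNacgtn")


def read_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str, str]]:
    """Yield (header, sequence, separator, quality) from 4-line records."""
    buf: List[str] = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line and not buf:
            continue  # skip blank lines between records
        buf.append(line)
        if len(buf) == 4:
            yield buf[0], buf[1], buf[2], buf[3]
            buf = []


def check_record(header: str, seq: str, sep: str, qual: str) -> List[str]:
    """Return a list of human-readable problems; empty if the record looks fine."""
    problems = []
    if not header.startswith("@"):
        problems.append("header does not start with '@'")
    if not sep.startswith("+"):
        problems.append("separator line does not start with '+'")
    if len(seq) != len(qual):
        problems.append(
            f"sequence length {len(seq)} != quality length {len(qual)}"
        )
    bad = sorted(set(seq) - VALID_BASES)
    if bad:
        problems.append("non-nucleotide characters in sequence: " + "".join(bad))
    return problems


# Example use on a file (path is whatever your merged output is called):
# with open("aDNA_trim_merged.fastq") as fh:
#     for rec in read_fastq(fh):
#         for problem in check_record(*rec):
#             print(rec[0], problem)
```

Run on the record above, this would flag both the length mismatch between the sequence and quality lines and the non-nucleotide characters on the sequence line.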

This pair should merge into a valid read. For comparison, this is the merged read that fastp produces:

@A00488:28:HJ3THDSXX:2:1146:24189:25441 1:N:0:ACCAACT merged_113_0
TCCAGAGTTATTGCTGTGATACAGGCAGAGATGCTATAACTGAGTTTGTATTCTAGGGGGGGGGGGCCGATGTTAACGGGGAAAAGATAATAATGTAACTTTATTGATACAGG
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFF:FFFFFFFFFFFF:,,,,:F,F,:,FF,:FF,,:,F,FFF:F,F,,FFF,:F,:,,,,FF
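
To check independently that the pair overlaps, one can reverse-complement R2 and slide it along R1 while counting mismatches. The Python sketch below does only that. It is not the algorithm used by adna-trim or fastp, the names `revcomp` and `best_overlap` are hypothetical, and the `min_overlap` and `max_mismatch_frac` defaults are arbitrary choices, not either tool's settings. It also only handles the case where the reverse-complemented R2 starts inside R1.

```python
from typing import Optional, Tuple

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def revcomp(seq: str) -> str:
    """Reverse-complement a DNA sequence (A/C/G/T/N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def best_overlap(
    r1: str,
    r2: str,
    min_overlap: int = 11,           # arbitrary illustrative default
    max_mismatch_frac: float = 0.2,  # arbitrary illustrative default
) -> Optional[Tuple[int, int, int]]:
    """Find where revcomp(r2) best aligns inside r1, without gaps.

    Returns (offset_in_r1, overlap_length, mismatches) for the offset
    with the lowest mismatch fraction, or None if no offset passes the
    thresholds. Ties go to the smallest offset.
    """
    rc = revcomp(r2)
    best = None
    best_frac = 1.0
    for offset in range(len(r1)):
        # Overlap is limited by the end of r1 and by the length of rc.
        n = min(len(rc), len(r1) - offset)
        if n < min_overlap:
            break  # n only shrinks from here on
        mismatches = sum(a != b for a, b in zip(r1[offset:offset + n], rc))
        frac = mismatches / n
        if frac <= max_mismatch_frac and (best is None or frac < best_frac):
            best, best_frac = (offset, n, mismatches), frac
    return best


# Small synthetic example: r2 is the reverse complement of r1[6:16].
print(best_overlap("ACGGTCATGCATTGCA", "TGCAATGCAT", min_overlap=8))
```

Passing the R1 sequence and the fastp-trimmed R2 sequence from this issue to `best_overlap` should show whether the two reads overlap within the chosen mismatch tolerance; the low-quality tail of R1 means some mismatches are expected.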
