Ignore Ns that might have slipped through when updating PRGs #58

@leoisl

Description

When running the 4-way pipeline to update the E. coli PRG with Illumina data, I got this error:

Traceback (most recent call last):
  File "/usr/local/bin/make_prg", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/site-packages/make_prg/__main__.py", line 94, in main
    args.func(args)
  File "/usr/local/lib/python3.9/site-packages/make_prg/subcommands/update.py", line 182, in run
    denovo_variants_db = DenovoVariantsDB(
  File "/usr/local/lib/python3.9/site-packages/make_prg/update/denovo_variants.py", line 548, in __init__
    locus_name_to_denovo_loci = self._get_locus_name_to_denovo_loci()
  File "/usr/local/lib/python3.9/site-packages/make_prg/update/denovo_variants.py", line 532, in _get_locus_name_to_denovo_loci
    return self._get_locus_name_to_denovo_loci_core(filehandler)
  File "/usr/local/lib/python3.9/site-packages/make_prg/update/denovo_variants.py", line 522, in _get_locus_name_to_denovo_loci_core
    variants = self._read_variants(
  File "/usr/local/lib/python3.9/site-packages/make_prg/update/denovo_variants.py", line 495, in _read_variants
    denovo_variant = cls._read_DenovoVariant(
  File "/usr/local/lib/python3.9/site-packages/make_prg/update/denovo_variants.py", line 477, in _read_DenovoVariant
    denovo_variant = DenovoVariant(
  File "/usr/local/lib/python3.9/site-packages/make_prg/update/denovo_variants.py", line 41, in __init__
    DenovoVariant._param_checking(
  File "/usr/local/lib/python3.9/site-packages/make_prg/update/denovo_variants.py", line 59, in _param_checking
    DenovoVariant._check_sequence_is_composed_of_ACGT_only(alt)
  File "/usr/local/lib/python3.9/site-packages/make_prg/update/denovo_variants.py", line 85, in _check_sequence_is_composed_of_ACGT_only
    raise DenovoError(f"Found a non-ACGT seq ({seq}) in a denovo variant")
make_prg.update.denovo_variants.DenovoError: Found a non-ACGT seq (N) in a denovo variant

Of the 36416 new variants found, only a single one contains an N. I'd much prefer to simply issue a warning here:

raise DenovoError(f"Found a non-ACGT seq ({seq}) in a denovo variant")
rather than erroring out and being unable to update the PRG.
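
To illustrate the requested behaviour, here is a minimal standalone sketch of "warn and skip" instead of "raise". It is not make_prg's code: `is_acgt_only`, `filter_acgt_variants`, and the `(ref, alt)` tuple representation are hypothetical names chosen for this example, and it assumes sequences are uppercase.

```python
import logging
from typing import Iterable, List, Tuple

logger = logging.getLogger(__name__)

# Assumption: only uppercase A, C, G, T are considered valid bases.
ACGT = frozenset("ACGT")


def is_acgt_only(seq: str) -> bool:
    """Return True if every character of seq is A, C, G or T.

    An empty sequence (e.g. one side of an indel) is treated as valid.
    """
    return set(seq) <= ACGT


def filter_acgt_variants(
    variants: Iterable[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Keep (ref, alt) pairs made of ACGT only; warn about and skip the rest.

    Hypothetical helper: a real fix would live wherever the variants are
    parsed, so that one bad record does not abort the whole update.
    """
    kept = []
    for ref, alt in variants:
        if is_acgt_only(ref) and is_acgt_only(alt):
            kept.append((ref, alt))
        else:
            logger.warning(
                "Skipping denovo variant with a non-ACGT seq (ref=%s, alt=%s)",
                ref,
                alt,
            )
    return kept
```

With this approach, a single variant containing an N would be logged and dropped while the remaining variants are still applied.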
