GTF parsing assumes gene_id and gene_name #33

@JosephLalli

I'm encountering issues using a GTF file downloaded from NCBI. While the GTF specification is a nightmare, from what I can tell NCBI treats gene_id as required but gene_name as optional. They use the attribute "gene" for the equivalent piece of information.

Because g2gtools assumes any line without a gene_name is invalid, it is unable to convert any of the NCBI GTF files I am providing.

Is it possible to remove the gene_name requirement? I understand it is useful when working with Ensembl GTF files, so maybe retain it as an optional piece of data to track.
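
To make the request concrete, here is a rough sketch of the fallback I have in mind. It is not g2gtools' actual parsing code: the function names `parse_gtf_attributes` and `gene_identifiers` are hypothetical, and it assumes attributes in the ninth column are written as `key "value";` pairs. gene_id stays mandatory, while the name is taken from gene_name if present, then from gene, and is otherwise left empty.

```python
import re

# Matches key "value" pairs in the GTF attributes column (column 9).
# Assumption: values are double-quoted; unquoted values are ignored here.
_ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"')


def parse_gtf_attributes(attr_field):
    """Return a dict of quoted attributes from the ninth GTF column."""
    attrs = {}
    for key, value in _ATTR_RE.findall(attr_field):
        # Keep the first occurrence if a key appears more than once.
        attrs.setdefault(key, value)
    return attrs


def gene_identifiers(attr_field):
    """Return (gene_id, gene_name); gene_name is None when absent."""
    attrs = parse_gtf_attributes(attr_field)
    gene_id = attrs.get("gene_id")
    if gene_id is None:
        # gene_id is still treated as required.
        raise ValueError("GTF record has no gene_id attribute")
    # Prefer the Ensembl-style key, fall back to the NCBI-style key.
    gene_name = attrs.get("gene_name") or attrs.get("gene")
    return gene_id, gene_name
```

With something like this, a record carrying only `gene_id "X"; gene "Y";` would be accepted, and a record with neither gene_name nor gene would still convert, just without a name.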

(NCBI source: ncbi.nlm.nih.gov/genbank/genomes_gff)
