queries fail for some uniprot accessions #128

@ftwkoopmans

Some UniProt accessions can neither be queried nor are they returned in the "uniprot" field/scope. To illustrate, I've included 2 examples: one accession that works (P63044) and one that fails (P23819).

This works via https://mygene.info/v3/api#/query/get_query :
"q" input: P63044
"fields" input: symbol,name,taxid,entrezgene,uniprot

It returns:

{
  "took": 16,
  "total": 1,
  "max_score": 17.406927,
  "hits": [
    {
      "_id": "22318",
      "_score": 17.406927,
      "entrezgene": "22318",
      "name": "vesicle-associated membrane protein 2",
      "symbol": "Vamp2",
      "taxid": 10090,
      "uniprot": {
        "Swiss-Prot": "P63044",
        "TrEMBL": "Q8CHR4"
      }
    }
  ]
}

This query also returns a hit via https://mygene.info/v3/api#/query/get_query :
"q" input: P23819
"fields" input: symbol,name,taxid,entrezgene,uniprot

It returns:

{
  "took": 13,
  "total": 1,
  "max_score": 7.8478303,
  "hits": [
    {
      "_id": "14800",
      "_score": 7.8478303,
      "entrezgene": "14800",
      "name": "glutamate receptor, ionotropic, AMPA2 (alpha 2)",
      "symbol": "Gria2",
      "taxid": 10090,
      "uniprot": {
        "TrEMBL": "Q4LG64"
      }
    }
  ]
}

However, note that for the latter query, the UniProt accession I queried (a Swiss-Prot record) is not included in the "uniprot" output field! So there seems to be a problem with the mygene.info database: possibly a subset of UniProt accessions is not stored or linked under "uniprot". Affected accessions include P23819, Q61941, and Q8VHW2.
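The check described above can be sketched in Python. This is only a minimal illustration: the helper names `uniprot_accessions` and `query_missing_from_output` are made up for this example, the two responses are copied from the GET results above and trimmed to the relevant fields, and the handling of list values is an assumption (I believe a "uniprot" subfield can hold a list when a gene has several accessions, but the responses shown here only contain strings).

```python
# Responses copied from the two GET queries above, trimmed to the relevant fields.
RESPONSE_P63044 = {
    "hits": [
        {"_id": "22318", "uniprot": {"Swiss-Prot": "P63044", "TrEMBL": "Q8CHR4"}}
    ]
}
RESPONSE_P23819 = {
    "hits": [
        {"_id": "14800", "uniprot": {"TrEMBL": "Q4LG64"}}
    ]
}


def uniprot_accessions(hit):
    """Collect every accession listed under a hit's "uniprot" field."""
    found = set()
    for value in hit.get("uniprot", {}).values():
        if isinstance(value, str):
            found.add(value)
        else:
            # Assumption: a subfield may be a list of accessions.
            found.update(value)
    return found


def query_missing_from_output(accession, response):
    """Return True if no hit lists the queried accession under "uniprot"."""
    return all(
        accession not in uniprot_accessions(hit)
        for hit in response.get("hits", [])
    )


if __name__ == "__main__":
    print("P63044 missing:", query_missing_from_output("P63044", RESPONSE_P63044))
    print("P23819 missing:", query_missing_from_output("P23819", RESPONSE_P23819))
```

Running the same check over a larger list of Swiss-Prot accessions might help estimate how many records are affected.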

Furthermore, POST queries against these accessions fail even though they should not (probably the same root cause).

This works via https://mygene.info/v3/api#/query/post_query :
{ "q": "P63044", "scopes": "uniprot" }
It returns:

[
  {
    "query": "P63044",
    "_id": "22318",
    "_score": 16.7524,
    "entrezgene": "22318",
    "name": "vesicle-associated membrane protein 2",
    "symbol": "Vamp2",
    "taxid": 10090
  }
]

This query fails, but it should not, as this is a valid UniProt accession that is in the mygene.info dataset (see the GET query above):
{ "q": "P23819", "scopes": "uniprot" }
It returns:

[
  {
    "query": "P23819",
    "notfound": true
  }
]
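To reproduce the POST behaviour outside the Swagger page, something like the following sketch could be used. It rests on assumptions not stated in this issue: that the underlying endpoint is https://mygene.info/v3/query and that it accepts a form-encoded body with `q` and `scopes`. The helper names `build_post_request` and `not_found` are hypothetical. The request is only built, not sent, and the sample results are trimmed from the two POST responses above.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the endpoint behind the Swagger "post_query" operation.
QUERY_URL = "https://mygene.info/v3/query"


def build_post_request(accessions, scopes="uniprot"):
    """Build (but do not send) a batch POST query for the given accessions."""
    body = urlencode({"q": ",".join(accessions), "scopes": scopes})
    return Request(
        QUERY_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def not_found(results):
    """Return the queried terms that the service flagged as "notfound"."""
    return [entry["query"] for entry in results if entry.get("notfound")]


# Results trimmed from the two POST responses shown above.
SAMPLE_RESULTS = [
    {"query": "P63044", "_id": "22318", "symbol": "Vamp2", "taxid": 10090},
    {"query": "P23819", "notfound": True},
]

if __name__ == "__main__":
    request = build_post_request(["P63044", "P23819"])
    print(request.get_method(), request.full_url, request.data)
    print("not found:", not_found(SAMPLE_RESULTS))
```

Passing the request to `urllib.request.urlopen` and decoding the JSON body would give a list in the same shape as `SAMPLE_RESULTS`, assuming the endpoint behaves as described.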
