biocommons/anyvar


AnyVar

AnyVar provides Python and REST interfaces to validate, normalize, generate identifiers for, and register biological sequence variation according to the GA4GH Variation Representation Specification (VRS).

Known Issues

You are encouraged to browse issues, where all known issues are listed. Please report any new issues you find.

Quick Start

  1. Clone the AnyVar repository:

    git clone https://github.com/biocommons/anyvar
    cd anyvar
  2. If desired, create/start a database for AnyVar to use. See Optional Dependencies - Databases for detailed instructions on how to set up a database. Otherwise, set the ANYVAR_STORAGE_URI environment variable to null to use AnyVar without a database.

  3. Configure required dependencies:

    AnyVar has several required dependencies and a few optional ones. See Setting up Dependencies for detailed instructions.

  4. Start the AnyVar server:

    uvicorn anyvar.restapi.main:app --reload
  5. Visit http://localhost:8000 to verify the REST API is running.
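The check in step 5 can also be scripted. The following is a minimal sketch and not part of AnyVar: the helper name `server_is_up` is hypothetical, and it assumes the server is listening at http://localhost:8000 as in step 5. It treats any successful HTTP response as a sign that the server is up.

```python
import urllib.error
import urllib.request


def server_is_up(base_url: str = "http://localhost:8000", timeout: float = 5.0) -> bool:
    """Return True if an HTTP GET on base_url succeeds (hypothetical helper)."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

Calling `server_is_up()` after starting uvicorn should return `True` once the server has finished starting; it returns `False` if nothing is listening on that address.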

Setting up Dependencies

Required Dependencies

SeqRepo

SeqRepo stores biological sequence data and can be accessed locally or via REST API. Read how to set up SeqRepo locally or through Docker.

UTA

UTA (Universal Transcript Archive) stores transcripts aligned to sequence references. Read how to set up UTA locally or via Docker.

Optional Dependencies - Databases

AnyVar optionally supports several storage types. For general information about AnyVar's SQL storage options, see the SQL Storage documentation. See below for more specific details on the various storage implementation options.

It is also possible to run AnyVar with no database. This is primarily useful for bulk annotations, such as annotating a VCF, where there is no real need to reuse previously computed VRS IDs. To run AnyVar with no database, set the ANYVAR_STORAGE_URI environment variable to null.

PostgreSQL (Optional)

AnyVar supports PostgreSQL databases. Configure PostgreSQL for AnyVar.

Snowflake (Optional)

AnyVar can also use Snowflake. Detailed instructions are available.

Asynchronous Operations

AnyVar supports asynchronous VCF annotation for improved scalability. See the asynchronous operations README.

Developers

This section is intended for developers who contribute to AnyVar.

Prerequisites

  • Python >= 3.11
    • Note: Python 3.11 is required for developers contributing to AnyVar.
  • Docker

Installing for development

git clone https://github.com/biocommons/anyvar.git
cd anyvar
make devready
source venv/3.11/bin/activate
pre-commit install

Testing

Run tests:

  1. Set up a database for testing. The default is a PostgreSQL database, which you can set up by following the instructions in src/docs/postgres.md.

  2. Follow the Quick Start guide to get AnyVar running.

  3. If you haven't run make devready before, open a new terminal and do so now. In either case, activate your venv by running: source venv/3.11/bin/activate

  4. Within your venv, run make testready if you've never done so before. Otherwise, skip this step.

  5. Within your venv, export the following environment variables. (Note: if you ever deactivate your venv, you'll need to export all of these again)

    • SEQREPO_DATAPROXY_URI - See the quickstart guide above.
    • ANYVAR_STORAGE_URI - See the quickstart guide above.
    • ANYVAR_TEST_STORAGE_URI - This specifies the database to use for tests. If you set up a PostgreSQL database by following the guide referenced in step 1, you can copy and paste the example ANYVAR_TEST_STORAGE_URI found below.

    For example:

    export ANYVAR_TEST_STORAGE_URI=postgresql://postgres:postgres@localhost/anyvar_test
    export ANYVAR_STORAGE_URI=postgresql://anyvar:anyvar-pw@localhost:5432/anyvar
    export SEQREPO_DATAPROXY_URI=seqrepo+file:///usr/local/share/seqrepo/latest
  6. Finally, run tests with the following command:

    make test
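Before running make test, it can help to confirm that the variables from step 5 are actually set in the current shell. The following is a minimal sketch and not part of AnyVar's test tooling: the function name `check_test_env` is hypothetical, and the "looks like a URI" rule (a non-empty scheme, or the value null) is an assumption made for illustration.

```python
import os
from urllib.parse import urlsplit

REQUIRED_VARS = (
    "SEQREPO_DATAPROXY_URI",
    "ANYVAR_STORAGE_URI",
    "ANYVAR_TEST_STORAGE_URI",
)


def check_test_env(env=None) -> list[str]:
    """Return a list of problems with the test environment (empty if none found)."""
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name, "")
        if not value:
            problems.append(f"{name} is not set")
        elif value != "null" and not urlsplit(value).scheme:
            # Assumption: every valid value is either "null" or has a URI scheme.
            problems.append(f"{name} does not look like a URI: {value!r}")
    return problems


if __name__ == "__main__":
    for problem in check_test_env():
        print(problem)
```

With the example exports from step 5 in place, the script prints nothing; otherwise it prints one line per missing or malformed variable.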

Notes

Currently, there is some interdependency between test modules: tests that read data from storage assume that the data from test_variation has already been uploaded. A pytest hook ensures the correct test order, but some test modules may not pass when run in isolation.

By default, the tests use a PostgreSQL database. To run the tests against a Snowflake database, set ANYVAR_TEST_STORAGE_URI to a Snowflake URI and run the tests.

For the tests/test_vcf::test_vcf_registration_async unit test to pass, Celery needs a real broker and backend to interact with. Set the CELERY_BROKER_URL and CELERY_BACKEND_URL environment variables. The simplest solution is to run Redis locally and use it for both the broker and the backend, e.g.:

export CELERY_BROKER_URL="redis://"
export CELERY_BACKEND_URL="redis://"
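If the asynchronous test fails, one quick thing to rule out is that nothing is listening where Celery expects Redis. The following is a minimal sketch under the assumption that "redis://" resolves to Redis's default address, localhost on port 6379. The helper name `redis_reachable` is hypothetical, and it only confirms that a TCP connection can be opened, not that the listener is actually Redis.

```python
import socket


def redis_reachable(host: str = "localhost", port: int = 6379, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, or host lookup failure.
        return False
```

A `False` result suggests starting Redis locally before rerunning the test.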

Logging

AnyVar uses Python's built-in logging module. To customize logging settings, see docs/logging.md.

About

[in development] Proof-of-Concept variation translation, validation, and registration service
