LapDevelopment_Local

StephanOepen edited this page Jan 22, 2016 · 20 revisions

Download Lap

Check out the LAP source code from http://svn.emmtee.net/lap/trunk to somewhere sensible. If your setup differs from the one described here, some of the steps below will have to change accordingly.

Bash setup

Add the following to your .bashrc and re-source it (or start a fresh shell):

export LAPTREE=/path/to/lap/tree
export LAPLIBRARY=/path/to/lap/library

### Activate this if you want to run galaxy in lappython only
#. $LAPTREE/etc/dot.bashrc

### Address of the MongoDB instance on the local host
export LAPSTORE=mongodb://127.0.0.1:27017/lapstore
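
A quick sanity check of these variables can save some confusion later on. The following is a minimal sketch, not part of LAP itself: the function name check_lap_environment is made up for illustration, and it assumes only what the exports above state, namely that LAPTREE and LAPLIBRARY are directories and that LAPSTORE is a mongodb:// URI.

```python
import os
from urllib.parse import urlparse

# Variable names taken from the .bashrc snippet above.
REQUIRED = ("LAPTREE", "LAPLIBRARY", "LAPSTORE")


def check_lap_environment(env=None):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper for illustration, not part of the LAP code base.
    """
    env = os.environ if env is None else env
    problems = []

    # Every variable must be set to a non-empty value.
    for name in REQUIRED:
        if not env.get(name):
            problems.append(f"{name} is not set")

    # The two path variables should point at existing directories.
    for name in ("LAPTREE", "LAPLIBRARY"):
        path = env.get(name)
        if path and not os.path.isdir(path):
            problems.append(f"{name} does not point to a directory: {path}")

    # LAPSTORE should look like mongodb://host[:port]/database.
    store = env.get("LAPSTORE")
    if store:
        parsed = urlparse(store)
        if parsed.scheme != "mongodb" or not parsed.hostname:
            problems.append(f"LAPSTORE is not a mongodb:// URI: {store}")

    return problems


if __name__ == "__main__":
    for problem in check_lap_environment():
        print(problem)
```

Run it in a fresh shell after re-sourcing .bashrc; no output means all three variables look plausible.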

MongoDB

Some notes on how MongoDB is configured on the LAP servers are available on the LapDevelopment/MongoDB page.

  • Install MongoDB via your package manager (or from source, whatever floats your boat); on Ubuntu, the package is simply called mongodb.

  • Create a directory for MongoDB's files. I use ~/mongodb

  • Start the server: mongod --dbpath ~/mongodb. If disk space is at a premium (by default, MongoDB wants to create 3GB of stuff initially) the --smallfiles option is your friend.

  • You can start the server as a daemon with the following command: mongod --fork --logpath mongodb.log --dbpath ~/mongodb
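
To confirm that the server came up, it is enough to check that something accepts connections on the address used in LAPSTORE. The sketch below uses only the Python standard library; the function name mongod_is_listening is made up for illustration, and the default host and port are the ones from the LAPSTORE example above. Note that it only tests for a TCP listener and does not verify that the listener is actually MongoDB.

```python
import socket


def mongod_is_listening(host="127.0.0.1", port=27017, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper: this only shows that *something* is listening,
    not that it speaks the MongoDB protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, host unreachable, ...
        return False


if __name__ == "__main__":
    print("listening" if mongod_is_listening() else "not reachable")
```

If this reports that the server is not reachable right after starting it with --fork, the log file given with --logpath is the first place to look.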

Galaxy

We also need a clean Galaxy, as the production instance has some changes to make things work nicely with Abel and such.

This assumes you install Galaxy side by side with the production instance (that is, in the root of the SVN checkout). If you want something else, the file manipulation commands below will have to change accordingly.

  • Check out the appropriate revision of Galaxy: hg clone -r 5c789ab4144a http://bitbucket.org/galaxy/galaxy-dist

  • Copy the tool config from the production instance to your checkout: cp trunk/development/galaxy/tool_conf.xml* galaxy-dist/

  • Remove the default tools: rm -r galaxy-dist/tools

  • Symlink in the LAP tools: ln -s trunk/tools galaxy-dist/tools

  • In the galaxy-dist directory, run the file run.sh

  git clone https://github.com/galaxyproject/galaxy/
  cd galaxy
  git checkout release_15.03

On Debian Sid the first run fails with the following message:

WebError 0.8a couldn't be downloaded automatically.  You can try
building it by hand with:
  python scripts/scramble.py -e WebError
Fetch failed.

  • Run the indicated command: python scripts/scramble.py -e WebError

  • Run run.sh again

On Ubuntu 14.04, the first run of run.sh fails when downloading eggs. This seems to be a version conflict between the system Python's version of some library and what Galaxy wants. It can be fixed by doing the first invocation in a virtualenv:

  • Make sure virtualenv is installed: sudo apt-get install python-virtualenv

  • Set up a virtualenv: virtualenv --no-site-packages galaxy_env

  • Activate it: . galaxy_env/bin/activate

  • Run run.sh again

The server should now start, and subsequent runs should not require the virtualenv.

ToDo And what about our custom data types (oe; 14-jan-16)?

Test Suite

Relevant parts of the repository:

trunk/library/python/lap/test.py
/home/emanuel/work/lap/trunk/tree/tests/function/{eng.t|eng.txt|...}

Before committing changes, developers must make sure that all tests pass. To run all tests, from the top level trunk directory, run:

make

Each test in trunk/tree/tests/function/ runs a workflow. To create a new test:

touch tree/tests/function/{example.t,example.txt}

First we need to populate example.txt with some text to process (in the appropriate language). Then we can write the actual test in example.t.

Say that we have just implemented a new POS tagger, hunpos, and we want to make sure that it plays nicely with the rest of the tools in LAP. A good test workflow first runs all the preprocessing tools needed by the POS tagger, then a tool that depends on it, and finally an export tool, so that we can check that we are getting sane output.

The file example.t will look like this:

from lap.test import TestContext
from lap.utils import laptree

# Notice how the parameter of the TestContext()
# object is equal to the number of checks:
# 6 for one check_python() call and five check_tool() calls.
with TestContext(6) as ctx:
    # the check tool function returns a LAP receipt 
    # that is then used as input for the next processing step
    upload = ctx.check_python('import/lap/text.py', 
                              [laptree('tests/function/eng.txt'), None])
    segmented = ctx.check_tool('nltk', 
                               upload, 
                               __process__='punkt')
    repp = ctx.check_tool('repp', 
                          segmented, 
                          segmenter="nltk_punkt", 
                          style="ptb")
    tagged = ctx.check_tool('hunpos', 
                            repp, 
                            model='eng_wsj.model',
                            segmenter='nltk_punkt', 
                            tokenizer='repp')
    parsed = ctx.check_tool('maltparser', 
                            tagged, 
                            segmenter="nltk_punkt",
                            tokenizer="repp", 
                            pos="hunpos", 
                            model="bm_sp_opt.mco")
    ctx.check_tool('export', 
                   parsed, 
                   __process__='tsv', 
                   sentence='any', 
                   token='any', 
                   format='CoNLL-X')

Notice how the parameter of the TestContext() object is equal to the number of checks: 6 for one check_python() call and five check_tool() calls. Also note that these calls return LAP receipts, which are then used as input for downstream tools.

We can now run make from the trunk directory and the test will be run together with the rest of the tests in trunk/tree/tests/function/. However, when debugging we should run the verbose version of the tests, which prints all output (stdout, stderr, receipts and exported files) to stdout.

Running the verbose version of example.t from trunk/:

LAP_TESTS_VERBOSE=1 tree/python/lap/python tree/tests/function/example.t
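
Verbose output can get long, so it is sometimes convenient to capture it in a file. The following sketch wraps the command above with the Python standard library; the function names and the log file handling are assumptions made for illustration, while the interpreter path and the LAP_TESTS_VERBOSE variable are taken from the command line above.

```python
import os
import subprocess


def verbose_test_command(test_file, interpreter="tree/python/lap/python"):
    """Build the command and environment for a verbose test run.

    Hypothetical helper; mirrors the shell command shown above.
    """
    env = dict(os.environ, LAP_TESTS_VERBOSE="1")
    return [interpreter, test_file], env


def run_verbose(test_file, log_path, cwd="."):
    """Run one test verbosely from cwd (expected to be trunk/).

    stdout and stderr are both written to log_path; the exit code of
    the test process is returned.
    """
    cmd, env = verbose_test_command(test_file)
    with open(log_path, "w") as log:
        return subprocess.call(
            cmd, env=env, cwd=cwd, stdout=log, stderr=subprocess.STDOUT
        )
```

For example, run_verbose('tree/tests/function/example.t', 'example.log') called from trunk/ would leave the complete output in example.log.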