IndraTreebanking

DavidMoeljadi edited this page Jul 21, 2017 · 17 revisions

The following are step-by-step instructions for INDRA treebanking:

1. Make a testsuite

General guidelines and formatting: http://compling.hss.ntu.edu.sg/courses/hg7021/testsuites.html

2. The testsuite should be placed in ~/ind/tsdb/skeletons

3. Add the testsuite information to ~/ind/tsdb/skeletons/index.lisp

At the end of index.lisp, add: ((:path . "nameoftestsuite") (:content . "explanation"))

4. Make a shortcut to the ind folder in ~/logon/ntu

5. Make a shortcut to the skeletons folder in ~/logon/lingo/lkb/src/tsdb/skeletons

Rename that shortcut to ind
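Steps 4 and 5 can also be done from the command line. The following is a minimal sketch, assuming that "shortcut" means a symbolic link, that the grammar lives in ~/ind, and that LOGON lives in ~/logon. The variables IND_DIR and LOGONROOT are only used here to make those assumptions explicit; adjust them if your layout differs.

```shell
# Assumed locations (not prescribed by this page); override if yours differ.
IND_DIR="${IND_DIR:-$HOME/ind}"
LOGONROOT="${LOGONROOT:-$HOME/logon}"

# Step 4: link the ind folder into ~/logon/ntu (created if it does not exist).
mkdir -p "$LOGONROOT/ntu"
ln -sfn "$IND_DIR" "$LOGONROOT/ntu/ind"

# Step 5: link the skeletons folder into the tsdb skeletons directory,
# giving the link the name "ind" directly instead of renaming it afterwards.
mkdir -p "$LOGONROOT/lingo/lkb/src/tsdb/skeletons"
ln -sfn "$IND_DIR/tsdb/skeletons" "$LOGONROOT/lingo/lkb/src/tsdb/skeletons/ind"
```

Running `ls -l` on either target directory afterwards shows where each link points.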

6. Make sure that the paths for ind skeletons etc. are in these files:

  • ~/logon/bin/answer
  • ~/logon/dot.tsdbrc
  • ~/logon/parse
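A quick way to check step 6 is to search the three files for an ind entry. The sketch below is only a convenience: the function name check_ind_paths is hypothetical, and the default search term "ind" (matched as a whole word) is an assumption about what the entries look like, so pass a more specific pattern if needed.

```shell
# Hypothetical helper: report which of the three files from step 6
# mention the given pattern (default: the whole word "ind").
check_ind_paths() {
  root="$1"             # LOGON root, e.g. ~/logon
  pattern="${2:-ind}"   # assumed search term; adjust to your setup
  missing=0
  for f in bin/answer dot.tsdbrc parse; do
    if grep -qw -- "$pattern" "$root/$f" 2>/dev/null; then
      printf 'ok: %s mentions %s\n' "$f" "$pattern"
    else
      printf 'MISSING: %s has no %s entry\n' "$f" "$pattern"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_ind_paths ~/logon` prints one line per file and returns a non-zero status if any file lacks a match.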

7. Run this command in ~/logon:

~/logon$ ./parse --binary --ind --protocol 2 --best 1 --limit 0 --count 8 mrs
  • 8 is the number of CPUs. It can be checked in System Monitor.
  • mrs is the name of the testsuite folder.
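Instead of looking up the CPU count in System Monitor, it can be read from the shell. The sketch below builds the same parse command with the count filled in; the variable names are illustrative only, and the flags are exactly those shown above.

```shell
# Number of CPUs: nproc on Linux, with getconf as a fallback.
cpus="$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)"
testsuite="mrs"   # name of the testsuite folder

# Same command as in step 7, with the CPU count substituted for 8.
parse_cmd="./parse --binary --ind --protocol 2 --best 1 --limit 0 --count $cpus $testsuite"
echo "$parse_cmd"

# To actually run it (requires a working LOGON installation):
#   (cd ~/logon && $parse_cmd)
```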

8. Run [incr tsdb()]

  • Database: Home/logon/lingo/lkb/src/tsdb/home/ind

  • Skeleton: Home/logon/lingo/lkb/src/tsdb/skeletons/ind

    See GeFaqTsdbRc for how to change the default Database and Skeleton paths.

9. Activate the external treebanking tool

Trees > Switches > External Treebanking Tool

10. To automatically update:

Compare > Source Database and choose a previously annotated file

then

Trees > Update : Automatic Update

11. To annotate:

Trees > Annotate

The profile will be saved in ~/logon/lingo/lkb/src/tsdb/home/ind/

New way

Based on CapitolHillTreebank

1. Compile the grammar

~/grammar/ind$ ace -g ace/config.tdl -G ind.dat

Check by parsing sentences

~/grammar/ind$ ace -g ind.dat -l

2. Step-by-step commands for FFTB

~/grammar/ind$ mkprof -s tsdb/skeletons/(name of testsuite) /tmp/(name of testsuite)-demo
~/grammar/ind$ art -f -a 'ace --disable-generalization -g ind.dat -O' /tmp/(name of testsuite)-demo/
~/grammar/ind$ vi /tmp/(name of testsuite)-demo/edge
~/grammar/ind$ fftb -g ind.dat --browser --webdir=$LOGONROOT/lingo/answer/fftb /tmp/(name of testsuite)-demo/

Save the result to the gold folder and update

~/grammar/ind$ fftb -g ind.dat /tmp/(name of testsuite)-demo/ --browser --gold tsdb/gold/(name of testsuite)
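The FFTB commands above can be wrapped in a small script so that the testsuite name is typed only once. This is a sketch, not part of the original workflow: the names run, treebank_demo, treebank_gold and DRY_RUN are hypothetical, the functions assume they are called from ~/grammar/ind, and the manual inspection of the edge file with vi is left out.

```shell
# Hypothetical helper: print each command, and run it unless DRY_RUN=1.
# (The printed form drops the quoting around multi-word arguments.)
run() {
  printf '+ %s\n' "$*"
  [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

# Create a profile from the skeleton, parse it, and open FFTB on it.
treebank_demo() {
  name="$1"                       # name of the testsuite
  profile="/tmp/${name}-demo"
  run mkprof -s "tsdb/skeletons/${name}" "$profile" || return 1
  run art -f -a 'ace --disable-generalization -g ind.dat -O' "$profile/" || return 1
  run fftb -g ind.dat --browser --webdir="$LOGONROOT/lingo/answer/fftb" "$profile/"
}

# Open FFTB again with the gold folder, as in the last command above.
treebank_gold() {
  name="$1"
  run fftb -g ind.dat "/tmp/${name}-demo/" --browser --gold "tsdb/gold/${name}"
}
```

Calling `DRY_RUN=1 treebank_demo mrs` only prints the commands, which is a cheap way to check the paths before running them for real.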