ItsdbProfiling
You can generate from a profile that has stored MRSes (obtained, e.g., by a thinning normalize with (setf tsdb::*tsdb-semantix-hook* "mrs::get-mrs-string")).
First select the profile with MRSes as the Gold Profile (middle click). Then create a new profile with the same skeleton and select it (left click). Set Process | Switches | generate, then run Process | All Items.
You can pass the items to be parsed through a preprocessor by naming it in the cpu definition with the :preprocessor keyword. E.g.:
(make-cpu
 :host (short-site-name)
 :spawn "/path/to/cheap"
 :options (list "-tsdb" "-tok=yy" "-packing=7" "-default-les"
                (format nil "~a/grammars/japanese/japanese.grm" %delphin%))
 :preprocessor "lkb::chasen-preprocess-for-pet"
 :class :chasen :grammar "jacy-chasen" :name "jacy-chasen" :threshold 2)

(make-cpu
 :host (short-site-name)
 :spawn "/path/to/cheap"
 :options (list "-tsdb" "-tok=yy" "-packing=7" "-default-les"
                (format nil "~a/grammars/japanese/japanese.grm" %delphin%))
 :preprocessor "tsdb::rasp-preprocess-for-pet"
 :class :rasp :grammar "jacy-rasp" :name "jacy-rasp" :threshold 2)
chasen-preprocess-for-pet and rasp-preprocess-for-pet are Lisp functions that take two arguments, the item itself and an optional tagger, and return a tokenized string suitable for PET; in this case, the yy-tokenization (matching the "-tok=yy" option above).
chasen-preprocess-for-pet calls an external morphological analyzer (ChaSen) and reformats the output.
rasp-preprocess-for-pet assumes the input is of the form word_pos word_pos and associates each word with its POS in the input chart.
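To make the shape of such a preprocessor concrete, here is a minimal sketch of a function with the same two-argument interface that turns word_pos input into a string of YY-style tokens. It is not the actual rasp-preprocess-for-pet code: the names example-split-on-spaces and example-word-pos-to-yy are hypothetical, the "UNK" fallback tag is invented for illustration, and the token layout (id, start, end, path, form, ipos, lrule, tag plus probability) is an assumption about the YY format that should be checked against the PET input documentation.

```lisp
;; Hypothetical helpers; not part of [incr tsdb()], the LKB, or PET.

(defun example-split-on-spaces (string)
  "Split STRING on spaces, dropping empty pieces."
  (loop with start = 0
        for end = (position #\Space string :start start)
        for piece = (subseq string start end)
        unless (string= piece "") collect piece
        while end
        do (setf start (1+ end))))

(defun example-word-pos-to-yy (item &optional tagger)
  "Turn ITEM of the form \"word_pos word_pos\" into a YY-style token string.
TAGGER is accepted only to match the two-argument interface; it is ignored."
  (declare (ignore tagger))
  (with-output-to-string (out)
    (loop for token in (example-split-on-spaces item)
          for start from 0
          ;; Split at the last underscore so words containing "_" survive.
          for split = (position #\_ token :from-end t)
          for word = (if split (subseq token 0 split) token)
          ;; "UNK" is an illustrative placeholder for untagged words.
          for pos = (if split (subseq token (1+ split)) "UNK")
          ;; Assumed layout: (id, start, end, path, "form", ipos, "lrule", "tag" prob)
          do (format out "~:[ ~;~](~d, ~d, ~d, 1, ~s, 0, \"null\", ~s 1.0)"
                     (zerop start) (1+ start) start (1+ start) word pos))))
```

Under these assumptions, (example-word-pos-to-yy "the_DT dog_NN") returns the string (1, 0, 1, 1, "the", 0, "null", "DT" 1.0) (2, 1, 2, 1, "dog", 0, "null", "NN" 1.0). A function like this would be named in the :preprocessor slot of a cpu definition as a package-qualified string, in the same way as the two examples above.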