
ItsdbTreebanking_ItsdbModeling

FrancisBond edited this page Sep 23, 2005 · 19 revisions

TableOfContents(2)

Training a Scoring Model

If you have treebanked a profile, and have Rob Malouf's [http://bulba.sdsu.edu/malouf/software/maxent.tar.gz MaxEnt package] (in particular the program estimate) installed, then you can train a scoring model which PET (PetTop) can use.

Select the treebanked profile (left-click), or several profiles (click their radio buttons), and then select Trees | Train from the menus. It will prompt you for the filename to put the scoring model in. The tradition is a name like corpus-version.mem. You should have the grammar used for treebanking loaded into the LKB (LkbTop). Training is normally fairly fast.
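To make the result of training more concrete, here is a toy sketch (not the MaxEnt package's implementation) of how a trained log-linear scoring model is used: each candidate parse contributes a bag of features, the model is a weight per feature, and a parse's score is the sum of the weights of its features. The feature names and weights below are invented for illustration.

```python
# Toy log-linear (MaxEnt-style) parse ranking.
# A real .mem file pairs grammar-derived features with estimated weights;
# the names and numbers here are purely illustrative.

def score(parse_features, weights):
    """Log-linear score of one candidate parse: sum of feature weights."""
    return sum(weights.get(f, 0.0) for f in parse_features)

def rank(candidates, weights):
    """Return candidate parses sorted best-first by model score."""
    return sorted(candidates, key=lambda c: score(c, weights), reverse=True)

# Hypothetical model and two candidate analyses of one input:
weights = {"subj-head": 1.2, "head-comp": 0.4, "mod-attach-low": -0.7}
candidates = [
    ["subj-head", "mod-attach-low"],   # score 1.2 - 0.7 = 0.5
    ["subj-head", "head-comp"],        # score 1.2 + 0.4 = 1.6
]
best = rank(candidates, weights)[0]
```

The parser then keeps (or unpacks) analyses in this score order, which is what makes the model usable for n-best output and selective unpacking.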

Scoring

You can compare the ranking of a given profile with a treebanked gold standard (assuming the same test-suite and grammar). The ranking can be changed by changing the scoring model in the parser.

To compare: select the gold standard (middle click), then the profile to be scored as the current database (left click); (make sure the current version of the grammar is loaded into the LKB).

Set Trees | Switches | Implicit Ranks and Trees | Switches | Result Equivalence, and then choose Trees | Score.

 ;; score results in .data. against ground truth in .gold.  operates in
 ;; several slightly distinct modes: (i) using the implicit parse ranking in
 ;; the order of `results' or (ii) using an explicit ranking from the `score'
 ;; relation.  an orthogonal dimension of variation is (a) scoring by result
 ;; identifier (e.g. within the same profile or against one that is comprised
 ;; of identical results) vs. (b) scoring by derivation equivalence (e.g.
 ;; when comparing best-first parser output against a gold standard).

To make the scoring faster, you should first do a thinning normalize on the gold profile used for comparison. This implicitly thins the profile to only those trees marked as good by the annotator, i.e. it discards all dispreferred trees. To get, say, a 5-best comparison, adjust the Scoring Beam value.
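As a rough picture of what Trees | Score computes in the simplest configuration (implicit ranks, scoring by result identifier), here is a toy sketch, not the [incr tsdb()] implementation: for each item, check whether the top-ranked result in the scored profile is among the results the annotator preferred in the (thinned) gold profile, and report the fraction of hits. All identifiers below are invented.

```python
def score_profile(data, gold):
    """Toy exact-match accuracy for mode (i)/(a) above.

    data: item id -> list of result ids in the parser's rank order
    gold: item id -> set of annotator-preferred result ids
    Returns the fraction of items whose top-ranked result is preferred.
    """
    hits = sum(
        1
        for item, ranked in data.items()
        if ranked and ranked[0] in gold.get(item, set())
    )
    return hits / len(data)

# Three hypothetical items: item 1 is a hit, items 2 and 3 are misses.
data = {1: [3, 1, 2], 2: [5, 4], 3: [9]}
gold = {1: {3}, 2: {4}, 3: {7}}
acc = score_profile(data, gold)
```

Scoring by derivation equivalence (mode (b)) would instead compare the derivation trees themselves, which is what you need when result identifiers do not line up across profiles, e.g. best-first parser output against a gold standard.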

Using a Scoring Model in PET and the LKB

The scoring model is referenced in cheap's grammar.set for PET:

;;; scoring mechanism (fairly embryonic, for now)
sm := "hinoki.mem".

The scoring model is referenced in the script and globals.lsp for the LKB:

script

;;; if you have [incr tsdb()], load a Maximum Entropy parse selection model
#+:tsdb
(tsdb::read-mem (lkb-pathname (parent-directory) "hinoki.mem"))

globals.lsp

;;; use the parse selection model for selective unpacking
#+:tsdb
(setf *unpacking-scoring-hook* #'tsdb::mem-score-configuration)