LkbEvolution

StephanOepen edited this page Nov 26, 2004 · 9 revisions

Detailed Development Log of Code Changes

This page is intended as a relatively technical log of changes to the LKB source code, in a sense a condensed summary of the CVS revision history. LKB developers will likely find this information most interesting, but it may also help some users stay up to date on the latest LKB development.

Fri Nov 26 15:16:31 GMT 2004

Redwoods-Style MRS Banking::

  • discriminant extraction, storage in [incr tsdb()], and the CLIM and HTML UIs now support comparison and annotation of sets of MRSs; discriminants in this mode are triples extracted from the (handle-free) elementary dependency view on MRSs, either of the form (predicate, role, predicate) or (predicate, property, value), where the latter form implicitly assumes that the specific property occurs in the ARG0 of the first predicate. to make multiple occurrences of the same predicate unique, it is necessary to have a way of `anchoring' each EP to a part of the original input: for the time being, MRS banking depends on the availability of a new LNK property in EPs (see below).
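The two triple shapes can be illustrated with a minimal Python sketch. It is not the LKB code (which is Lisp); the `EP` record, the `anchored()` helper, and the use of a character span as the anchor are all assumptions made for illustration only.

```python
from typing import NamedTuple

class EP(NamedTuple):
    """Hypothetical elementary predication in a handle-free dependency view."""
    pred: str
    lnk: tuple       # assumed (start, end) span in the input, standing in for LNK
    roles: dict      # role name -> position of the target EP in the list
    properties: dict # properties assumed to live on this EP's ARG0

def anchored(ep):
    # Attach the span to the predicate so repeated predicates stay distinct.
    return f"{ep.pred}<{ep.lnk[0]}:{ep.lnk[1]}>"

def discriminants(eps):
    """Return the set of (predicate, role, predicate) and
    (predicate, property, value) triples for one reading."""
    triples = set()
    for ep in eps:
        source = anchored(ep)
        for role, target in ep.roles.items():
            triples.add((source, role, anchored(eps[target])))
        for prop, value in ep.properties.items():
            triples.add((source, prop, value))
    return triples

# Toy reading with the same noun predicate occurring twice.
reading = [
    EP("_chase_v_1", (8, 14), {"ARG1": 1, "ARG2": 2}, {"TENSE": "past"}),
    EP("_dog_n_1", (4, 7), {}, {"NUM": "sg"}),
    EP("_dog_n_1", (19, 22), {}, {"NUM": "sg"}),
]
for triple in sorted(discriminants(reading)):
    print(triple)
```

Without the anchor, the two `_dog_n_1` EPs would collapse into one predicate name and the ARG1 and ARG2 triples could no longer be told apart by their targets.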

Generation from Fragmented MRSs::

  • in LOGON, there is a notion of `fragmented' MRSs, i.e. a single MRS that corresponds to a sequence of MRSs associated with a sequence of chunks found by the (XLE) parser in robust mode; while each fragment is expected to be semantically well-formed, the individual pieces are connected by virtue of a fragment_rel that can be viewed as a highly underspecified conjunction, i.e. it uses roles L-HNDL, L-INDEX, R-HNDL, and R-INDEX to encode a binary tree. generation from fragmented MRSs is triggered by fragmentp() on the input MRS to generation; it then aims to extract connected fragments, generate from each individually, and cross-multiply the resulting strings. new globals: *semi-fragment-relations* and *fragment-start-symbols*.
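The extract, generate, and cross-multiply steps can be sketched in Python as follows. This is only an illustration under stated assumptions: a nested pair stands in for a fragment_rel node, `generate` is a hypothetical per-fragment generator supplied by the caller, and joining the pieces with a single space is a guess at what cross-multiplying strings amounts to.

```python
from itertools import product

def fragments(node):
    """Collect the leaf fragments of a binary fragment tree, left to right.
    A 2-tuple (left, right) stands in for a fragment_rel; anything else is a leaf."""
    if isinstance(node, tuple):
        left, right = node
        return fragments(left) + fragments(right)
    return [node]

def generate_fragmented(tree, generate):
    """Generate from each fragment on its own, then combine every choice
    of realization per fragment into one output string."""
    per_fragment = [generate(fragment) for fragment in fragments(tree)]
    return [" ".join(parts) for parts in product(*per_fragment)]

# Toy stand-in for the generator: a table from fragment to realizations.
realizations = {
    "f1": ["the dog barked", "the dog was barking"],
    "f2": ["loudly"],
    "f3": ["at night", "by night"],
}
tree = ("f1", ("f2", "f3"))
for string in generate_fragmented(tree, realizations.__getitem__):
    print(string)
```

In this sketch the number of outputs is the product of the per-fragment counts (here 2 × 1 × 2 = 4), so a fragment with no realization yields no output at all; whether the LKB handles that case differently is not stated above.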

Discovery of Orthographic Variants::

  • there is new code to compile a file in the format of variants.tab, which is part of recent versions of the ERG. essentially, the code looks for sets of lexical entries that are equivalent modulo STEM and ONSET and outputs an ordered list of (identifiers of) such lexical entries, with the additional information of whether or not they occur in *duplicate-lex-ids* (the ones following the colon in variants.tab), i.e. are disabled for generation. new global: *orthographic-variants* (a hash table relating such sets); new functions: find-orthographic-variants(), orthographic-variants-p().
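The grouping idea (equivalence modulo STEM and ONSET) can be sketched in Python. This is a rough illustration, not the LKB implementation: lexical entries are flattened to plain dictionaries with made-up feature names and values, `variant_sets()` is a hypothetical name, and the exact layout of variants.tab is not reproduced.

```python
from collections import defaultdict

def variant_sets(lexicon, duplicate_lex_ids, ignored=("STEM", "ONSET")):
    """Group entries that are identical once the ignored features are dropped.
    Returns (enabled_ids, disabled_ids) pairs, where disabled ids are those
    listed in duplicate_lex_ids, i.e. switched off for generation."""
    groups = defaultdict(list)
    for lex_id, entry in lexicon.items():
        key = tuple(sorted((f, v) for f, v in entry.items() if f not in ignored))
        groups[key].append(lex_id)
    result = []
    for ids in groups.values():
        if len(ids) < 2:
            continue  # a single entry has no variants
        ids = sorted(ids)
        enabled = [i for i in ids if i not in duplicate_lex_ids]
        disabled = [i for i in ids if i in duplicate_lex_ids]
        result.append((enabled, disabled))
    return sorted(result)

# Toy lexicon; identifiers, features, and values are invented for the example.
lexicon = {
    "color_n1": {"STEM": "color", "ONSET": "con", "TYPE": "noun", "PRED": "_color_n_1"},
    "colour_n1": {"STEM": "colour", "ONSET": "con", "TYPE": "noun", "PRED": "_color_n_1"},
    "dog_n1": {"STEM": "dog", "ONSET": "con", "TYPE": "noun", "PRED": "_dog_n_1"},
}
print(variant_sets(lexicon, duplicate_lex_ids={"colour_n1"}))
```

Keying a dictionary on everything except the ignored features keeps the grouping to a single pass over the lexicon; real lexical entries are feature structures, so an actual implementation would need a structural equality test rather than a flat comparison.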