ItsdbDerivations

StephanOepen edited this page Mar 13, 2009 · 17 revisions

Overview

The itsdb environment records information about derivations (the 'recipes' of linguistic analyses) in its database. Combined with the grammar originally used to derive each analysis, the derivation structure needs to provide complete information for re-building the analysis. In other words, the derivation can serve as an oracle to a process that one can conceptualize as deterministic parsing: the derivation records exactly which steps the original itsdb client processor took in producing its analysis. Deterministically re-building (or re-playing) an analysis will thus give rise to exactly the same structure as was associated with the original result.
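The replay idea can be illustrated with a minimal sketch. Nothing here reflects the actual UDF syntax or any itsdb, LKB, or PET interface: the `Node` class, the `replay` function, and the toy `RULES` and `LEXICON` tables are all hypothetical names introduced only for illustration. The point is that, given the same grammar, following the recorded rule applications involves no search and always yields the same structure.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """Hypothetical derivation node: a rule or lexical entry name plus daughters."""
    label: str
    children: List["Node"] = field(default_factory=list)


# Toy stand-in for a grammar (an assumption, not a real grammar):
# each rule combines the structures built for its daughters.
RULES: Dict[str, Callable[[List[str]], str]] = {
    "s_rule": lambda parts: "S(" + " ".join(parts) + ")",
    "np_rule": lambda parts: "NP(" + " ".join(parts) + ")",
    "vp_rule": lambda parts: "VP(" + " ".join(parts) + ")",
}

# Toy lexicon (also an assumption): lexical entry name -> structure.
LEXICON: Dict[str, str] = {"the_det": "Det", "dog_n": "N", "barks_v": "V"}


def replay(node: Node,
           rules: Dict[str, Callable[[List[str]], str]],
           lexicon: Dict[str, str]) -> str:
    """Rebuild an analysis bottom-up, using the derivation as an oracle."""
    if not node.children:
        # Leaf: look up the recorded lexical entry.
        return lexicon[node.label]
    # Internal node: rebuild the daughters, then apply the recorded rule.
    parts = [replay(child, rules, lexicon) for child in node.children]
    return rules[node.label](parts)


derivation = Node("s_rule", [
    Node("np_rule", [Node("the_det"), Node("dog_n")]),
    Node("vp_rule", [Node("barks_v")]),
])

print(replay(derivation, RULES, LEXICON))  # S(NP(Det N) VP(V))
```

Because the derivation names every rule and lexical entry, replaying it twice against the same tables gives identical results. A real client would build feature structures rather than strings.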

In principle, the itsdb derivation format is applicable to any kind of processing client (be it the LKB, PET, TRALE, or the XLE) and all types of processing (e.g. parsing, generation, transfer, or translation). However, in practice (as of early 2009) only parsing and generation derivations produced by either the LKB or PET are fully supported.

This page documents the format used internally by itsdb to record derivations (this specification is sometimes half-jokingly referred to as the Unified Derivation Format, or UDF). This page was predominantly authored by StephanOepen, who developed the original UDF 1.0 specification jointly with UlrichCallmeier. Please do not make substantial changes unless you (a) are reasonably sure of the technical correctness of your revisions and (b) believe strongly that your changes are compatible with the general design and recommended use patterns for itsdb, and of course with the goals of this page.

An Example

Following is an example derivation taken from the WeScience treebank.
