ItsdbReference

FrancisBond edited this page May 9, 2005 · 25 revisions

Reference

This page includes some low-level information about itsdb (ItsdbTop).

tsdb database format

The database consists of multiple tables. Each table is a text file consisting of multiple rows. Each row consists of fields separated by @ and is terminated by a newline. The mapping of columns to identifiers is given in the relations file.
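The row layout described above can be sketched in a few lines of Python. This is a minimal illustration, not part of itsdb: the helper names `split_row` and `join_row` are hypothetical, and the sketch assumes that no field contains a literal @ or newline (any escaping scheme is not handled here).

```python
def split_row(line):
    """Split one table row into its fields (hypothetical helper)."""
    # Drop the terminating newline, then split on the @ field separator.
    # Empty fields are kept, so "a@@b" yields three fields.
    return line.rstrip("\n").split("@")


def join_row(fields):
    """Turn a list of field strings back into one newline-terminated row."""
    return "@".join(fields) + "\n"


# A made-up row with four fields, the third one empty.
print(split_row("a@b@@d\n"))  # ['a', 'b', '', 'd']
```

Keeping every field as a string, including empty ones, means a row survives a split and join unchanged.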

item file format

Here is the structure, along with some examples of values.

| Field | Name | Explanation | Example Value |
|-------|------|-------------|---------------|
| 1 | i-id | ID | integer |
| 2 | i-origin | Origin | none |
| 3 | i-register | Register | formal |
| 4 | i-format | Format | none |
| 5 | i-difficulty | Difficulty | 1 |
| 6 | i-category | Category | S,XP |
| 7 | i-input | String | |
| 8 | i-wf | Grammaticality judgement | 0,1 |
| 9 | i-length | String length (words) | integer |
| 10 | i-comment | Comment | |
| 11 | i-author | Author | uname |
| 12 | i-date | Date created | 5-8-2003 |

An actual entry:

1@csli@formal@none@1@S@Abrams works .@1@2@@@jul-98
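To make the correspondence between the entry and the field names concrete, here is a small Python sketch that pairs each value with its name. The field order is copied from the structure listed on this page; in a real database the order comes from the relations file, so the hard-coded `ITEM_FIELDS` list and the `parse_item` helper are assumptions made for illustration only.

```python
# Field names in the order given on this page (an assumption: a real
# database defines this order in its relations file).
ITEM_FIELDS = [
    "i-id", "i-origin", "i-register", "i-format", "i-difficulty",
    "i-category", "i-input", "i-wf", "i-length", "i-comment",
    "i-author", "i-date",
]


def parse_item(line):
    """Map one row of the item file to a dict of field name -> string."""
    values = line.rstrip("\n").split("@")
    if len(values) != len(ITEM_FIELDS):
        raise ValueError(
            "expected %d fields, got %d" % (len(ITEM_FIELDS), len(values))
        )
    return dict(zip(ITEM_FIELDS, values))


item = parse_item("1@csli@formal@none@1@S@Abrams works .@1@2@@@jul-98")
print(item["i-input"])   # Abrams works .
print(item["i-length"])  # 2
print(item["i-date"])    # jul-98
```

In this entry the i-comment and i-author fields are empty, which is why two separators appear back to back before the date.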

Note that [itsdb] does not always check that i-ids are unique, so you should make sure they stay unique yourself. It is also a good idea to keep the items sorted.
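Since the uniqueness check may not happen automatically, a small script can do it before a profile is loaded. The sketch below is one possible approach, not an itsdb feature: the helpers `item_id`, `duplicate_ids`, and `sorted_items` are hypothetical, the sample rows are made up, and "sorted" is assumed to mean ascending numeric i-id.

```python
from collections import Counter


def item_id(line):
    """Return the i-id (first field) of an item row as an integer."""
    return int(line.split("@", 1)[0])


def duplicate_ids(lines):
    """Return the sorted list of i-ids that occur more than once."""
    counts = Counter(item_id(line) for line in lines)
    return sorted(i for i, n in counts.items() if n > 1)


def sorted_items(lines):
    """Return the rows ordered by ascending numeric i-id."""
    return sorted(lines, key=item_id)


# Made-up rows for illustration; i-id 2 appears twice.
rows = [
    "2@csli@formal@none@1@S@Browne sleeps .@1@2@@@jul-98",
    "1@csli@formal@none@1@S@Abrams works .@1@2@@@jul-98",
    "2@csli@formal@none@1@S@Chiang works .@1@2@@@jul-98",
]
print(duplicate_ids(rows))                 # [2]
print(item_id(sorted_items(rows)[0]))      # 1
```

Comparing i-ids as integers rather than strings avoids placing 10 before 2 when sorting.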

In the Hinoki project, the i-comment field is used to give the source of the utterance (definition sentence, example, other corpus), the ID in the source corpus, and, for definition and example sentences, some information about the headword being defined or exemplified.