LkbLtdb
The Lexical Type Database is a web interface for grammars made with the [wiki:LkbTop LKB].
This page is a very rough initial introduction.
The Lextype DB describes the lexical types of a grammar and treebank. Lexical types can be seen as detailed parts of speech. The information that the Lextype DB provides about a lexical type includes its linguistic characteristics; examples of usage from a treebank; the way it is implemented in a grammar; and correspondences to major computational dictionaries. It consists of a database management system and a web-based interface, and is constructed semi-automatically. Currently, we have applied the Lextype DB to grammars and treebanks of Japanese and English.
The software was written by ChikaraHashimoto and FrancisBond, and uses the HTML output provided by StephanOepen.
The minimal Lexical Type Database offers the following:
- a web interface to lexical types in a DELPH-IN grammar, including examples from the lexicon
- in-line documentation from the TDL file:
  - human readable name
  - description
  - example sentences
  - todo
- links to treebanks
- words and lexical types in context
Earlier versions of the lexical type database also included links to external references and other lexicons. We hope to revive them at some stage.
The documentation is written as comments in the TDL file, directly above the type it describes:
; <type val="n_-_c_le">
; <description>Intransitive count noun (icn)
; <ex>The dog barked.
; <nex>
; <todo>
; </type>
n_-_c_le := n_intr_lex_entry.
An example with documentation in Japanese:
; <type val="case-p-lex-np-kara">
; <name-ja>承名詞受身主格助詞
; <description>名詞の直後について、受身文の主格(実際にその行為を行うもの)を表す助詞「から」。
; <ex>子供 が 親 から たしなめ られる
; <nex>友人 から 自転車 を 買う
; <todo>(07-03-30)間接受身でも使えるようにすべき。(lkb::do-parse-tty "親戚 から 怒ら れる")
; (07-03-30)「〜」はこのtypeでよいのか?(格として取ることがないため)
; (07-03-30)postp-lexの後につくtypeも必要。(lkb::do-parse-tty "子供 が 親 とか から たしなめ られる")
; </type>
case-p-lex-np-kara := case-p-lex-np &
[SYNSEM.LOCAL.CAT.HEAD.CASE kara-case].
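To illustrate how comment blocks like the ones above can be read, here is a minimal Python sketch. It is not the Lexical Type Database's own code (which is written in Perl); the function name `parse_type_docs` is hypothetical, and the sketch assumes that every documentation line starts with `;`, that a field runs until the next tag, and that untagged comment lines continue the previous field.

```python
import re

TYPE_OPEN = re.compile(r'<type\s+val="([^"]+)">')
FIELD = re.compile(r'<([A-Za-z-]+)>(.*)')

def parse_type_docs(tdl_text):
    """Collect the <type> documentation blocks found in TDL ';' comments."""
    docs = []
    current = None   # documentation block being read, if any
    field = None     # most recently opened field, for continuation lines
    for line in tdl_text.splitlines():
        stripped = line.strip()
        if not stripped.startswith(";"):
            continue                          # ordinary TDL, not a comment
        body = stripped.lstrip(";").strip()
        opened = TYPE_OPEN.match(body)
        if opened:
            current = {"val": opened.group(1)}
            field = None
        elif current is None:
            continue                          # comment outside any <type> block
        elif body.startswith("</type>"):
            docs.append(current)
            current = field = None
        else:
            tagged = FIELD.match(body)
            if tagged:
                field, text = tagged.group(1), tagged.group(2).strip()
                current.setdefault(field, [])
                if text:
                    current[field].append(text)
            elif field and body:
                current[field].append(body)   # continuation of the previous field
    return docs

sample = '''; <type val="n_-_c_le">
; <description>Intransitive count noun (icn)
; <ex>The dog barked.
; <nex>
; <todo>
; </type>
n_-_c_le := n_intr_lex_entry.
'''
docs = parse_type_docs(sample)
print(docs[0]["val"], docs[0]["ex"])
```

Each field is stored as a list so that repeated tags (several `<ex>` lines, or a multi-line `<todo>`) are all kept; empty tags such as `<nex>` yield an empty list.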
Currently the Lexical Type Database is distributed with the [wiki:LkbTop LKB], in lkb/src/ltdb. There is a README file that describes how to build the database. In summary:
./make-ltdb.bash --grm GRAMMAR
For example:
./make-ltdb.bash --grm jacy
If you have any gold treebanks, also run:
./make-trees.bash --grm GRAMMAR
This step is slow if you have a lot of trees, and it needs a fair bit of memory. Note: if the current grammar version is very different from the one used to make the treebanks, many trees will not be exported.
Everything is installed to ~/public_html/GRAMMAR_VERSION
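The two build steps can also be scripted. The following Python sketch only wraps the two commands shown above; the function names and the `with_treebanks` parameter are hypothetical, and it assumes the scripts are run from the LKB's lkb/src/ltdb directory.

```python
import subprocess

def ltdb_commands(grammar, with_treebanks=False):
    """Return the build commands, in order, as argument lists."""
    commands = [["./make-ltdb.bash", "--grm", grammar]]
    if with_treebanks:
        # Only useful when gold treebanks exist for the grammar.
        commands.append(["./make-trees.bash", "--grm", grammar])
    return commands

def build_ltdb(grammar, ltdb_dir, with_treebanks=False):
    """Run the commands from ltdb_dir (assumed to be lkb/src/ltdb)."""
    for command in ltdb_commands(grammar, with_treebanks):
        # check=True stops at the first failing step.
        subprocess.run(command, cwd=ltdb_dir, check=True)
```

For instance, `build_ltdb("jacy", "/path/to/lkb/src/ltdb", with_treebanks=True)` would build the database and then export the trees, where the path is a placeholder for your own LKB checkout.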
Building and serving the database requires the following:
- LKB (to dump the lexicon and type files)
- Perl
- DBD::SQLite
- XML::DOM
- SQLite3
- Apache (for the web server)
- nsgmls for validation (package sp)
On Ubuntu, you can satisfy the dependencies by installing [wiki:LogonInstallation LOGON] and the following packages:
sudo apt-get install libdbd-sqlite3-perl sp libxml-dom-perl apache2
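Before building, it can help to check which dependencies are still missing. The Python sketch below is one possible check, not part of the distribution: the executable names (`perl`, `sqlite3`, `nsgmls`) are assumptions about how the tools are installed, the function names are hypothetical, and the LKB and Apache are not checked.

```python
import shutil
import subprocess

# Executable names are assumptions for an Ubuntu-like system.
EXECUTABLES = ["perl", "sqlite3", "nsgmls"]
PERL_MODULES = ["DBD::SQLite", "XML::DOM"]

def perl_module_available(module):
    """Return True if `perl -MModule -e 1` succeeds, i.e. the module loads."""
    try:
        result = subprocess.run(
            ["perl", f"-M{module}", "-e", "1"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        return False          # perl itself could not be started
    return result.returncode == 0

def missing_dependencies(which=shutil.which, has_module=perl_module_available):
    """List the executables and Perl modules that could not be found."""
    missing = [exe for exe in EXECUTABLES if which(exe) is None]
    missing += [mod for mod in PERL_MODULES if not has_module(mod)]
    return missing

if __name__ == "__main__":
    print(missing_dependencies() or "all checked dependencies found")
```

The lookup functions are passed in as parameters so that the check can be exercised without the tools being installed.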
To enable CGI in user directories, add the following lines to the appropriate Apache configuration file. That could be /etc/apache2/httpd.conf or, more correctly, the appropriate file in /etc/apache2/sites-enabled/.
<Directory /home/*/public_html/cgi-bin/>
Options ExecCGI
SetHandler cgi-script
</Directory>
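One way to confirm that CGI execution is enabled is to place a minimal script in ~/public_html/cgi-bin/ and request it through the web server. The sketch below is a generic CGI test, not part of the Lexical Type Database; the file name `hello.cgi` is hypothetical, and it assumes that Python 3 is installed on the server, that the file is executable, and that Apache serves user directories.

```python
#!/usr/bin/env python3
"""Hypothetical test script, e.g. ~/public_html/cgi-bin/hello.cgi."""
import sys

def cgi_response(message="CGI is enabled"):
    # A CGI response is a header block, a blank line, then the body.
    return "Content-Type: text/plain; charset=utf-8\n\n" + message + "\n"

if __name__ == "__main__":
    sys.stdout.write(cgi_response())
```

If the browser shows the message rather than the script's source or an error page, the configuration above has taken effect.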
References:
- Chikara Hashimoto, Francis Bond, and Dan Flickinger (2007). [http://www2.nict.go.jp/x/x161/en/member/bond/pubs/2007-IWIC-lextypedb.pdf The lextype DB: A web-based framework for collaborative multilingual grammar and treebank development]. In The First International Workshop on Intercultural Collaboration (IWIC-2007), pages 44–58, Kyoto.
- Chikara Hashimoto, Francis Bond, and Melanie Siegel (2007). [http://www2.nict.go.jp/x/x161/en/member/bond/pubs/2007-LRE-lextypedb.pdf Semi-automatic documentation of an implemented linguistic grammar augmented with a treebank]. Language Resources and Evaluation. (Special issue on Asian language technology)
- Chikara Hashimoto, Francis Bond, Takaaki Tanaka, and Melanie Siegel (2005). [http://www2.nict.go.jp/x/x161/en/member/bond/pubs/2005-linc-lextypedb.pdf Integration of a lexical type database with a linguistically interpreted corpus]. In 6th International Workshop on Linguistically Interpreted Corpora (LINC-2005), pages 31–40, Cheju, Korea.
To do:
- finish the documentation
- add screenshots
- link to some running Lexical Type Databases (like [http://wiki.delph-in.net/moin/JacyLexTypes this])
- index all rules, not just lexical rules, and allow them to be looked up
- warn if the grammar version and the treebank version differ