
BarcelonaEvaluation

MontserratMarimon edited this page Jul 15, 2009 · 10 revisions

Discussion: Evaluation

Moderator: Montserrat Marimon; Scribe: Rebecca Dridan

Objective

Share ideas among grammarians about evaluation:

1. Diagnostic evaluation (regression testing during development):

Test cases that cover all variations of a particular phenomenon, together with related ungrammatical constructions (linguistically interesting): controlled, hand-built test suites.

Test data that present combinations of different phenomena, showing real-world language complexity (not linguistically interesting): real text.
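As an illustration of the hand-built test suite idea, here is a minimal sketch of a regression check. It assumes each test item pairs a sentence with its expected grammaticality judgement, and that the grammar is wrapped in a function returning whether a sentence receives a parse. The names `SuiteItem`, `run_regression` and `toy_parses` are hypothetical, and `toy_parses` is only a stand-in for a real parser.

```python
from typing import Callable, List, NamedTuple


class SuiteItem(NamedTuple):
    """One hand-built test item (hypothetical structure)."""
    sentence: str
    grammatical: bool  # expected judgement: should the grammar parse it?


def run_regression(items: List[SuiteItem],
                   parses: Callable[[str], bool]) -> List[SuiteItem]:
    """Return the items whose parse outcome differs from the expected judgement."""
    return [item for item in items if parses(item.sentence) != item.grammatical]


def toy_parses(sentence: str) -> bool:
    """Stand-in for a real parser: accepts only two known sentences."""
    return sentence in {"the dog barks", "the dogs bark"}


# Variations of one phenomenon (subject-verb agreement),
# including the related ungrammatical constructions.
suite = [
    SuiteItem("the dog barks", True),
    SuiteItem("the dogs bark", True),
    SuiteItem("the dog bark", False),
    SuiteItem("the dogs barks", False),
]

failures = run_regression(suite, toy_parses)
print(failures)  # an empty list means no regressions on this suite
```

Rerunning the same suite after each grammar change shows both lost coverage (a grammatical item no longer parses) and overgeneration (an ungrammatical item now parses).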

2. Direct evaluation (grammar performance in terms of recall and precision (robustness)).

Using a treebank... built from selected sentences or from real text, showing the interaction of phenomena. What kind of corpus? Newspaper articles (syntactic errors, misspellings, foreign words...), domain-specific text (missing lexical entries), or a mixture?

Evaluation through participation in international competitions such as CoNLL and SemEval (different tasks): this allows us to compare our systems with other systems, but format conversion may not be trivial, the criteria differ, and there are errors (not systematic).
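For the recall and precision mentioned under direct evaluation, one common reading is to compare the units a grammar produces for a sentence with those in the treebank's gold analysis. The sketch below assumes analyses are reduced to sets of (head, dependent) pairs. That representation, the function name `precision_recall` and the example data are illustrative assumptions, not something fixed in the discussion.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]  # (head, dependent); an assumed, simplified representation


def precision_recall(gold: Set[Pair], predicted: Set[Pair]) -> Tuple[float, float]:
    """Precision: share of predicted pairs that are in the gold analysis.
    Recall: share of gold pairs that were predicted."""
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Made-up example: the system finds two of the three gold pairs and nothing else.
gold = {("barks", "dog"), ("dog", "the"), ("barks", "loudly")}
predicted = {("barks", "dog"), ("dog", "the")}

p, r = precision_recall(gold, predicted)
print(f"precision={p:.2f} recall={r:.2f}")
```

When results are compared across systems, as in a shared task, the choice of unit (brackets, dependencies, semantic relations) is one of the criteria that may differ.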

3. Indirect evaluation (grammar usability/adequacy for a specific application).

Notes
