
BarcelonaEvaluation

MontserratMarimon edited this page Jul 15, 2009 · 10 revisions

Discussion: Evaluation

Moderator: Montserrat Marimon; Scribe: Rebecca Dridan

Objective

Share ideas among grammarians about evaluation:

1. Diagnostic evaluation (regression testing during development):

Test cases which include all variations of a particular phenomenon and related ungrammatical constructions: linguistically interesting – controlled hand-built test suites.

Test data that present combinations of different phenomena, showing real-world language complexity (not linguistically interesting) – real text.
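As an illustration of the regression-testing idea in item 1, here is a minimal Python sketch. It assumes a test suite in which each item carries a hand-assigned grammaticality judgement, and a parser exposed as a function that returns True when a sentence receives at least one analysis. The names `SuiteItem`, `regression_report` and `parses` are hypothetical and do not refer to any existing tool.

```python
from typing import Callable, Dict, Iterable, List, NamedTuple


class SuiteItem(NamedTuple):
    """One test-suite entry (hypothetical layout)."""
    sentence: str
    grammatical: bool  # hand-assigned judgement


def regression_report(
    items: Iterable[SuiteItem],
    parses: Callable[[str], bool],
) -> Dict[str, List[str]]:
    """Collect the items whose parse outcome disagrees with the judgement.

    `parses` is an assumed wrapper around the grammar and parser: it
    returns True if the sentence receives at least one analysis.
    """
    missed: List[str] = []         # grammatical items with no analysis
    overgenerated: List[str] = []  # ungrammatical items that still parse
    for item in items:
        accepted = parses(item.sentence)
        if item.grammatical and not accepted:
            missed.append(item.sentence)
        elif not item.grammatical and accepted:
            overgenerated.append(item.sentence)
    return {"missed": missed, "overgenerated": overgenerated}
```

Running such a check after each grammar change and comparing the two lists with the previous run would show whether a change lost coverage on a phenomenon or started accepting one of the ungrammatical variants.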

2. Direct evaluation (grammar performance in terms of recall, precision, robustness, ...).

Using a treebank... built from chosen sentences, or from real text showing the interaction of phenomena?

Evaluation through participation in international competitions - CoNLL, SemEval (different tasks): this allows us to compare our systems with other systems, but format conversion may not be trivial, the criteria differ, and errors are not systematic.
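For the recall and precision figures mentioned under item 2, a minimal Python sketch follows. It assumes that both the treebank (gold) analysis and the grammar's output have already been converted to comparable sets of units, for example dependency triples. That representation and the function name `precision_recall` are assumptions made for illustration, not a format prescribed by any particular shared task.

```python
from typing import Hashable, Set, Tuple


def precision_recall(
    gold: Set[Hashable],
    predicted: Set[Hashable],
) -> Tuple[float, float]:
    """Compute precision and recall of predicted units against gold units.

    Units can be any hashable items, e.g. (head, relation, dependent)
    triples. An empty set yields 0.0 for the corresponding score to
    avoid division by zero.
    """
    correct = len(gold & predicted)  # units present in both sets
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall
```

The format conversion mentioned above is hidden in how the two sets are built: differences in tokenisation or relation labels between the grammar and the treebank would have to be normalised before the sets are compared.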

3. Indirect evaluation (grammar usability/adequacy for a specific application).

Notes
