DiscoMT 2015 Shared Task Pronoun Evaluation and Annotation

The Workshop on Discourse and Machine Translation (DiscoMT) at EMNLP 2015 in Lisbon featured a shared task on pronoun prediction and translation for the English-French language pair. The shared task attracted 8 submissions to the prediction subtask and 6 submissions to the translation subtask, and it served to characterise the current state of the art in pronoun handling for MT. Since pronoun translations are notoriously difficult to evaluate using standard automatic evaluation metrics, we had to rely on manual evaluation of pronoun correctness. This EAMT grant funded the manual evaluation of the shared task submissions. We also created annotations of pronoun coreference over the English side of the shared task test set, a corpus of 12 TED conference talks available as parallel text in multiple languages. The setup and results of the shared task and the annotation procedure are described in detail in the proceedings of the DiscoMT workshop (http://www.aclweb.org/anthology/W/W15/W15-2501.pdf). All annotations created in the project, including the manually evaluated shared task submissions and the test set with pronoun coreference annotations, are included in a public data release (http://hdl.handle.net/11372/LRT-1611).