Influence of Pre-annotation on POS-tagged Corpus Development





1 INIST - Institut de l'information scientifique et technique
2 LIPN - Laboratoire d'Informatique de Paris-Nord
3 ALPAGE - Analyse Linguistique Profonde à Grande Echelle; Large-scale deep linguistic processing (Inria Paris-Rocquencourt, UPD7 - Université Paris Diderot - Paris 7)

Abstract: This article details a series of carefully designed experiments that evaluate the influence of automatic pre-annotation on the manual part-of-speech annotation of a corpus, in terms of both quality and time, with specific attention to biases. For this purpose, we manually annotated parts of the Penn Treebank corpus under various experimental setups, either from scratch or using various pre-annotations. These experiments confirm and detail the previously observed gain in quality, while showing that biases do appear and should be taken into account. Finally, they demonstrate that even a not-so-accurate tagger can help improve annotation speed.





Authors: Karën Fort, Benoît Sagot

Source: https://hal.archives-ouvertes.fr/


