Expanding lexicons by inducing paradigms and validating attested forms





1 Inria Saclay - Ile de France

Abstract: One of the bottlenecks in Natural Language Processing (NLP) for a given language is creating a lexicon that covers the language. A morphological lexicon provides two important pieces of information for NLP applications: 1) the normalization of a word (its lemmatization), which allows the application to recognize two variants of the same word; and 2) the part-of-speech roles that the word can play, which allow the application to parse the text and create relations between its words. Many NLP applications, such as information retrieval, classification, and terminology extraction, depend upon the normalization and parsing information found in lexicons. When words are not present in these lexicons, it is difficult to predict their proper lemmatizations and parts of speech. In this paper we present a technique for updating a lexicon when an unknown word is encountered: paradigms are induced from an existing, but incomplete, lexicon, and the chosen paradigm is validated using corpus evidence.
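The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea under simplifying assumptions, not the authors' implementation. It assumes a paradigm can be represented as a set of suffix-substitution rules learned from (form, lemma, part-of-speech) triples, and that "corpus evidence" means counting how many generated forms appear in a corpus vocabulary. All function names, the toy lexicon, and the toy vocabulary are hypothetical.

```python
from collections import defaultdict


def suffix_rule(lemma, form):
    """Strip the longest common prefix; return (lemma_suffix, form_suffix)."""
    i = 0
    while i < min(len(lemma), len(form)) and lemma[i] == form[i]:
        i += 1
    return lemma[i:], form[i:]


def induce_paradigms(lexicon):
    """Collect, per lemma, its set of (lemma_suffix, form_suffix, pos) rules.

    Lemmas sharing the same rule set share a paradigm; the returned dict maps
    each paradigm (a frozenset of rules) to the number of lemmas using it.
    """
    rules_by_lemma = defaultdict(set)
    for form, lemma, pos in lexicon:
        lemma_suffix, form_suffix = suffix_rule(lemma, form)
        rules_by_lemma[lemma].add((lemma_suffix, form_suffix, pos))
    paradigms = defaultdict(int)
    for rules in rules_by_lemma.values():
        paradigms[frozenset(rules)] += 1
    return dict(paradigms)


def apply_paradigm(lemma, rules):
    """Generate {(form, pos)} for a lemma, or None if a rule does not fit."""
    forms = set()
    for lemma_suffix, form_suffix, pos in rules:
        if not lemma.endswith(lemma_suffix):
            return None
        stem = lemma[:len(lemma) - len(lemma_suffix)]
        forms.add((stem + form_suffix, pos))
    return forms


def propose_entry(word, paradigms, corpus_vocab):
    """Return (lemma, forms, hits) for the best-attested hypothesis, or None."""
    best = None
    for rules in paradigms:
        for lemma_suffix, form_suffix, _pos in rules:
            if not word.endswith(form_suffix):
                continue
            # Hypothesize the lemma by undoing this rule on the unknown word.
            lemma = word[:len(word) - len(form_suffix)] + lemma_suffix
            forms = apply_paradigm(lemma, rules)
            if forms is None:
                continue
            # Validation step: count distinct generated forms seen in the corpus.
            hits = len({f for f, _ in forms if f in corpus_vocab})
            if best is None or hits > best[2]:
                best = (lemma, forms, hits)
    return best


# Hypothetical toy data, for illustration only.
toy_lexicon = [
    ("walk", "walk", "VERB"), ("walks", "walk", "VERB"),
    ("walked", "walk", "VERB"), ("walking", "walk", "VERB"),
    ("bake", "bake", "VERB"), ("bakes", "bake", "VERB"),
    ("baked", "bake", "VERB"), ("baking", "bake", "VERB"),
]
toy_vocab = {"tweet", "tweets", "tweeted", "tweeting"}

lemma, forms, hits = propose_entry(
    "tweeted", induce_paradigms(toy_lexicon), toy_vocab
)
print(lemma, hits)
```

On this toy data, the "walk"-style paradigm with lemma "tweet" wins because all four of its generated forms are attested, whereas the "bake"-style hypothesis (lemma "tweete") is supported by only two. A realistic system would also need to handle ties, stem alternations, and noisy corpus counts, none of which this sketch attempts.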

Keywords: natural language processing, lexicography, computational linguistics, dictionary, lexicon





Authors: Gregory Grefenstette, Yan Qu, David Evans

Source: https://hal.archives-ouvertes.fr/






