Study on Text Information Extraction Model and Algorithm of HTML Documents





This article improves an automatic method for extracting Web data from HTML documents. The method extracts structured data from unstructured information on the Web. The article makes three contributions. First, it analyzes the method used by the EXALG system and identifies its problems. Second, it presents an improved EXALG system. Third, it examines the advantages of the new system in precision and completeness, using the data sources and experimental results published by the author of EXALG.
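As a rough illustration of the equivalence-class idea listed in the keywords, the sketch below groups tokens that occur the same number of times in every input page; such groups tend to belong to the shared page template rather than to the data. This is a simplified illustration and not the authors' improved algorithm: the tokenizer, the function names, and the sample pages are all assumptions, and the step in which EXALG distinguishes tokens by their context is omitted.

```python
import re
from collections import Counter, defaultdict

# Hypothetical tokenizer: an HTML tag is one token, and so is each
# whitespace-delimited run of text between tags.
TOKEN_RE = re.compile(r"<[^>]+>|[^<\s]+")


def tokenize(page):
    """Split an HTML string into tag tokens and word tokens."""
    return TOKEN_RE.findall(page)


def equivalence_classes(pages):
    """Group tokens by occurrence vector.

    The occurrence vector of a token is the tuple of its counts in each
    page. Tokens with identical vectors fall into the same class.
    """
    counts = [Counter(tokenize(page)) for page in pages]
    vocabulary = set().union(*counts)
    classes = defaultdict(set)
    for token in vocabulary:
        vector = tuple(c[token] for c in counts)
        classes[vector].add(token)
    return dict(classes)


if __name__ == "__main__":
    # Two made-up pages generated from the same template.
    pages = [
        "<html><b>Title:</b> Dune <b>Price:</b> 9</html>",
        "<html><b>Title:</b> Emma <b>Price:</b> 7</html>",
    ]
    for vector, tokens in sorted(equivalence_classes(pages).items()):
        print(vector, sorted(tokens))
```

In this toy input, the template tokens (`<html>`, `Title:`, `Price:`, and the `<b>` tags) share occurrence vectors across both pages, while data values such as `Dune` and `Emma` each appear in only one page.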

KEYWORDS

Data Extraction; Equivalent Class; Tagging; Document Object Model








Authors: Chunyan Li, Haiyang Jiang

Source: http://www.scirp.org/






