Wrapping Web Pages into XML Documents: A Practical Experience and Comparison of Two Tools





1 CSIRO CMIS - CSIRO Mathematical and Information Sciences

Abstract: The notion of wrapping a web server to produce XML documents from unstructured web pages is driven by the need for structured data that can be used by a variety of applications. The web contains vast amounts of information that most applications cannot use because it targets a human audience. One solution is to automate the browsing process and convert the extracted unstructured information into a more structured format such as XML. This is called wrapping. We have used two different tools to wrap several tourist sites into XML: Norfolk, a system developed by the CSIRO TED group, and W4F, initially developed at the University of Pennsylvania and now a commercial product. This report describes our practical experience with the tools and compares them. The comparison highlights features required by a wrapper system to support real applications.
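
As a rough illustration of what wrapping means in practice, the sketch below turns a small HTML fragment into an XML document using only the Python standard library. It is not how Norfolk or W4F work; the sample page, the element names (hotels, hotel, name, price), and the HotelExtractor and wrap helpers are all hypothetical and chosen only for this example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical sample page; not taken from any site used in the report.
SAMPLE_HTML = """
<html><body>
  <h1>Harbour View Hotel</h1>
  <p class="price">120</p>
  <h1>Bush Lodge</h1>
  <p class="price">85</p>
</body></html>
"""


class HotelExtractor(HTMLParser):
    """Collects name/price records from <h1> and <p class="price"> elements."""

    def __init__(self):
        super().__init__()
        self.field = None   # field currently being read, if any
        self.records = []   # one dict per hotel, in page order

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            # Each heading is assumed to start a new hotel record.
            self.field = "name"
            self.records.append({"name": "", "price": ""})
        elif tag == "p" and dict(attrs).get("class") == "price":
            self.field = "price"

    def handle_endtag(self, tag):
        if tag in ("h1", "p"):
            self.field = None

    def handle_data(self, data):
        # Ignore text outside the fields of interest.
        if self.field and self.records:
            self.records[-1][self.field] += data.strip()


def wrap(html):
    """Extract hotel records from html and serialise them as an XML string."""
    parser = HotelExtractor()
    parser.feed(html)
    parser.close()
    root = ET.Element("hotels")
    for rec in parser.records:
        hotel = ET.SubElement(root, "hotel")
        ET.SubElement(hotel, "name").text = rec["name"]
        ET.SubElement(hotel, "price").text = rec["price"]
    return ET.tostring(root, encoding="unicode")


xml_output = wrap(SAMPLE_HTML)
```

The extraction rules here are hard-coded against one page layout, so any change to the page breaks the wrapper. Dedicated wrapper tools exist largely to make such rules easier to specify and maintain.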

Keywords: Wrapper, Semi-structured documents, XML, Web Information





Authors: Sabine Jabbour, Anne-Marie Vercoustre

Source: https://hal.archives-ouvertes.fr/


