Harnessing the Deep Web: Present and Future - Computer Science > Databases





Abstract: Over the past few years, we have built a system that has exposed large volumes of Deep-Web content to Google.com users. The content that our system exposes contributes to more than 1000 search queries per-second and spans over 50 languages and hundreds of domains. The Deep Web has long been acknowledged to be a major source of structured data on the web, and hence accessing Deep-Web content has long been a problem of interest in the data management community. In this paper, we report on where we believe the Deep Web provides value and where it does not. We contrast two very different approaches to exposing Deep-Web content - the surfacing approach that we used, and the virtual integration approach that has often been pursued in the data management literature. We emphasize where the values of each of the two approaches lie and caution against potential pitfalls. We outline important areas of future research and, in particular, emphasize the value that can be derived from analyzing large collections of potentially disparate structured data on the web.



Authors: Jayant Madhavan Google Inc., Loredana Afanasiev Universiteit van Amsterdam, Lyublena Antova Cornell University, Alon Halevy Googl

Source: https://arxiv.org/


