Capturing Data Uncertainty in High-Volume Stream Processing - Computer Science > Databases

Abstract: We present the design and development of a data stream system that captures data uncertainty from data collection to query processing to final result generation. Our system focuses on data that is naturally modeled as continuous random variables. For such data, our system employs an approach grounded in probability and statistical theory to capture data uncertainty and integrates this approach into high-volume stream processing. The first component of our system captures uncertainty of raw data streams from sensing devices. Since such raw streams can be highly noisy and may not carry sufficient information for query processing, our system employs probabilistic models of the data generation process and stream-speed inference to transform raw data into a desired format with an uncertainty metric. The second component captures uncertainty as data propagates through query operators. To efficiently quantify result uncertainty of a query operator, we explore a variety of techniques based on probability and statistical theory to compute the result distribution at stream speed. We are currently working with a group of scientists to evaluate our system using traces collected from the domains of (and eventually in the real systems for) hazardous weather monitoring and object tracking and monitoring.
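
To make the idea of propagating uncertainty through a query operator concrete, the following is a minimal sketch and not the authors' implementation. It assumes each tuple is an independent Gaussian random variable summarized by a mean and a variance, and it computes the exact result distribution of a tumbling-window average under that assumption. The names `GaussianTuple` and `window_average` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class GaussianTuple:
    """A stream tuple whose value is modeled as N(mean, var) (assumption)."""
    mean: float
    var: float


def window_average(stream: Iterable[GaussianTuple], size: int) -> Iterator[GaussianTuple]:
    """Tumbling-window average over independent Gaussian tuples.

    For independent X_1..X_n, the average is Gaussian with
    mean = (1/n) * sum(mean_i) and var = (1/n^2) * sum(var_i).
    """
    window: List[GaussianTuple] = []
    for t in stream:
        window.append(t)
        if len(window) == size:
            n = float(size)
            mean = sum(x.mean for x in window) / n
            var = sum(x.var for x in window) / (n * n)
            yield GaussianTuple(mean, var)
            window = []  # start the next tumbling window


# Usage: four noisy readings, averaged two at a time.
readings = [
    GaussianTuple(10.0, 4.0),
    GaussianTuple(12.0, 4.0),
    GaussianTuple(11.0, 1.0),
    GaussianTuple(13.0, 1.0),
]
results = list(window_average(readings, size=2))
```

Each output tuple carries its own variance, so a downstream operator or the final result can report how uncertain the aggregate is. Operators with no closed-form result distribution (for example, those over correlated or non-Gaussian inputs) would need approximation techniques, which this sketch does not cover.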

Authors: Yanlei Diao U. Massachusetts-Amherst, Boduo Li University of Massachusetts Amherst, Anna Liu UMass Amherst, Liping Peng UMass Amh

