Uncertain data streams, in which data is incomplete, imprecise, and even misleading, have been observed in a variety of environments. In many cases, the raw data collected is not directly queryable and hence needs to undergo sophisticated query processing to derive useful high-level information. Moreover, feeding uncertain data streams directly into existing stream systems can produce results of unknown quality. The goal of this project is to design and develop a stream processing system that captures data uncertainty from data collection through query processing to final result generation. The project takes a principled approach, grounded in probability and statistical theory, that supports uncertainty as a first-class citizen and integrates efficiently into high-volume stream processing. Specifically, the project has the following two main contributions:

  1. To capture uncertainty of raw data streams emanating from sensing devices.
  2. To capture uncertainty as data propagates through various query processing operators.
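As a rough illustration of the two points above, the sketch below attaches a distribution to each raw reading and then carries that distribution through a single query operator. It is not the project's actual design: the `UncertainReading` type, the `from_sensor` and `uncertain_sum` helpers, the Gaussian noise model, and the independence assumption are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class UncertainReading:
    """Hypothetical stream tuple: a value modeled as a Gaussian."""
    mean: float
    variance: float


def from_sensor(raw: float, noise_std: float) -> UncertainReading:
    # Step 1 (assumed model): treat the raw reading as the mean and the
    # device's noise level as the standard deviation.
    return UncertainReading(mean=raw, variance=noise_std ** 2)


def uncertain_sum(window: Iterable[UncertainReading]) -> UncertainReading:
    # Step 2: a SUM aggregate over a window. For independent Gaussians,
    # the means add and the variances add.
    readings = list(window)
    return UncertainReading(
        mean=sum(r.mean for r in readings),
        variance=sum(r.variance for r in readings),
    )


window = [from_sensor(20.0, 0.5), from_sensor(21.0, 0.5), from_sensor(19.0, 1.0)]
total = uncertain_sum(window)
print(total)
```

The result is itself an `UncertainReading`, so a downstream operator would receive a distribution rather than a bare number. Correlated readings or non-Gaussian noise would need a different propagation rule than the one assumed here.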

Funding Sources

We gratefully acknowledge the funding provided by the following agency:

National Science Foundation.

  1. CAREER: Efficient, Robust RFID Stream Processing for Tracking and Monitoring. Yanlei Diao (PI). National Science Foundation IIS-0746939.
  2. III-COR-small: Capturing Data Uncertainty in High-Volume Stream Processing. Yanlei Diao (PI) and Anna Liu (co-PI). National Science Foundation IIS-0812347.

Last Update: July 2014