Techniques

  1. System Architecture:

    As stated in the project overview, the CLARO system involves two processes: (1) data capturing and transformation, and (2) relational processing under uncertainty. The figure below shows an overview of the architecture of CLARO, an uncertainty-aware stream system. The T operators transform raw data streams into queryable data streams with quantified uncertainty. The A and J operators are examples of relational operators, namely aggregates and joins, respectively. These operators process data modelled by continuous random variables. The final results can be characterized by confidence regions or by other statistics such as means and variances.

    [Figure: CLARO system architecture]
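
    The dataflow described in this section can be illustrated with a minimal sketch. It is not CLARO code: the names UncertainValue, t_operator, a_operator_sum and confidence_interval are hypothetical, and the sketch assumes that every attribute is Gaussian and that tuples are independent, which is a much simpler setting than the general continuous random variables the system targets.

```python
import statistics
from dataclasses import dataclass
from statistics import NormalDist


@dataclass(frozen=True)
class UncertainValue:
    """A tuple attribute modelled as a Gaussian random variable (assumption)."""
    mu: float
    var: float


def t_operator(raw_readings):
    """Hypothetical T operator: summarise repeated noisy readings of one
    quantity by the sample mean and the variance of that mean."""
    n = len(raw_readings)
    return UncertainValue(statistics.mean(raw_readings),
                          statistics.variance(raw_readings) / n)


def a_operator_sum(values):
    """Hypothetical A operator: SUM over independent Gaussian attributes.
    Means add and, under the independence assumption, variances add."""
    return UncertainValue(sum(v.mu for v in values),
                          sum(v.var for v in values))


def confidence_interval(value, level=0.95):
    """Central confidence interval of a Gaussian result (requires var > 0)."""
    dist = NormalDist(value.mu, value.var ** 0.5)
    tail = (1.0 - level) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


if __name__ == "__main__":
    # Two raw streams, each with a few noisy readings of one quantity.
    tuples = [t_operator([10.0, 12.0]), t_operator([20.0, 22.0, 24.0])]
    total = a_operator_sum(tuples)
    print(total, confidence_interval(total))
```

    In this sketch each tuple carries only a mean and a variance, so the aggregate result can be reported either as those two statistics or as a confidence interval derived from them.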

  2. Data Capturing and Transformation:

    Raw streams may not present data in a format suitable for query processing, and they can be highly noisy. This project therefore employs probabilistic models of the underlying data generation process, together with machine learning techniques, to efficiently transform raw data into the desired representation with an uncertainty metric. The following figure shows a graphical model built for the RFID application.

    [Figure: graphical model for the RFID application]
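
    As a rough illustration of turning noisy raw readings into a value with quantified uncertainty, the sketch below applies Bayes' rule over a discrete set of candidate tag locations. It is not the graphical model used in CLARO: the sensor model in detection_prob, its parameters peak and scale, and the one-dimensional layout are all assumptions made for this example.

```python
import math


def detection_prob(tag_pos, reader_pos, peak=0.9, scale=2.0):
    """Assumed sensor model: the read rate decays exponentially with the
    distance between tag and reader (illustrative, not calibrated)."""
    return peak * math.exp(-abs(tag_pos - reader_pos) / scale)


def posterior_over_locations(candidates, observations):
    """Posterior over candidate tag locations given raw RFID readings.

    observations is a list of (reader_pos, detected) pairs. The prior is
    uniform and readings are treated as conditionally independent given
    the true location.
    """
    weights = []
    for x in candidates:
        likelihood = 1.0
        for reader_pos, detected in observations:
            p = detection_prob(x, reader_pos)
            likelihood *= p if detected else 1.0 - p
        weights.append(likelihood)
    total = sum(weights)
    return [w / total for w in weights]


def summarise(candidates, posterior):
    """Mean and variance of the tag location under the posterior."""
    mu = sum(x * p for x, p in zip(candidates, posterior))
    var = sum((x - mu) ** 2 * p for x, p in zip(candidates, posterior))
    return mu, var


if __name__ == "__main__":
    candidates = [float(i) for i in range(11)]
    # Readers at 3.0 and 4.0 saw the tag; the reader at 9.0 did not.
    readings = [(3.0, True), (4.0, True), (9.0, False)]
    post = posterior_over_locations(candidates, readings)
    print(summarise(candidates, post))
```

    The output of such a transformation is a location estimate together with a variance, which is the kind of uncertainty metric a downstream query operator could consume.
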
  3. Relational Processing under Uncertainty:

    To efficiently quantify the result uncertainty of a query operator, CLARO explores various techniques based on probability and statistical theory. These techniques reduce the statistics that data streams need to carry and use approximation to expedite the computation of result distributions. Examples of the techniques applied are characteristic functions and regression.
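
    To show why characteristic functions help with result distributions, the sketch below uses a standard fact: the characteristic function of a sum of independent random variables is the product of their characteristic functions, so the density of a SUM can be recovered by one numerical inversion. This is a generic illustration rather than CLARO's algorithm; the function names, the truncation bound t_max and the step count are assumptions chosen for the example.

```python
import cmath
import math


def gaussian_cf(mean, std):
    """Characteristic function of N(mean, std^2)."""
    return lambda t: cmath.exp(1j * mean * t - 0.5 * (std * t) ** 2)


def uniform_cf(a, b):
    """Characteristic function of Uniform(a, b)."""
    def cf(t):
        if t == 0.0:
            return 1.0 + 0j
        return (cmath.exp(1j * t * b) - cmath.exp(1j * t * a)) / (1j * t * (b - a))
    return cf


def sum_cf(cfs):
    """CF of a sum of independent variables: the product of their CFs."""
    def cf(t):
        value = 1.0 + 0j
        for f in cfs:
            value *= f(t)
        return value
    return cf


def pdf_from_cf(cf, x, t_max=200.0, steps=20000):
    """Approximate the density at x by numerically inverting the CF.

    f(x) = (1 / (2*pi)) * integral of exp(-i*t*x) * cf(t) dt, truncated to
    [-t_max, t_max] and evaluated with the midpoint rule.
    """
    dt = 2.0 * t_max / steps
    total = 0.0
    for k in range(steps):
        t = -t_max + (k + 0.5) * dt
        total += (cmath.exp(-1j * t * x) * cf(t)).real
    return total * dt / (2.0 * math.pi)


if __name__ == "__main__":
    # Density of the sum of two independent Uniform(0, 1) variables at
    # x = 1, which should be close to 1 (the peak of the triangular density).
    total = sum_cf([uniform_cf(0.0, 1.0), uniform_cf(0.0, 1.0)])
    print(pdf_from_cf(total, 1.0))
```

    Truncating the integral is itself an approximation: the error shrinks as t_max grows, and it is smallest for distributions whose characteristic functions decay quickly, such as Gaussians.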