Thesis Proposal
"Pulse: Database Support for Efficient Query Processing of Temporal Polynomial and Differential Equation Models"
Yanif Ahmad
Monday, July 14, 2008 at 10:30 A.M.
Room 368 (3rd Floor CIT)
This thesis investigates the practicality and utility of mathematical models to represent continuous and occasionally unavailable data stream attributes, and processing relational-style queries in a stream processing engine directly on these models. We present Pulse, a framework for processing continuous queries over stream attributes modelled as piecewise polynomial functions and, as part of our proposed work, differential equations. Pulse represents queries as simultaneous equation systems for a variety of relational operators including filters, joins and standard aggregates. We use piecewise polynomials to provide a compact, approximate representation of the input dataset and provide query language extensions for users to specify precision bounds to control this approximation. We have implemented Pulse on top of the Borealis stream processing engine and evaluated it on two realworld datasets from financial and moving object applications. Pulse is able to achieve significant performance improvements by processing queries directly on the mathematical representation of these polynomials and by exploiting precision bounds, in comparison to standard tuple-based stream processing.
The second part of this thesis presents a selectivity estimator and a multi-query optimizer in the unique context of our query processing. Our selectivity estimator works with piecewise polynomials, computing estimates with histograms defined on a parameter space of the polynomials. Our multi-query optimizer uses these selectivities to determine when to construct a global query plan that shares work across individual queries, and is implemented as an adaptive algorithm that locally adapts a global query.
In the final part of this thesis, we propose to study differential equation models, and their analytic and numeric solution. In some cases numerical solutions are represented as interpolating polynomials, enabling reuse of Pulse's query processing capabilities for polynomials. Furthermore numerical techniques produce errors in their solution, which apply in addition to the error between model and tuples. We allow users to declaratively specify precision bounds on query results and determine the error bound for solving any differential equations under-the-hood. Finally we discuss an application, the Black-Scholes PDE, which is used to model option prices in financial applications and often processed through model parameter sensitivity analysis queries.
Host: Ugur Cetintemel
| Page Owner: Webmaster | Last Modified: Wed Jun 18 14:23:46 2008 |