Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/16466
Title: Progressive Query Processing
Authors: TOK WEE HYONG
Keywords: Data Streams, Progressive Joins, Approximate results, XML, High-dimensional
Issue Date: 21-Jan-2009
Citation: TOK WEE HYONG (2009-01-21). Progressive Query Processing. ScholarBank@NUS Repository.
Abstract: Many join processing techniques for data streams have been proposed, but they are often designed for a specific data model (e.g. relational) and cannot be easily generalized to other data models. An important criterion for supporting interactivity and ensuring a good user experience is the progressive production of results (if any) whenever data arrives. In our work, we focus on progressive join processing over data streams with limited memory. In the first problem, we focus on progressive join processing techniques that can be generalized easily to different data models. The problem is motivated by the observation that existing progressive join processing techniques are mostly designed for relational data streams; thus, new progressive join processing techniques often have to be proposed for new data models. We propose a generic framework for progressive join processing, called the Result Rate based Progressive Join (RRPJ) framework. The RRPJ framework offers several advantages. Firstly, it can be generalized to handle non-relational data models (e.g. high-dimensional, spatial, XML). Secondly, it does not require a local uniformity assumption in each of the data partitions. Based on the RRPJ framework, we examine instantiations for four data models: relational, spatial, high-dimensional and XML data. In the second problem, we focus on progressive, approximate join processing. This is motivated by the observation that, due to the infinite nature of data streams, users do not need the complete results; an approximate result is often sufficient. Users expect the approximate results to be either the largest possible or the most representative (or both) given the resources available. We examine the tradeoffs between maximizing the result quantity and quality, and propose a stratified sampling-based progressive approximate join algorithm.
In the third problem, we focus on progressive, approximate join processing over sliding windows of data. We propose a sampling-based framework for sliding-window joins over data streams. We present both empirical and theoretical analyses for each of the sliding-window sampling techniques.
URI: http://scholarbank.nus.edu.sg/handle/10635/16466
Appears in Collections: Ph.D Theses (Open)

Files in This Item:
File: phd-thesis.pdf
Size: 1.99 MB
Format: Adobe PDF
Access Settings: OPEN
Version: None

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.