Please use this identifier to cite or link to this item:
https://scholarbank.nus.edu.sg/handle/10635/16466
Title: | Progressive Query Processing |
Authors: | TOK WEE HYONG |
Keywords: | Data Streams, Progressive Joins, Approximate results, XML, High-dimensional |
Issue Date: | 21-Jan-2009 |
Citation: | TOK WEE HYONG (2009-01-21). Progressive Query Processing. ScholarBank@NUS Repository. |
Abstract: | Although many join processing techniques for data streams have been proposed, they are often designed for a specific data model (e.g. relational) and cannot be easily generalized to other data models. An important criterion for supporting interactivity and ensuring a good user experience is the progressive production of results (if any) whenever data arrives. In our work, we focus on progressive join processing over data streams with limited memory. In the first problem, we focus on progressive join processing techniques that can be generalized easily to different data models. The problem is motivated by the observation that existing progressive join processing techniques are mostly designed for relational data streams. Thus, new progressive join processing techniques often have to be proposed for new data models. We propose a generic framework for progressive join processing, called the Result Rate based Progressive Join (RRPJ) framework. The RRPJ framework offers several advantages. Firstly, it can be generalized to handle non-relational data models (e.g. high-dimensional, spatial, XML). Secondly, it does not require a local uniformity assumption in each of the data partitions. Based on the RRPJ framework, we examine instantiations for four data models: relational, spatial, high-dimensional and XML data. In the second problem, we focus on progressive, approximate join processing. This is motivated by the observation that, due to the infinite nature of data streams, users do not need the complete results; an approximate result is often sufficient. Users expect the approximate results to be either the largest possible or the most representative (or both) given the resources available. We examine the tradeoffs between maximizing result quantity and result quality, and propose a stratified sampling-based progressive approximate join algorithm. In the third problem, we focus on progressive, approximate join processing over a sliding window of data. We propose a sampling-based framework for sliding window joins over data streams. We present both empirical and theoretical analysis for each of the sliding-window sampling techniques. |
URI: | http://scholarbank.nus.edu.sg/handle/10635/16466 |
Appears in Collections: | Ph.D Theses (Open) |
Files in This Item:
File | Description | Size | Format | Access Settings | Version | |
---|---|---|---|---|---|---|
phd-thesis.pdf | | 1.99 MB | Adobe PDF | OPEN | None | View/Download |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.