Title: Optimization Techniques for Complex Multi-query Applications
Keywords: multiple queries, common subexpression, optimization techniques, performance, RDBMS, MapReduce
Issue Date: 20-Jan-2014
Citation: WANG GUOPING (2014-01-20). Optimization Techniques for Complex Multi-query Applications. ScholarBank@NUS Repository.
Abstract: Many applications involve multiple complex queries that share many common subexpressions (CSEs). As a result, identifying and exploiting the CSEs to improve query performance is essential in these applications. Multiple query optimization (MQO), which aims to identify the CSEs among multiple queries and exploit them to reduce the overall query evaluation cost, has been extensively studied for over two decades. In this thesis, we present novel MQO techniques that are motivated by three new application contexts. Specifically, we study the following three MQO-related problems. First, we study the problem of efficient processing of enumerative set-based queries (SQs). Enumerative SQs aim to find all the sets of entities of interest that meet certain constraints. In this work, we present a novel approach that evaluates enumerative SQs as a collection of cross-product queries (CPQs), and we propose efficient and scalable MQO heuristics to optimize the evaluation of a collection of CPQs. Our experimental results demonstrate that our proposed approach is significantly more efficient than conventional RDBMS methods. To the best of our knowledge, this is the first work that addresses the efficient evaluation of a collection of CPQs. Second, we study multi-query/job optimization techniques and algorithms in the MapReduce framework. In this work, we first propose two new multi-job optimization techniques to share map input scan and map output in the MapReduce paradigm. We then propose a new optimization algorithm that, given an input batch of jobs, produces an optimal plan by a judicious partitioning of the jobs into groups and an optimal assignment of the processing technique to each group. Our experimental results on Hadoop demonstrate the efficiency and effectiveness of our proposed techniques and algorithms in comparison with state-of-the-art techniques and algorithms.
Finally, we examine the optimal join enumeration (OJE) problem, which is a fundamental query optimization task for SQL-like queries, in the MapReduce framework. In this work, we study both the single-query and multi-query OJE problems and propose efficient join enumeration algorithms for these problems. The study of the single-query OJE problem serves as a foundation for the study of the multi-query OJE problem. Our experimental results demonstrate the efficiency of our proposed join enumeration algorithms. To the best of our knowledge, this work presents the first systematic study of the OJE problem in the MapReduce paradigm.
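As a rough illustration of what "sharing map input scan" can mean for a batch of jobs, the sketch below groups jobs by the input they read, scans each input once, and feeds every record to all map functions in that group. It is a minimal in-memory toy, not the thesis's techniques or its optimization algorithm, and it does not use Hadoop. The function name `run_batch_with_shared_scan`, the job tuples, and the sample datasets are all hypothetical and introduced only for this example.

```python
from collections import defaultdict


def run_batch_with_shared_scan(jobs, datasets):
    """Run the map phase of a batch of jobs, scanning each input only once.

    jobs:     list of (job_name, input_name, map_fn) tuples (hypothetical format);
              map_fn takes one record and returns a list of (key, value) pairs.
    datasets: dict mapping input_name to a list of records.
    Returns (outputs, scans): per-job map output and the number of input scans.
    """
    # Group jobs by the input they read so they can share one scan of it.
    jobs_by_input = defaultdict(list)
    for job_name, input_name, map_fn in jobs:
        jobs_by_input[input_name].append((job_name, map_fn))

    outputs = {job_name: [] for job_name, _, _ in jobs}
    scans = 0
    for input_name, group in jobs_by_input.items():
        scans += 1  # one pass over this input, regardless of group size
        for record in datasets[input_name]:
            # Apply every job's map function to the record just read.
            for job_name, map_fn in group:
                outputs[job_name].extend(map_fn(record))
    return outputs, scans


# Illustrative data only: two jobs read "logs", one reads "titles".
datasets = {"logs": ["a b", "c"], "titles": ["x"]}
jobs = [
    ("count_words", "logs", lambda r: [(w, 1) for w in r.split()]),
    ("line_lengths", "logs", lambda r: [("len", len(r))]),
    ("upper_titles", "titles", lambda r: [(r.upper(), 1)]),
]
outputs, scans = run_batch_with_shared_scan(jobs, datasets)
```

In this toy batch, three jobs need only two input scans because the two jobs over "logs" share one pass. Whether grouping actually pays off in a real system depends on costs this sketch ignores, such as the size of the combined map output, which is why the abstract frames the partitioning of jobs into groups as an optimization problem.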
Appears in Collections: Ph.D Theses (Open)

Files in This Item:
File: thesis-wangguoping.pdf
Size: 1.42 MB
Format: Adobe PDF





Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.