QUERY PROCESSING AND OPTIMIZATION IN MULTIPROCESSOR ENVIRONMENT | ScholarBank@NUS

Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/183137

Title:	QUERY PROCESSING AND OPTIMIZATION IN MULTIPROCESSOR ENVIRONMENT
Authors:	TAN KIAN LEE
Issue Date:	1992
Citation:	TAN KIAN LEE (1992). QUERY PROCESSING AND OPTIMIZATION IN MULTIPROCESSOR ENVIRONMENT. ScholarBank@NUS Repository.
Abstract:	This thesis explores two levels of parallelism on the costly join operation in database query processing: (1) intra-join parallelism, where several processors are assigned to a join operation to parallelize it and (2) inter-join parallelism, where several joins may be concurrently executed. While intra-join parallelism is employed to develop efficient parallel join algorithms, both levels of parallelism are used to generate parallel execution plans for multi-way join queries. The problems are studied for a general purpose multiprocessor computer with shared memory. For such computers, the amount of memory allocated to the join operation is proportional to the number of processors assigned to the operation. Furthermore, all processors participating in the join operation have access to the memory allocated for the operation. The system also permits concurrent read from but exclusive write to the shared memory. A locking mechanism is used to regulate conflicting writes. To explore intra-join parallelism, four variants of the traditional hash-based join algorithms are proposed and studied - hybrid hash-join, modified hybrid hash-join, simple hash-join and hash-based nested-loops join. These algorithms are different from those of shared-nothing architectures in that a global hash table is built in shared memory. The concurrent update and access to this global hash table is studied. To study the performance of the algorithms, the elapsed time and total processing time are modeled analytically. An important feature of the analysis is that the overlap of computations and disk transfers is modeled in the analysis of the elapsed time. The results indicate that, hybrid hash-join that outperforms oilier hash-based algorithms in uniprocessor systems does not always performs the best. A simpler algorithm, hash-based nested-loops join, performs better in terms of elapsed time when both the relations are of similar sizes. For multi-way join queries, the query execution plans that explore both intra- and inter-join parallelism should have flat and bushy structure. To generate such plans, a greedy algorithm, Greedy Parallel (GP), is proposed. Algorithm GP employs a cost function to select as many pairs of relations to be joined in parallel as possible and repeats this process until the final join result is obtained. The optimization process determines the order and method in which each join should be performed, the number of joins should be executed in parallel, and the number of processors should be allocated to each join. Four criteria are considered in the selection process for the pairs to be joined. Our study shows that all the four criteria are relatively close in performance. Two functions - total cost evaluation and partial cost evaluation - are also studied as heuristics in searching for optimal parallel plans. The study shows that algorithm GP used in the optimizer usually generate optimal or near-optimal plans when the number of joins is relatively small and the optimization overhead is much lesser compared to the exhaustive search approach where all possible plans are considered. The total cost evaluation function provides a better heuristic for algorithm GP by generating better plans but at a higher computational cost than the partial cost evaluation function. Even when the number of joins increases, the algorithm still gives reasonably good performance.
URI:	https://scholarbank.nus.edu.sg/handle/10635/183137
Appears in Collections:	Master's Theses (Restricted)

Show full item record

Files in This Item:

File	Description	Size	Format	Access Settings	Version
b19521157.pdf		4.4 MB	Adobe PDF	RESTRICTED	None	Log In

Google Scholar^TM

Check

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.