Please use this identifier to cite or link to this item:
https://scholarbank.nus.edu.sg/handle/10635/77892
DC Field | Value | |
---|---|---|
dc.title | Multiquery optimization in MapReduce framework | |
dc.contributor.author | Wang, G. | |
dc.contributor.author | Chan, C.Y. | |
dc.date.accessioned | 2014-07-04T03:10:01Z | |
dc.date.available | 2014-07-04T03:10:01Z | |
dc.date.issued | 2013-11 | |
dc.identifier.citation | Wang, G., Chan, C.Y. (2013-11). Multiquery optimization in MapReduce framework. Proceedings of the VLDB Endowment 7 (3) : 145-156. ScholarBank@NUS Repository. | |
dc.identifier.issn | 2150-8097 | |
dc.identifier.uri | http://scholarbank.nus.edu.sg/handle/10635/77892 | |
dc.description.abstract | MapReduce has recently emerged as a new paradigm for large-scale data analysis due to its high scalability, fine-grained fault tolerance and easy programming model. Since different jobs often share similar work (e.g., several jobs scan the same input file or produce the same map output), there are many opportunities to optimize the performance for a batch of jobs. In this paper, we propose two new techniques for multi-job optimization in the MapReduce framework. The first is a generalized grouping technique (which generalizes the recently proposed MRShare technique) that merges multiple jobs into a single job thereby enabling the merged jobs to share both the scan of the input file as well as the communication of the common map output. The second is a materialization technique that enables multiple jobs to share both the scan of the input file as well as the communication of the common map output via partial materialization of the map output of some jobs (in the map and/or reduce phase). Our second contribution is the proposal of a new optimization algorithm that given an input batch of jobs, produces an optimal plan by a judicious partitioning of the jobs into groups and an optimal assignment of the processing technique to each group. Our experimental results on Hadoop demonstrate that our new approach significantly outperforms the state-of-the-art technique, MRShare, by up to 107%. © 2013 VLDB Endowment. | |
dc.source | Scopus | |
dc.type | Article | |
dc.contributor.department | COMPUTER SCIENCE | |
dc.description.sourcetitle | Proceedings of the VLDB Endowment | |
dc.description.volume | 7 | |
dc.description.issue | 3 | |
dc.description.page | 145-156 | |
dc.identifier.isiut | NOT_IN_WOS | |
Appears in Collections: | Staff Publications |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.