Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/77892
DC Field: Value
dc.title: Multiquery optimization in mapreduce framework
dc.contributor.author: Wang, G.
dc.contributor.author: Chan, C.Y.
dc.date.accessioned: 2014-07-04T03:10:01Z
dc.date.available: 2014-07-04T03:10:01Z
dc.date.issued: 2013-11
dc.identifier.citation: Wang, G., Chan, C.Y. (2013-11). Multiquery optimization in mapreduce framework. Proceedings of the VLDB Endowment 7 (3) : 145-156. ScholarBank@NUS Repository.
dc.identifier.issn: 21508097
dc.identifier.uri: http://scholarbank.nus.edu.sg/handle/10635/77892
dc.description.abstract: MapReduce has recently emerged as a new paradigm for large-scale data analysis due to its high scalability, fine-grained fault tolerance and easy programming model. Since different jobs often share similar work (e.g., several jobs scan the same input file or produce the same map output), there are many opportunities to optimize the performance for a batch of jobs. In this paper, we propose two new techniques for multi-job optimization in the MapReduce framework. The first is a generalized grouping technique (which generalizes the recently proposed MRShare technique) that merges multiple jobs into a single job thereby enabling the merged jobs to share both the scan of the input file as well as the communication of the common map output. The second is a materialization technique that enables multiple jobs to share both the scan of the input file as well as the communication of the common map output via partial materialization of the map output of some jobs (in the map and/or reduce phase). Our second contribution is the proposal of a new optimization algorithm that given an input batch of jobs, produces an optimal plan by a judicious partitioning of the jobs into groups and an optimal assignment of the processing technique to each group. Our experimental results on Hadoop demonstrate that our new approach significantly outperforms the state-of-the-art technique, MRShare, by up to 107%. © 2013 VLDB Endowment.
dc.source: Scopus
dc.type: Article
dc.contributor.department: COMPUTER SCIENCE
dc.description.sourcetitle: Proceedings of the VLDB Endowment
dc.description.volume: 7
dc.description.issue: 3
dc.description.page: 145-156
dc.identifier.isiut: NOT_IN_WOS
Appears in Collections: Staff Publications

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.