Please use this identifier to cite or link to this item:
https://doi.org/10.1109/IPDPS.2012.108
Title: | Meteor shower: A reliable stream processing system for commodity data centers |
Authors: | Wang, H.; Peh, L.-S.; Koukoumidis, E.; Tao, S.; Chan, M.C. |
Keywords: | fault tolerance; reliability; stream computing |
Issue Date: | 2012 |
Citation: | Wang, H., Peh, L.-S., Koukoumidis, E., Tao, S., Chan, M.C. (2012). Meteor shower: A reliable stream processing system for commodity data centers. Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium, IPDPS 2012: 1180-1191. ScholarBank@NUS Repository. https://doi.org/10.1109/IPDPS.2012.108 |
Abstract: | Large-scale failures are commonplace in commodity data centers, the major platforms for Distributed Stream Processing Systems (DSPSs). Yet most DSPSs can only handle single-node failures. Here, we propose Meteor Shower, a new fault-tolerant DSPS that overcomes large-scale burst failures while improving overall performance. Meteor Shower is based on checkpoints. Unlike previous schemes, Meteor Shower orchestrates operators' checkpointing activities through tokens. The tokens originate from source operators and trickle down the stream graph, triggering each operator that receives them to checkpoint its own state. Meteor Shower is a suite of three new techniques: 1) source preservation, 2) parallel, asynchronous checkpointing, and 3) application-aware checkpointing. Source preservation allows Meteor Shower to avoid the overhead of redundant tuple saving in prior schemes; parallel, asynchronous checkpointing enables Meteor Shower operators to continue processing streams during a checkpoint; and application-aware checkpointing lets Meteor Shower learn the changing pattern of operators' state size and initiate checkpoints only when the state size is minimal. All three techniques together enable Meteor Shower to improve throughput by 226% and lower latency by 57% versus the prior state of the art. Our results were measured on a prototype implementation running three real-world applications in the Amazon EC2 cloud. © 2012 IEEE. |
Source Title: | Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium, IPDPS 2012 |
URI: | http://scholarbank.nus.edu.sg/handle/10635/42049 |
ISBN: | 9780769546759 |
DOI: | 10.1109/IPDPS.2012.108 |
Appears in Collections: | Staff Publications |
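
The token mechanism described in the abstract (tokens start at source operators, trickle down the stream graph, and trigger a checkpoint at each operator they reach) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Operator` class, the `propagate_tokens` and `build_demo_graph` helpers, the running-sum state, and the rule that an operator with several inputs waits for a token on every input edge before checkpointing are all assumptions made here for illustration.

```python
from collections import deque


class Operator:
    """Hypothetical stream operator whose state is a running sum of tuples."""

    def __init__(self, name, num_inputs):
        self.name = name
        self.num_inputs = num_inputs  # number of upstream edges (0 for a source)
        self.downstream = []          # operators that consume this one's output
        self.state = 0                # toy state: sum of all tuples seen so far
        self.tokens_seen = 0          # tokens received in the current round
        self.checkpoints = []         # saved copies of the state

    def on_tuple(self, value):
        """Process one data tuple by folding it into the state."""
        self.state += value

    def on_token(self):
        """Count one token; return True once the operator has checkpointed."""
        self.tokens_seen += 1
        # Assumption: wait for a token on every input edge before saving state.
        if self.tokens_seen < max(self.num_inputs, 1):
            return False
        self.checkpoints.append(self.state)
        self.tokens_seen = 0
        return True


def propagate_tokens(sources):
    """Inject one token at each source and let it trickle down the graph.

    Returns the operator names in the order in which they checkpointed.
    """
    order = []
    queue = deque(sources)
    while queue:
        op = queue.popleft()
        if op.on_token():
            order.append(op.name)
            queue.extend(op.downstream)  # forward the token on every output edge
    return order


def build_demo_graph():
    """Build a toy graph: two sources feed a join, which feeds a sink."""
    src_a, src_b = Operator("src_a", 0), Operator("src_b", 0)
    join, sink = Operator("join", 2), Operator("sink", 1)
    src_a.downstream = [join]
    src_b.downstream = [join]
    join.downstream = [sink]
    return src_a, src_b, join, sink


if __name__ == "__main__":
    src_a, src_b, join, sink = build_demo_graph()
    for value in (1, 2, 3):
        join.on_tuple(value)
    print(propagate_tokens([src_a, src_b]))
    print(join.checkpoints)
```

The sketch runs sequentially in a single process, so it shows only the ordering of checkpoints along the graph. It does not model the parallel, asynchronous checkpointing, source preservation, or application-aware scheduling that the abstract lists as separate techniques.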