“Byzantine Fault-Tolerant MapReduce: Faults are Not Just Crashes”

From Navigators

(Difference between revisions)
Jump to: navigation, search
 
(3 intermediate revisions not shown)
Line 3: Line 3:
|document=CostaPBC11.pdf
|document=CostaPBC11.pdf
|title=Byzantine Fault-Tolerant MapReduce: Faults are Not Just Crashes
|title=Byzantine Fault-Tolerant MapReduce: Faults are Not Just Crashes
-
|author=Costa, Pedro; Pasin, Marcelo; Bessani, Alysson N. & Correia, Miguel
+
|author=Pedro Costa, Marcelo Pasin, Alysson Bessani, Miguel Correia,
-
|Project=TCLOUDS Trustworthy Clouds Privacy and Resilience for Internet-scale Critical Infrastructure.
+
|Project=Project:TCLOUDS,
|ResearchLine=Fault and Intrusion Tolerance in Open Distributed Systems (FIT)
|ResearchLine=Fault and Intrusion Tolerance in Open Distributed Systems (FIT)
|year=2011
|year=2011
 +
|abstract=MapReduce is often used to run critical jobs such as scientific data analysis. However, evidence in the literature shows that arbitrary faults do occur and can probably corrupt the results of MapReduce jobs. MapReduce runtimes
 +
like Hadoop tolerate crash faults, but not arbitrary or Byzantine faults. We present a MapReduce algorithm and prototype that tolerate these faults. An experimental evaluation shows that the execution of a job with our algorithms uses twice the resources of the original Hadoop, instead of the 3 or 4
 +
times more that would be achieved with the direct application of common Byzantine fault-tolerance paradigms. We believe this cost is acceptable for critical applications that require that level of fault tolerance.
|journal=IEEE Third International Conference on Cloud Computing Technology and Science (CLOUDCOM '11)
|journal=IEEE Third International Conference on Cloud Computing Technology and Science (CLOUDCOM '11)
|address=Athens, Greece
|address=Athens, Greece
 +
|booktitle=Proceedings of the 3rd IEEE International Conference on Cloud Computing and Science - CloudCom’11
}}
}}

Latest revision as of 15:42, 21 January 2013

Pedro Costa, Marcelo Pasin, Alysson Bessani, Miguel Correia

in Proceedings of the 3rd IEEE International Conference on Cloud Computing and Science - CloudCom’11, Athens, Greece, 2011.

Abstract: MapReduce is often used to run critical jobs such as scientific data analysis. However, evidence in the literature shows that arbitrary faults do occur and can probably corrupt the results of MapReduce jobs. MapReduce runtimes like Hadoop tolerate crash faults, but not arbitrary or Byzantine faults. We present a MapReduce algorithm and prototype that tolerate these faults. An experimental evaluation shows that the execution of a job with our algorithms uses twice the resources of the original Hadoop, instead of the 3 or 4 times more that would be achieved with the direct application of common Byzantine fault-tolerance paradigms. We believe this cost is acceptable for critical applications that require that level of fault tolerance.

Download paper

Download Byzantine Fault-Tolerant MapReduce: Faults are Not Just Crashes

Export citation

BibTeX

Project(s): Project:TCLOUDS

Research line(s): Fault and Intrusion Tolerance in Open Distributed Systems (FIT)

Personal tools
Navigators toolbox