“On the Design of Resilient Multicloud MapReduce”
IEEE Cloud Computing, 2017.
Abstract: MapReduce is a popular distributed data-processing system for analyzing big data in cloud environments. This platform is often used for critical data processing, e.g., in the context of scientific or financial simulation. Unfortunately, there is accumulating evidence of severe problems – including arbitrary faults and cloud outages – affecting the services that run atop cloud services. Faced with this challenge, we have recently explored multicloud solutions to increase the resilience and availability of MapReduce. Based on this experience, we present system design guidelines for scaling out MapReduce computation to multiple clouds in order to tolerate arbitrary and malicious faults, as well as cloud outages. Crucially, the techniques we introduce have reasonable cost and do not require changes to MapReduce or to the users’ code, enabling immediate deployment.
Research line(s): Fault and Intrusion Tolerance in Open Distributed Systems (FIT)