“Replication for dependability on virtualized cloud environments”
in Proceedings of the 10th International Workshop on Middleware for Grids, Clouds and e-Science, Montreal, Quebec, Canada, Dec. 2012, pp. 2:1–2:6. http://doi.acm.org/10.1145/2405136.2405138
Abstract: Execution of critical services traditionally requires multiple distinct replicas, supported by independent network and hardware. To operate properly, these services often depend on the correctness of a fraction of the replicas, usually more than 2/3 or 1/2. Defying the ideal situation, economic reasons may tempt users to replicate critical services onto a single multi-tenant cloud infrastructure. Since this may expose users to correlated failures, we assess the risks for two kinds of majorities: a conventional one, related to the number of replicas, regardless of the machines where they run; and a second one, related to the physical machines where the replicas run. The latter case can arise only in multi-tenant virtualized environments. We evaluate crash-stop and Byzantine faults that may affect virtual machines or physical machines. Contrary to what one might expect, we conclude that replicas do not need to be evenly distributed across a fixed number of physical machines. In fact, we found cases where they should be as unbalanced as possible. We try to systematically identify the best defense for each kind of fault and each kind of majority to preserve.
Research line(s): Timeliness and Adaptation in Dependable Systems (TADS)