“Byzantine State Machine Replication for the Masses”
Ph.D. dissertation, Faculdade de Ciências, Universidade de Lisboa, 2017
Abstract: The state machine replication technique is a popular approach for building Byzantine fault-tolerant services. However, despite the widespread adoption of this paradigm for crash fault-tolerant systems, there are still few examples of this paradigm for real Byzantine fault-tolerant systems. Our view of this situation is that there is a lack of robust implementations of Byzantine fault-tolerant state machine replication middleware, and that the performance penalty is too high, especially for geo-replication. These hindrances are tightly coupled to the distributed protocols used for enforcing such resilience. This thesis has the objective of finding methodologies for enhancing robustness and performance of state machine replication systems. The first contribution is Mod-SMaRt, a modular protocol that preserves optimal latency in terms of the communication steps exchanged among processes. By being a modular protocol, it becomes simpler to validate and implement, thus resulting in greater robustness; by also preserving optimal message exchanges among processes, the protocol is capable of delivering desirable performance. The second contribution is concerned with implementing Mod-SMaRt in BFT-SMaRt, a reliable and high-performance codebase that was maintained and improved over the entire course of the PhD and that offers multicore-awareness, reconfiguration support, and a flexible API. The third contribution presents WHEAT, a protocol derived from Mod-SMaRt that uses optimizations shown to be effective in reducing latency via a practical evaluation conducted in a geo-distributed environment. We additionally conducted an evaluation of both BFT-SMaRt and WHEAT applied to a relational database middleware and an ordering service for a permissioned blockchain platform. These evaluations revealed encouraging results for both systems and validated our work conducted in the geo-distributed context.
Research line(s): Fault and Intrusion Tolerance in Open Distributed Systems (FIT)