“Elastic State Machine Replication”
IEEE Transactions on Parallel and Distributed Systems, Mar. 2017.
Abstract: State machine replication (SMR) is a fundamental technique for implementing stateful dependable systems. A key limitation of this technique is that the performance of a service does not scale with the number of replicas hosting it. Some works have shown that such scalability can be achieved by partitioning the state of the service into shards. The few SMR-based systems that support dynamic partitioning implement ad hoc state transfer protocols and perform scaling operations as background tasks to minimize performance degradation during reconfigurations. In this work, we go one step further and propose a modular partition transfer protocol for creating and destroying such partitions at runtime, thus providing fast elasticity for crash and Byzantine fault-tolerant replicated state machines and making them more suitable for cloud systems.
Research line(s): Fault and Intrusion Tolerance in Open Distributed Systems (FIT)
Accepted for publication.