Scalable State Machine Replication Revisited

Dean's Office - Faculty of Informatics

Date: 5 July 2022 / 13:30 - 15:00

You are cordially invited to attend the PhD Dissertation Defence of Mojtaba Eslahi Kelorazi on Tuesday 5 July 2022 at 13:30 online on MS Teams.

Today’s online services face strict requirements for availability, performance, and latency. State machine replication (SMR), a fundamental technique for increasing the availability of services without compromising consistency, offers configurable availability but limited performance scalability. Scalability in SMR is limited because every replica has to execute the same set of requests, so adding servers does not increase the maximum throughput. Scalable State Machine Replication (S-SMR) systems achieve scalable performance by partitioning the service state and coordinating the ordering and execution of commands to preserve the default consistency guarantee of SMR. While current S-SMR systems scale the performance of single-partition requests with the number of partitions, replica coordination and object migration incur substantial overhead in the execution of multi-partition requests.

In this thesis, we first develop DynaTree, a distributed B+Tree algorithm built on state-of-the-art S-SMR systems, to study the implications of the partitioned SMR model for the development of complex distributed applications. We then look into improving the performance and reducing the latency of S-SMR systems. We leverage RDMA technology to enable systems with enhanced communication performance. RDMA offers the potential for high-throughput, low-latency communication by bypassing the kernel and implementing network stack layers in hardware. In this direction, we propose and implement two novel systems: (i) RamCast, the first genuine atomic multicast protocol tailor-made for shared memory, and (ii) Heron, the first scalable state machine replication system on shared memory. RamCast leverages RDMA to reduce the latency of atomic multicast to microseconds by using RDMA mechanisms to protect memory from concurrent writes. Heron relies on RamCast to consistently order and deliver requests at partitions. It builds on RDMA’s shared memory to coordinate and execute distributed operations while ensuring strong consistency. The performance evaluation of the proposed systems shows substantial improvement in comparison to their message-passing variants.

Dissertation Committee:
- Prof. Fernando Pedone, Università della Svizzera italiana, Switzerland (Research Advisor)
- Prof. Patrick Thomas Eugster, Università della Svizzera italiana, Switzerland (Internal Member)
- Prof. Laura Pozzi, Università della Svizzera italiana, Switzerland (Internal Member)
- Prof. Philippe Cudré-Mauroux, University of Fribourg, Switzerland (External Member)
- Prof. Paolo Romano, University of Lisbon, Portugal (External Member)