Leaderless replication lets any replica handle any type of request, giving distributed data stores read scalability and high availability. However, it imposes heavy coordination overhead on the replication protocol, which degrades write throughput. In addition, the data store still requires coordination for membership changes, making it hard to resolve server failures quickly. To address these problems, we present NetLR, a replicated data store architecture that supports high performance, fault tolerance, and linearizability simultaneously. The key idea of NetLR is to move all replication functions into the network by using the switch as an on-path in-network replication orchestrator. Specifically, NetLR performs consistency-aware read scheduling, high-performance write coordination, and active fault adaptation in the network switch. This in-network replication eliminates inter-replica coordination for writes and membership changes, providing high write performance and fast failure handling. NetLR can be implemented on programmable switches at line rate with only 5.68% additional memory usage. We implement a prototype of NetLR on an Intel Tofino switch and conduct extensive testbed experiments. Our evaluation results show that NetLR is the only solution that achieves high throughput and low latency and is robust to server failures.
Number of pages: 13
Publication status: Published - 2022
Event: 48th International Conference on Very Large Data Bases, VLDB 2022 - Sydney, Australia
Duration: 2022 Sept 5 → 2022 Sept 9
Bibliographical note
Funding Information:
We would like to thank the anonymous reviewers for providing insightful comments. This research was partly sponsored by the National Research Foundation of Korea (NRF) grants funded by the Ministry of Science and ICT (No. 2020R1C1C1003455) and (No. 2019R1A2C2088812). Wonjun Lee is the corresponding author.
© 2022, American Mathematical Society. All rights reserved.