TY - JOUR
T1 - In-Network Leaderless Replication for Distributed Data Stores
AU - Kim, Gyuyeong
AU - Lee, Wonjun
N1 - Funding Information:
We would like to thank the anonymous reviewers for providing insightful comments. This research was partly sponsored by the National Research Foundation of Korea (NRF) grants funded by the Ministry of Science and ICT (No. 2020R1C1C1003455) and (No. 2019R1A2C2088812). Wonjun Lee is the corresponding author.
Publisher Copyright:
© 2022, VLDB Endowment. All rights reserved.
PY - 2022
Y1 - 2022
N2 - Leaderless replication lets any replica handle any type of request, giving distributed data stores read scalability and high availability. However, it imposes heavy coordination overhead on the replication protocol, which degrades write throughput. In addition, the data store still requires coordination for membership changes, making it hard to resolve server failures quickly. To address these problems, we present NetLR, a replicated data store architecture that supports high performance, fault tolerance, and linearizability simultaneously. The key idea of NetLR is to move all replication functions into the network by using the switch as an on-path in-network replication orchestrator. Specifically, NetLR performs consistency-aware read scheduling, high-performance write coordination, and active fault adaptation in the network switch. This in-network replication eliminates inter-replica coordination for writes and membership changes, providing high write performance and fast failure handling. NetLR can be implemented on programmable switches at line rate with only 5.68% additional memory usage. We implement a prototype of NetLR on an Intel Tofino switch and conduct extensive testbed experiments. Our evaluation results show that NetLR is the only solution that achieves high throughput and low latency while remaining robust to server failures.
UR - http://www.scopus.com/inward/record.url?scp=85142531797&partnerID=8YFLogxK
U2 - 10.14778/3523210.3523213
DO - 10.14778/3523210.3523213
M3 - Conference article
AN - SCOPUS:85142531797
SN - 2150-8097
VL - 15
SP - 1337
EP - 1349
JO - Proceedings of the VLDB Endowment
JF - Proceedings of the VLDB Endowment
IS - 7
T2 - 48th International Conference on Very Large Data Bases, VLDB 2022
Y2 - 5 September 2022 through 9 September 2022
ER -