Building a Distributed Key-Value Store in C++ (Final)
👑 Phase 5: Leader Election with the Bully Algorithm
Now that our key-value store supports multiple nodes and basic replication, we need a way to coordinate those nodes.
In this phase, we introduced:
- A simple leader election protocol
- Centralized write coordination
- Basic failure detection and automatic failover
We chose to implement the Bully algorithm as a stepping stone toward more advanced consensus models like Raft.
🧠 Why the Bully Algorithm?
The Bully algorithm is a simple, ID-based election protocol. Here’s the high-level idea:
- Each node has a unique ID (we use the server's port number).
- If a node suspects the leader is down, it sends an `ELECTION` message to all nodes with higher IDs.
- If any respond with `OK`, it backs off and waits.
- If none respond, it declares itself leader by broadcasting `COORDINATOR`.
This guarantees that the node with the highest ID always becomes leader, hence the name "bully." For example, with nodes 5001, 5002, and 5003, node 5003 wins every election it takes part in, no matter who starts it.
🛠️ Election in Practice
Each server now tracks:
- Its own ID (`node_id`)
- The current `leader_id`
- A flag `election_in_progress`
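For reference, here is a minimal sketch of how these members might be declared; the exact types are my assumption rather than the project's actual header. The one detail that matters for the snippets below is that `election_in_progress` is a `std::atomic<bool>`, so `exchange(true)` can act as a test-and-set guard against concurrent elections.

```cpp
#include <atomic>

class KVServer {
    // ... store, peer list, and socket state from the earlier phases ...
    int node_id;                                    // this server's unique ID (its port)
    int leader_id = -1;                             // -1 means "no leader elected yet"
    std::atomic<bool> election_in_progress{false};  // test-and-set guard around elections
};
```

In a multi-threaded server, `leader_id` would also benefit from being atomic or mutex-protected; the plain `int` simply mirrors the snippets in this post.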
🧵 Election on Startup
On launch, a server waits a bit and checks if a leader has been elected. If not, it kicks off an election:
```cpp
if (leader_id == -1 && !election_in_progress.exchange(true)) {
    start_election();
}
```
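One simple way to wire this up, and it is only an assumption about how the startup code is structured, is a detached thread that sleeps briefly and then runs the check. `start_election_timer()` is a hypothetical name:

```cpp
#include <chrono>
#include <thread>

void KVServer::start_election_timer() {  // hypothetical helper, not from the post
    std::thread([this] {
        // Give an existing leader a moment to announce itself first.
        std::this_thread::sleep_for(std::chrono::seconds(5));
        if (leader_id == -1 && !election_in_progress.exchange(true)) {
            start_election();
        }
    }).detach();
}
```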
⚔️ Starting an Election
The `start_election()` method sends `ELECTION` messages to all nodes with higher port numbers. If any respond with `OK`, it assumes they'll take over. Otherwise, it calls `declare_leader()` on itself:
```cpp
void KVServer::start_election() {
    bool higher_responded = false;
    for (int peer : peers) {                           // peer ports configured at startup
        if (peer <= node_id) continue;                 // only challenge nodes with higher IDs
        if (send_message(peer, "ELECTION") == "OK") {  // a higher node answered; it will take over
            higher_responded = true;
        }
    }
    if (!higher_responded) {
        declare_leader(node_id);                       // nobody outranks us: become the leader
    }
    election_in_progress = false;                      // allow future elections
}
```
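The loop above leans on a small request/reply helper that the post doesn't show. The sketch below is one possible shape for it, assuming peers live on localhost and are addressed by port as in the earlier phases: connect, send one line, and wait a couple of seconds for a reply. An empty return value means the peer didn't answer in time.

```cpp
#include <arpa/inet.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

#include <string>

// Hypothetical helper: send one request line to a peer and wait briefly for its reply.
static std::string send_message(int port, const std::string& msg) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return "";

    timeval tv{2, 0};  // two-second reply timeout
    setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);

    std::string reply;
    if (connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) == 0) {
        std::string line = msg + "\n";
        send(fd, line.c_str(), line.size(), 0);

        char buf[128] = {};
        ssize_t n = recv(fd, buf, sizeof(buf) - 1, 0);  // returns <= 0 on timeout or close
        if (n > 0) reply.assign(buf, static_cast<size_t>(n));
    }
    close(fd);

    while (!reply.empty() && (reply.back() == '\n' || reply.back() == '\r'))
        reply.pop_back();  // trim the trailing newline from the text protocol
    return reply;          // empty string means "no answer"
}
```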
📬 Handling Election Messages
If a server receives an `ELECTION` message:
- It replies with `OK`
- If it has a higher ID, it starts its own election
```cpp
else if (cmd == "ELECTION") {
    // (The full handler first replies with "OK" so the sender backs off.)
    if (node_id > sender_id && !election_in_progress.exchange(true)) {
        start_election();
    }
}
```
📢 Declaring the Leader
If a node wins, it sets its own ID as `leader_id` and notifies everyone:
```cpp
void KVServer::declare_leader(int new_leader_id) {
    leader_id = new_leader_id;
    forward_to_peers("COORDINATOR " + std::to_string(new_leader_id));
}
```
Receiving nodes update their internal state:
```cpp
else if (cmd == "COORDINATOR") {
    leader_id = new_leader_id;   // new_leader_id is parsed out of the COORDINATOR message
    election_in_progress = false;
}
```
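Neither snippet shows where `sender_id` and `new_leader_id` come from. One plausible (assumed) shape for the dispatcher that produces them, given the space-separated text protocol from the networking phase, is:

```cpp
#include <sstream>
#include <string>

// Hypothetical dispatcher sketch: split "COMMAND <id>" into its parts.
// The exact wire format is an assumption; only the command keywords come from the post.
void KVServer::handle_internal_command(const std::string& line) {
    std::istringstream iss(line);
    std::string cmd;
    int peer_id = -1;
    iss >> cmd >> peer_id;  // e.g. "ELECTION 5001" or "COORDINATOR 5003"

    if (cmd == "COORDINATOR") {
        leader_id = peer_id;
        election_in_progress = false;
    } else if (cmd == "ELECTION") {
        // reply "OK", then possibly start our own election (see the handler above)
    }
}
```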
❤️ Leader-Only Writes
To avoid conflicting changes, only the leader is allowed to process client writes:
```cpp
if (leader_id != node_id) {
    response << "NOT_LEADER " << leader_id << "\n";
}
```
If the client contacts a non-leader, it’s redirected to the correct node and retries the command.
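On the client side, following that redirect could look like the sketch below. It reuses the hypothetical `send_message()` helper from the election section (one request, one reply over TCP); the `NOT_LEADER <port>` format is the one returned by the server snippet above.

```cpp
#include <sstream>
#include <string>

// Hypothetical client-side retry: follow a NOT_LEADER redirect once.
std::string put_with_redirect(int port, const std::string& key, const std::string& value) {
    std::string request = "PUT " + key + " " + value;
    std::string reply = send_message(port, request);

    if (reply.rfind("NOT_LEADER", 0) == 0) {         // reply starts with "NOT_LEADER"
        std::istringstream iss(reply);
        std::string tag;
        int leader_port = -1;
        iss >> tag >> leader_port;                   // "NOT_LEADER <port>"
        reply = send_message(leader_port, request);  // retry against the actual leader
    }
    return reply;
}
```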
🫀 Leader Health Checks
Each server pings the current leader every few seconds:
```cpp
void KVServer::heartbeat_loop() {
    while (true) {
        sleep(3);                // from <unistd.h>: pause between health checks
        check_leader_health();
    }
}
```
If the leader fails to respond to `HEARTBEAT`, the node triggers a new election.
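The post doesn't show `check_leader_health()` itself, so the following is only a guess at its shape, reusing the hypothetical `send_message()` helper from the election section; the exact heartbeat exchange is an assumption.

```cpp
// Sketch of check_leader_health(): ping the leader, and start an election if it stays silent.
void KVServer::check_leader_health() {
    if (leader_id == -1 || leader_id == node_id) return;  // no leader yet, or we are the leader

    std::string reply = send_message(leader_id, "HEARTBEAT");
    if (reply.empty() && !election_in_progress.exchange(true)) {
        leader_id = -1;      // the leader looks dead: forget it
        start_election();    // and try to take its place
    }
}
```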
🤔 Why Not Raft (Yet)?
Raft provides stronger guarantees than the Bully algorithm:
- Log replication and commit indices
- Safe leader changes
- Better handling of network partitions
So why didn’t we implement it?
Honest answer: Raft is complicated, and I’m still working to fully understand its inner workings.
Bully was much easier to implement and helped lay the groundwork for distributed coordination. Once I have a better grasp of Raft’s design (and logs, and voting, and terms…), I’ll consider migrating.
Until then, this simple system gives us:
- Leader-aware writes
- Client redirection
- Basic failover
That alone is a great foundation to build on.
📦 How It Fits Together
The following flow now applies:
- A client sends a `PUT` to a server.
- If that server isn't the leader, it replies with `NOT_LEADER <port>`.
- The client reconnects to the leader and retries.
- The leader writes locally and forwards the write to its peers via `REPL_PUT`.
Everything still runs on plain TCP, no frameworks involved.
🗺️ Updated Roadmap
| Phase | Milestone | Status | Details |
|-------|-----------|--------|---------|
| Phase 1: Local Store | Basic In-Memory KV Store | ✅ Done | Put/Get/Delete support |
| Phase 2: Persistence | Durability with Append Log | ✅ Done | File-backed append log, replay mechanism |
| Phase 3: Networking | Client-Server via TCP | ✅ Done | Text-based protocol, CLI client |
| Phase 4: Multi-Node Architecture | Clustered KV Store | ✅ Done | Replicate PUT/DEL to peers, internal commands over sockets |
| Phase 5: Consensus | Coordination & Failover | ✅ Done | Bully election, leader-only writes, client redirection |
| Phase 6: Testing & Resilience | Fault Tolerance | ❌ Cancelled | Node crash recovery, retry logic |
🧳 Closing Words
And with that, the project is complete.
What started as a simple in-memory key-value store grew into a distributed system with persistence, replication, and basic consensus. It’s been a challenging but deeply rewarding journey through sockets, logs, thread safety, and coordination algorithms.
While there’s always more to build (Raft, fault tolerance, async IO…), I’m drawing the line here for now. This project has taught me a ton about distributed systems, C++ architecture, and the complexities of building something real from scratch.
The code is available on my GitHub page.
Thanks for following along!