
Building a Distributed Key-Value Store in C++ (Final)


👑 Phase 5: Leader Election with the Bully Algorithm

Now that our key-value store supports multiple nodes and basic replication, we need a way to coordinate those nodes.

In this phase, we introduced:

  • A simple leader election protocol
  • Centralized write coordination
  • Basic failure detection and automatic failover

We chose to implement the Bully algorithm as a stepping stone toward more advanced consensus models like Raft.

🧠 Why the Bully Algorithm?

The Bully algorithm is a simple, ID-based election protocol. Here’s the high-level idea:

  1. Each node has a unique ID (we use the server’s port number).
  2. If a node suspects the leader is down, it sends an ELECTION message to all nodes with higher IDs.
  3. If any respond with OK, it backs off and waits.
  4. If none respond, it declares itself leader by broadcasting COORDINATOR.

This guarantees that the highest-ID node still alive always becomes leader — hence the name “bully.”
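
For reference, the whole protocol needs only three message types, which this project sends as plain text commands over TCP. The enum below is purely illustrative (the real handlers match on the raw strings), but it summarizes the vocabulary:

// Illustrative only: the actual server compares against the literal strings.
enum class BullyMessage {
  Election,     // "ELECTION <sender_id>", sent to every node with a higher ID
  Ok,           // "OK", meaning a higher node is alive and will take over
  Coordinator,  // "COORDINATOR <leader_id>", broadcast by the election winner
};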

🛠️ Election in Practice

Each server now tracks:

  • Its own ID (node_id)
  • The current leader_id
  • A flag election_in_progress
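
A rough sketch of how that state might be declared as class members. The atomic types are an assumption based on the exchange() call used below; the post never shows the actual declarations:

#include <atomic>

class KVServer {
  // ...
  int node_id;                                    // unique ID: the server's port number
  std::atomic<int> leader_id{-1};                 // -1 means "no leader known yet"
  std::atomic<bool> election_in_progress{false};  // guards against duplicate elections
  // ...
};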

🧵 Election on Startup

On launch, a server waits a bit and checks if a leader has been elected. If not, it kicks off an election:

if (leader_id == -1 && !election_in_progress.exchange(true)) {
    start_election();
}
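
The post doesn't show how the startup delay is wired up. One plausible arrangement is a detached background thread with a short grace period; both the helper name and the five-second delay below are assumptions, not taken from the actual code:

#include <chrono>
#include <thread>

// Hypothetical startup hook: give peers a moment to come up, then check for a leader.
void KVServer::schedule_initial_election_check() {
  std::thread([this] {
    std::this_thread::sleep_for(std::chrono::seconds(5));  // assumed grace period
    if (leader_id == -1 && !election_in_progress.exchange(true)) {
      start_election();
    }
  }).detach();
}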

⚔️ Starting an Election

The start_election() method sends ELECTION messages to all nodes with higher port numbers.

If any respond with OK, it assumes they’ll take over.

Otherwise, it calls declare_leader() on itself:

void KVServer::start_election() {
  bool higher_responded = false;

  // peer_ids and send_to_peer() stand in for the project's peer list and
  // request/response socket helper, which aren't shown in this post.
  for (int peer_id : peer_ids) {
    if (peer_id <= node_id) continue;  // only bully nodes with higher IDs
    if (send_to_peer(peer_id, "ELECTION " + std::to_string(node_id)) == "OK") {
      higher_responded = true;
    }
  }

  if (!higher_responded) {
    declare_leader(node_id);  // nobody higher answered, so this node wins
  }
}

📬 Handling Election Messages

If a server receives an ELECTION message:

  • It replies with OK, so the sender knows a higher node is alive
  • If its own ID is higher than the sender's, it starts its own election (unless one is already running)

else if (cmd == "ELECTION") {
  response << "OK\n";  // acknowledge so the lower-ID sender backs off
  if (node_id > sender_id && !election_in_progress.exchange(true)) {
    start_election();  // contest the election with our own, higher ID
  }
}

📢 Declaring the Leader

If a node wins, it sets its ID as leader_id and notifies everyone:

void KVServer::declare_leader(int new_leader_id) {
  leader_id = new_leader_id;
  forward_to_peers("COORDINATOR " + std::to_string(new_leader_id));
}

Receiving nodes update their internal state:

else if (cmd == "COORDINATOR") {
  // new_leader_id is parsed from the "COORDINATOR <id>" message body
  leader_id = new_leader_id;
  election_in_progress = false;
}

❤️ Leader-Only Writes

To avoid conflicting changes, only the leader is allowed to process client writes:

if (leader_id != node_id) {
  response << "NOT_LEADER " << leader_id << "\n";  // point the client at the leader
}

If the client contacts a non-leader, it’s redirected to the correct node and retries the command.
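
On the client side, that redirect boils down to a small retry loop. A minimal sketch, assuming a hypothetical send_command() helper that connects to the given port, sends one line, and returns the server's reply (the real CLI client isn't shown in this post):

#include <string>

std::string execute(std::string cmd, int port) {
  for (int attempt = 0; attempt < 2; ++attempt) {
    std::string reply = send_command(port, cmd);  // hypothetical TCP helper
    if (reply.rfind("NOT_LEADER", 0) == 0) {      // reply starts with "NOT_LEADER"
      port = std::stoi(reply.substr(11));         // extract <port> and retry there
      continue;
    }
    return reply;
  }
  return "ERR no leader reachable";
}

Two attempts are enough here, since the first non-leader already tells the client exactly where the leader is.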

🫀 Leader Health Checks

Each server pings the current leader every few seconds:

void KVServer::heartbeat_loop() {
  while (true) {
    sleep(3);
    check_leader_health();
  }
}

If the leader fails to respond to HEARTBEAT, the node triggers a new election.
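
check_leader_health() itself isn't shown in the post. A sketch of what it might look like, assuming the leader answers HEARTBEAT with OK and reusing the same hypothetical send_to_peer() helper as in start_election() above:

void KVServer::check_leader_health() {
  if (leader_id == node_id || leader_id == -1) return;  // we lead, or nobody does yet

  if (send_to_peer(leader_id, "HEARTBEAT") != "OK") {   // no answer: assume the leader is gone
    leader_id = -1;
    if (!election_in_progress.exchange(true)) {
      start_election();
    }
  }
}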


🤔 Why Not Raft (Yet)?

Raft provides stronger guarantees than the Bully algorithm:

  • Log replication and commit indices
  • Safe leader changes
  • Better handling of network partitions

So why didn’t we implement it?

Honest answer: Raft is complicated, and I’m still working to fully understand its inner workings.

Bully was much easier to implement and helped lay the groundwork for distributed coordination. Once I have a better grasp of Raft’s design (and logs, and voting, and terms…), I’ll consider migrating.

Until then, this simple system gives us:

  • Leader-aware writes
  • Client redirection
  • Basic failover

Which is a great foundation.

📦 How It Fits Together

The following flow now applies:

  1. A client sends a PUT to a server
  2. If it’s not the leader, it replies with NOT_LEADER <port>
  3. Client reconnects to the leader and retries
  4. Leader writes locally and forwards via REPL_PUT

Everything still runs on plain TCP, with no frameworks involved.

🗺️ Updated Roadmap

  1. Phase 1: Local Store ✅ Done
     Basic In-Memory KV Store
     • Put/Get/Delete support
  2. Phase 2: Persistence ✅ Done
     Durability with Append Log
     • File-backed append log
     • Replay mechanism
  3. Phase 3: Networking ✅ Done
     Client-Server via TCP
     • Text-based protocol
     • CLI client
  4. Phase 4: Multi-Node Architecture ✅ Done
     Clustered KV Store
     • Replicate PUT/DEL to peers
     • Internal commands over sockets
  5. Phase 5: Consensus ✅ Done
     Coordination & Failover
     • Bully election
     • Leader-only writes
     • Client redirection
  6. Phase 6: Testing & Resilience ❌ Cancelled
     Fault Tolerance
     • Node crash recovery
     • Retry logic

🧳 Closing Words

And with that, the project is complete.

What started as a simple in-memory key-value store grew into a distributed system with persistence, replication, and basic consensus. It’s been a challenging but deeply rewarding journey through sockets, logs, thread safety, and coordination algorithms.

While there’s always more to build (Raft, fault tolerance, async IO…), I’m drawing the line here for now. This project has taught me a ton about distributed systems, C++ architecture, and the complexities of building something real from scratch.

The code is available on my GitHub page.

Thanks for following along!
