Building a Distributed Key-Value Store in C++ (Part 2)

💾 Persistence: Never Forget

In Part 1, we built a simple in-memory key-value store. But if you kill the process, your data’s gone. That’s not acceptable in any serious system.

So in this part, we add persistence using an append-only log to record all changes to disk.

This means:

  • Every put and del operation is logged
  • On startup, we replay the log to recover the in-memory state

This gives us durability and crash recovery: our KV store now “remembers”.

🧱 Append-Only Log

Here’s how it works:

  1. Every put(key, value) writes a line like PUT key value to a log file.
  2. Every del(key) writes DEL key.
  3. On startup, we replay the log file line-by-line to rebuild the in-memory store.

🔍 Code Overview

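// In kvstore.cpp

// Appends one operation to the log as a single line: "PUT key value" or "DEL key".
// The format assumes keys contain no whitespace and values contain no newlines.
void KVStore::append_log(const std::string& op, const std::string& key, const std::string& value) {
  if (log_out.is_open()) {
    log_out << op << " " << key;
    if (op == "PUT") {
      log_out << " " << value;
    }
    log_out << "\n";
    log_out.flush();  // hand the line to the OS right away instead of leaving it in the stream buffer
  }
}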

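A subtle but important thing here is flushing after every operation. flush() hands the buffered bytes to the operating system, so they survive if our process crashes right afterwards. It does not force them onto the physical disk, though: surviving a power failure or kernel crash would additionally need an explicit sync such as fsync.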
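To illustrate the gap between flushing and syncing, here is a minimal POSIX-only sketch of a write that also asks the kernel to push data to the device. The helper names append_durable and open_log are hypothetical and not part of the KVStore class; the article's code uses std::ofstream, which has no portable way to call fsync.

```cpp
#include <fcntl.h>   // open
#include <unistd.h>  // write, fsync
#include <cassert>
#include <cerrno>
#include <string>

// Hypothetical helper: opens (or creates) a log file for appending.
// Returns -1 on failure.
int open_log(const std::string& path) {
  return ::open(path.c_str(), O_WRONLY | O_CREAT | O_APPEND, 0644);
}

// Hypothetical helper: appends one line to an open descriptor, then calls
// fsync so the data is pushed past the OS cache. Returns false on any error.
bool append_durable(int fd, const std::string& line) {
  assert(fd >= 0);  // caller must pass a descriptor from a successful open
  const char* data = line.data();
  size_t remaining = line.size();
  while (remaining > 0) {
    ssize_t n = ::write(fd, data, remaining);
    if (n < 0) {
      if (errno == EINTR) continue;  // interrupted by a signal: retry
      return false;
    }
    data += n;  // write may be partial, so advance and loop
    remaining -= static_cast<size_t>(n);
  }
  return ::fsync(fd) == 0;
}
```

Syncing on every operation is considerably slower than flushing alone, so this is a trade-off between throughput and how much data a power loss can take with it.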

Recovery on Startup

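void KVStore::load_from_log(){
  std::ifstream in(log_file_path);  // if the file doesn't exist yet, the loop below never runs
  std::string line;

  while (std::getline(in, line)) {
    std::istringstream iss(line);
    std::string op, key, value;
    if (!(iss >> op >> key)) continue;  // skip blank or malformed lines
    if (op == "PUT") {
      iss.get();                 // skip the single space written after the key
      std::getline(iss, value);  // rest of the line is the value, spaces included
      store[key] = value;
    } else if (op == "DEL") {
      store.erase(key);
    }
  }
}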

This function replays all logged operations to reconstruct the state on boot.
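To see replay in isolation, one option is to pull the parsing into a free function that reads from any std::istream, which makes it testable without touching the disk. This is only a sketch: replay_log, Store, and replay_example are hypothetical names, and the line format is the PUT/DEL format described above.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>

using Store = std::unordered_map<std::string, std::string>;

// Hypothetical free function: rebuilds a map by applying every logged
// operation in order, so the last operation on a key decides its final state.
Store replay_log(std::istream& in) {
  Store store;
  std::string line;
  while (std::getline(in, line)) {
    std::istringstream iss(line);
    std::string op, key;
    if (!(iss >> op >> key)) continue;  // blank or truncated line: skip it
    if (op == "PUT") {
      std::string value;
      iss.get();                 // consume the single space after the key
      std::getline(iss, value);  // the rest of the line is the value
      store[key] = value;
    } else if (op == "DEL") {
      store.erase(key);
    }
    // Unknown operations are ignored in this sketch.
  }
  return store;
}

// Example: four logged operations collapse into a single surviving entry.
inline void replay_example() {
  std::istringstream log("PUT a 1\nPUT b 2\nPUT a 3\nDEL b\n");
  Store s = replay_log(log);
  assert(s.size() == 1 && s.at("a") == "3");
}
```

Because the log is applied strictly in order, the overwritten value of a and the deleted key b leave no trace in the recovered state.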

🧪 Testing with Catch2

With persistence in place, it’s time to get serious about testing.

We’re using Catch2, a lightweight and expressive C++ testing framework. Thanks to CMake’s FetchContent, it’s easy to integrate:

# in CMakeLists.txt
include(FetchContent)  # required before FetchContent_Declare can be used
FetchContent_Declare(
  Catch2
  GIT_REPOSITORY https://github.com/catchorg/Catch2.git
  GIT_TAG        v3.5.2
)
FetchContent_MakeAvailable(Catch2)

enable_testing()
add_subdirectory(tests)

# in tests/CMakeLists.txt
add_executable(test_kvstore test_kvstore.cpp)

target_include_directories(test_kvstore
  PRIVATE
  ${PROJECT_SOURCE_DIR}/src
)

target_link_libraries(test_kvstore
  PRIVATE
    kvstore_lib
    Catch2::Catch2WithMain
)

include(CTest)
# include(Catch)
# catch_discover_tests(test_kvstore)

🧪 Test Example

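// tests/test_kvstore.cpp
#include <catch2/catch_test_macros.hpp>
#include <cstdio>
#include <optional>
#include <string>
#include "kvstore.h"  // assumed header name for the KVStore class

TEST_CASE("Basic KVStore operations") {
  std::remove("test_store.log");  // start from an empty log so earlier runs can't leak state
  KVStore store("test_store.log");

  store.put("a", "1");
  REQUIRE(store.get("a").value() == "1");

  store.put("b", "2");
  REQUIRE(store.get("b") == std::make_optional(std::string("2")));

  REQUIRE(store.del("a") == true);
  REQUIRE_FALSE(store.get("a").has_value());
}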

Simple, readable, and effective.

Note: Always use a dedicated test log file to avoid polluting real data. You might even delete the test log in your test fixture for a clean state.
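One way to get that clean state automatically is a small RAII guard that deletes the log file on entry and exit. The sketch below is an assumption about how you might structure it, not part of the project: ScopedLogFile and scoped_log_example are hypothetical names, and it needs C++17 for std::filesystem.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>
#include <utility>

// Hypothetical test helper: removes the log file when created and again when
// destroyed, so each test starts and ends with no leftover state.
class ScopedLogFile {
 public:
  explicit ScopedLogFile(std::string path) : path_(std::move(path)) {
    std::filesystem::remove(path_);  // does nothing if the file is absent
  }
  ~ScopedLogFile() {
    std::error_code ec;
    std::filesystem::remove(path_, ec);  // non-throwing overload for destructors
  }
  ScopedLogFile(const ScopedLogFile&) = delete;
  ScopedLogFile& operator=(const ScopedLogFile&) = delete;

  const std::string& path() const { return path_; }

 private:
  std::string path_;
};

// Example: a file written inside the scope is gone once the guard is destroyed.
inline void scoped_log_example() {
  {
    ScopedLogFile log("scoped_log_example.log");
    {
      std::ofstream out(log.path());
      out << "PUT a 1\n";
    }
    assert(std::filesystem::exists(log.path()));
  }
  assert(!std::filesystem::exists("scoped_log_example.log"));
}
```

In a Catch2 test you would declare the guard first and pass log.path() to the KVStore constructor, so the store is destroyed before the guard removes the file.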

🗺️ Updated Roadmap

  1. Phase 1: Local Store

    ✅ Done

    Basic In-Memory KV Store

    • Implemented a KVStore class
    • Supports put, get, and del operations
    • Command-line usage for demoing
  2. Phase 2: Persistence

    ✅ Done

    Durability with Append-Only Log

    • Append-only log on disk
    • Recovery by replaying log
    • Unit tests with Catch2
  3. Phase 3: Networking

    Next

    Client-Server Communication

    • Expose KVStore via TCP sockets
    • Define simple request/response protocol
    • Build interactive CLI tool
  4. Phase 4: Multi-Node Architecture

    Planned

    Cluster Mode

  5. Phase 5: Consensus

    Planned

    Leader Election and Coordination

  6. Phase 6: Testing & Resilience

    Planned

    Hardening the System

📌 What’s Next?

Now that we can persist data across restarts, we’re ready to make the leap from a single process to a client-server architecture. That means:

  • Opening a TCP port
  • Receiving requests
  • Sending back responses

In the next post, I’ll build the first version of a networked KV store, setting the stage for a truly distributed system.
