How to perform distributed locking using etcd?

According to the docs, etcd is a strongly consistent, distributed key-value store that provides a reliable way to store data that needs to be accessed by a distributed system or cluster of machines. Etcd stores data in the form of key-value pairs under prefixes.

Quite a mouthful, right? What matters to us is that etcd is a database whose contents are (ideally) replicated across multiple nodes, and that an etcd cluster can serve as a central place for process coordination in a distributed system. Kubernetes stores its cluster state in etcd, so we know etcd is no joke. You too can use etcd in your distributed system to implement distributed locking.

What is distributed locking?

When running multiple replicas of a microservice, you often want some part of the code to be executed by only one thread in only one replica at a time. The mechanism that enforces this across machines, my friend, is called distributed locking.

In this blog, we are going to use an etcd client written in Go to write a program that can easily become a part of your microservice to implement distributed locking.

This is how our program looks. I will explain what it does shortly.

package main

import (
    "context"
    "fmt"
    "log"
    "time"

    client "go.etcd.io/etcd/client/v3"
    "go.etcd.io/etcd/client/v3/concurrency"
)

func main() {
    // initialise the etcd client
    etcdClient, err := client.New(client.Config{Endpoints: []string{"localhost:2379"}})
    if err != nil {
        log.Fatal(err)
    }
    defer etcdClient.Close() // cleanup

    session, err := concurrency.NewSession(etcdClient)
    if err != nil {
        log.Fatal(err)
    }
    defer session.Close()
    mutex := concurrency.NewMutex(session, "/lock-prefix")

    PerformImportantTask(mutex)
}

// super important func of your microservice that can be called externally
func PerformImportantTask(mutex *concurrency.Mutex) {
    ctx := context.Background()
    t := time.Now().Unix()

    fmt.Println("Waiting to acquire lock...")
    // only one process can lock, others will wait
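    if err := mutex.Lock(ctx); err != nil {
        // the lock was NOT acquired, so the critical section must not run
        log.Printf("could not acquire lock: %v", err)
        return
    }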
    fmt.Printf("Acquired lock after waiting for %d seconds\n", time.Now().Unix()-t)

    // perform very important critical section task
    fmt.Println("Performing super important task")
    time.Sleep(5 * time.Second) // mock

    if err := mutex.Unlock(ctx); err != nil {
        log.Printf("could not release lock: %v", err)
        return
    }
    fmt.Println("Done!")
}

Install etcd and start it locally by running the etcd command in a terminal. By default it listens for client requests on localhost:2379, which is the endpoint our program connects to.

If we run this program in two separate terminals using go run main.go, one instance acquires the lock immediately because nobody holds it yet. The other instance waits for the lock to be released. The first instance performs the critical task, unlocks, and exits. As soon as that happens, the second instance acquires the lock, executes the critical code (the 5-second sleep), unlocks, and exits as well.

Now let's study the code.

First, we initialise the etcd client using the default client endpoint (localhost:2379) of the etcd instance running on our local system.

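Then we create a session using concurrency.NewSession(). This creates a lease in the etcd cluster (with a TTL of 60 seconds by default) and keeps it alive for as long as the session is open. Calling mutex.Lock() writes a key under /lock-prefix that is attached to that lease. The instance whose key has the lowest creation revision under the prefix owns the lock, and every other instance waits until the keys created before its own are deleted. Because the key is tied to the lease, it disappears automatically if an instance crashes and its lease expires, so a dead process cannot hold the lock forever.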
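To make the ownership rule concrete, here is a small in-memory toy model of the keys under a lock prefix. It is not etcd code and talks to no cluster: lockQueue, enqueue, owner and release are names made up for this sketch, and the lease IDs are hypothetical. It only illustrates the idea that the key with the lowest creation revision owns the lock.

```go
package main

import (
	"fmt"
	"sort"
)

// lockQueue is a toy, in-memory stand-in for the keys etcd keeps under a lock prefix.
type lockQueue struct {
	prefix  string
	nextRev int64            // stands in for etcd's ever-increasing revision counter
	created map[string]int64 // key -> creation revision
}

func newLockQueue(prefix string) *lockQueue {
	return &lockQueue{prefix: prefix, created: map[string]int64{}}
}

// enqueue models the first step of Lock(): write a key named after the session's lease.
func (q *lockQueue) enqueue(leaseID int64) string {
	key := fmt.Sprintf("%s/%x", q.prefix, leaseID)
	if _, exists := q.created[key]; !exists {
		q.nextRev++
		q.created[key] = q.nextRev
	}
	return key
}

// owner returns the key with the lowest creation revision, or "" if there is none.
func (q *lockQueue) owner() string {
	keys := make([]string, 0, len(q.created))
	for k := range q.created {
		keys = append(keys, k)
	}
	if len(keys) == 0 {
		return ""
	}
	sort.Slice(keys, func(i, j int) bool { return q.created[keys[i]] < q.created[keys[j]] })
	return keys[0]
}

// release models Unlock(), or the lease expiring after the holder crashed.
func (q *lockQueue) release(key string) { delete(q.created, key) }

func main() {
	q := newLockQueue("/lock-prefix")
	first := q.enqueue(0x1a)  // hypothetical lease ID of instance 1
	second := q.enqueue(0x2b) // hypothetical lease ID of instance 2

	fmt.Println(q.owner()) // instance 1 arrived first, so it owns the lock
	q.release(first)
	fmt.Println(q.owner() == second) // instance 2 takes over once the first key is gone
}
```

In the real client, the waiting instance watches for the deletion of the earlier keys instead of polling.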

If you want to stop waiting for the lock, cancel the context ctx passed into mutex.Lock(). The call then returns an error instead of blocking, and the critical section must be skipped.
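One common way to use that context is to put an upper bound on the wait. The sketch below is an assumption-laden illustration, not part of the program above: runWithLock, ctxLocker and chanLock are names invented here. ctxLocker lists only the Lock and Unlock methods that *concurrency.Mutex provides, and chanLock is a local stand-in so the example runs without an etcd cluster.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// ctxLocker lists the two methods used here; *concurrency.Mutex provides both.
type ctxLocker interface {
	Lock(ctx context.Context) error
	Unlock(ctx context.Context) error
}

// runWithLock waits at most `wait` for the lock, then runs task while holding it.
func runWithLock(m ctxLocker, wait time.Duration, task func()) error {
	ctx, cancel := context.WithTimeout(context.Background(), wait)
	defer cancel()

	if err := m.Lock(ctx); err != nil {
		return fmt.Errorf("lock not acquired: %w", err)
	}
	// Unlock with a fresh context: ctx may already have expired when task returns.
	defer m.Unlock(context.Background())

	task()
	return nil
}

// chanLock is a hypothetical local stand-in so the sketch runs without etcd.
type chanLock chan struct{}

func (c chanLock) Lock(ctx context.Context) error {
	select {
	case c <- struct{}{}:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

func (c chanLock) Unlock(context.Context) error { <-c; return nil }

// demoTimeout reports whether a caller timed out while the lock was held
// elsewhere, and whether a later caller ran once the lock was released.
func demoTimeout() (timedOut, ranAfterRelease bool) {
	l := make(chanLock, 1)
	l <- struct{}{} // someone else holds the lock

	err := runWithLock(l, 50*time.Millisecond, func() {})
	timedOut = errors.Is(err, context.DeadlineExceeded)

	<-l // the other holder releases the lock
	err = runWithLock(l, 50*time.Millisecond, func() { ranAfterRelease = true })
	return timedOut, ranAfterRelease && err == nil
}

func main() {
	fmt.Println(demoTimeout()) // prints "true true"
}
```

With the program from this post, the call would look like runWithLock(mutex, 10*time.Second, task), where the 10-second budget is an arbitrary choice.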

So this is how you can implement distributed locking using etcd. There are a couple more API methods for distributed locking, such as TryLock(), which returns immediately instead of waiting when the lock is already held. You can find the link below.
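As an example of one of those other methods, TryLock() suits jobs that should simply be skipped when another replica is already doing them. The sketch below assumes a client version whose concurrency.Mutex has TryLock (it returns concurrency.ErrLocked when another session holds the lock). To stay runnable without a cluster, it takes the "busy" error as a parameter and uses a made-up fakeMutex. The names runIfFree, tryLocker, fakeMutex and errBusy are all hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// tryLocker lists the two methods used here; *concurrency.Mutex provides both.
type tryLocker interface {
	TryLock(ctx context.Context) error
	Unlock(ctx context.Context) error
}

// runIfFree runs task only if the lock can be taken right away.
// busy is the error that means "held by someone else"
// (concurrency.ErrLocked when m is an etcd mutex).
func runIfFree(ctx context.Context, m tryLocker, busy error, task func()) (bool, error) {
	if err := m.TryLock(ctx); err != nil {
		if errors.Is(err, busy) {
			return false, nil // another replica is doing the work; skip quietly
		}
		return false, err // a real failure, e.g. the cluster is unreachable
	}
	defer m.Unlock(ctx)

	task()
	return true, nil
}

// errBusy and fakeMutex are local stand-ins so the sketch runs without etcd.
var errBusy = errors.New("locked by another session")

type fakeMutex struct{ held bool }

func (f *fakeMutex) TryLock(context.Context) error {
	if f.held {
		return errBusy
	}
	f.held = true
	return nil
}

func (f *fakeMutex) Unlock(context.Context) error { f.held = false; return nil }

// demoTryLock reports whether the task ran with a free lock and with a held lock.
func demoTryLock() (ranWhenFree, ranWhenHeld bool) {
	ctx := context.Background()
	ranWhenFree, _ = runIfFree(ctx, &fakeMutex{}, errBusy, func() {})
	ranWhenHeld, _ = runIfFree(ctx, &fakeMutex{held: true}, errBusy, func() {})
	return ranWhenFree, ranWhenHeld
}

func main() {
	fmt.Println(demoTryLock()) // prints "true false"
}
```

With the mutex from this post, the call would be runIfFree(ctx, mutex, concurrency.ErrLocked, task).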

github.com/etcd-io/etcd/blob/main/client/v3..

Shameless plug: You can also implement leader election using the etcd client. Read more here.
