Kubernetes: Perform a task only in one instance of a multi-instance microservice deployment
During application development, it often happens that a routine of a program has to be executed only once, even in a multi-instance environment. A few examples of such functions are:
Initialization routines
Database Migrations
Scheduling cron jobs (tasks that are rerun according to a defined schedule, but should only run in a single instance of the deployment)
Establishing a persistent connection to an external service and listening for events
Implementing this behavior in a single-instance deployment is dead simple: the regular program flow executes the necessary steps of initialization, database migration, connection setup and cron-job scheduling.
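To make that baseline concrete, a minimal sketch of such a single-instance startup flow could look like the following; the helper functions runMigrations, scheduleCronJobs and connectEventStream are hypothetical placeholders, not part of a real project:

package main

import "log"

// Hypothetical placeholders for the one-time routines mentioned above.
func runMigrations() error      { log.Println("running database migrations"); return nil }
func scheduleCronJobs()         { log.Println("scheduling cron jobs") }
func connectEventStream() error { log.Println("connecting to the event stream"); return nil }

func main() {
    // In a single-instance deployment, the startup code can simply run
    // every one-time routine unconditionally.
    if err := runMigrations(); err != nil {
        log.Fatal(err)
    }
    scheduleCronJobs()
    if err := connectEventStream(); err != nil {
        log.Fatal(err)
    }
    // ... continue with the regular program flow, e.g. start the HTTP server ...
}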
The task at hand is much harder to solve in a multi-instance deployment environment like Kubernetes. Deploying multiple replicas of the same software results in all instances running the initialization routines, scheduling the same cron jobs and establishing their own connection to the external service. The consequences depend on the routine itself, ranging from a mere performance hit caused by unnecessary work, to database inconsistencies or hitting rate limits of third-party integrations. With multiple replicas, those functions now need to be executed conditionally: if, and only if, no other instance has already run them. The execution needs to be synchronized across all replicas.
Therefore, the answer to the question "How can I perform a task only in a single instance of a multi-instance microservice deployment?" usually involves developing and deploying at least one additional service, which keeps track of the synchronization and is responsible for invoking the mentioned routines in only a single instance of the deployment.
If the application is running inside a Kubernetes environment, this behavior can be achieved without deploying additional software to the infrastructure. Kubernetes itself is capable of handling the synchronization of the replicas.
In this article, I will present two different approaches to utilizing Kubernetes to synchronize replicas.
StatefulSets
The StatefulSet approach is the simpler one and preferable for most applications that only require basic synchronization. Using Kubernetes StatefulSets, each instance gets an ordinal index which persists across application restarts [1]. For example, defining a StatefulSet with the application name web, the instances will be named web-0, web-1, web-2, and so on. The name of the instance can be injected into the pod as an environment variable, and the index can be extracted from it.
# The headless service needs to be defined!
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
    - port: 80
      name: web
  clusterIP: None # mark the service as headless
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx # has to match .spec.template.metadata.labels
  serviceName: "nginx"
  replicas: 3 # by default is 1
  template:
    metadata:
      labels:
        app: nginx # has to match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - name: nginx
          image: k8s.gcr.io/nginx-slim:0.8
          # INTERESTING PART
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
          ports:
            - containerPort: 80
              name: web
The application can use the POD_NAME environment variable to determine whether it is the leader.
package main

import (
    "os"
    "strconv"
    "strings"
)

func main() {
    podName := os.Getenv("POD_NAME")
    if podName == "" {
        // fallback if the environment variable is not set,
        // e.g. during local development.
        // Treat the instance as the leader, or fail fast.
    }
    // not doing error checking here for brevity
    parts := strings.Split(podName, "-")
    // the ordinal index is the part after the last dash, e.g. "web-0" -> 0
    index, _ := strconv.ParseInt(parts[len(parts)-1], 10, 64)
    isLeader := index == 0
    if isLeader {
        // do something special
    }
}
Advantages
Simple to set up
The execution flow is easy to follow
Disadvantages
Slower deployment process, since Kubernetes waits for the previous replica to be running before starting the next one.
There can only be a single leader, even if there are numerous tasks which should each be executed once but in parallel. (Leader election might not be the appropriate pattern here anyway; a message broker should be preferred for those tasks.)
Error handling for a failing leader has to be implemented manually, or concepts like health checks have to be utilized (see the sketch after this list).
A headless service needs to be created
StatefulSets do not provide any guarantees on the termination of pods when a StatefulSet is deleted. To achieve ordered and graceful termination of the pods in the StatefulSet, it is possible to scale the StatefulSet down to 0 prior to deletion.
See further limitations in the Kubernetes documentation on StatefulSets [1].
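Regarding the error-handling disadvantage above: a common way to recover a failed leader without writing re-election logic is to let Kubernetes restart the pod through a liveness probe. The following is a minimal, hypothetical sketch (not part of the manifests above); it assumes the leader sets a taskHealthy flag while its exclusive task is running and exposes a /healthz endpoint on port 8080 for the probe to call:

package main

import (
    "log"
    "net/http"
    "sync/atomic"
)

// taskHealthy is a hypothetical flag: the leader sets it to true while its
// exclusive task (cron scheduler, event listener, ...) is running fine
// and to false when the task has died.
var taskHealthy atomic.Bool

func main() {
    taskHealthy.Store(true)

    // A liveness probe pointing at /healthz restarts the pod when the
    // leader's task has failed, so the work is picked up again.
    http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
        if taskHealthy.Load() {
            w.WriteHeader(http.StatusOK)
            return
        }
        w.WriteHeader(http.StatusInternalServerError)
    })
    log.Fatal(http.ListenAndServe(":8080", nil))
}

A livenessProbe on the container pointing at this endpoint would restart web-0 whenever its exclusive task dies; since the restarted pod keeps its ordinal index, it remains the leader.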
Leader Election
Leader election describes the process of selecting a single instance of an application as the leader inside a multi-instance deployment environment.
Coordinate the actions performed by a collection of collaborating instances in a distributed application by electing one instance as the leader that assumes responsibility for managing the others. This can help to ensure that instances don’t conflict with each other, cause contention for shared resources, or inadvertently interfere with the work that other instances are performing.
- Microsoft Azure Documentation (https://docs.microsoft.com/en-us/azure/architecture/patterns/leader-election)
The Kubernetes API provides a simple leader election based upon a distributed lock resource. All replicas are candidates in a race to become the leader by attempting to mark a shared resource with their identifier. Kubernetes guarantees that only a single instance wins this race and becomes the leader. Once the election has finished, the leader periodically pings the Kubernetes API to renew its leader position, while all other candidates periodically try to acquire the lock themselves. This ensures that, if the leader fails, a new leader is elected quickly [2].
Advantages
The programming pattern is not limited to Kubernetes, but can be used in all environments. Nevertheless, the provided sample implementation uses the Kubernetes API and does not easily translate to other environments.
The approach is more error-resistant since failover is built in.
Multiple locks can be defined, so there can be multiple leader instances, each responsible for a subset of the tasks.
Disadvantages
The leadership state has to be handled with reactive programming patterns like event emitters, so that the program reacts to any change. This increases the complexity of the program.
Increases the load on the Kubernetes API server.
Implementation in Golang
The implementation of a leader election in Golang with Kubernetes can be done using the k8s.io packages. It consists of the LeaderElection struct (an event emitter) and multiple listeners. This section includes example source code showing how to implement a leader election using the Kubernetes API.
package leaderelection

import (
    "context"
    "log"
    "time"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/rest"
    "k8s.io/client-go/tools/leaderelection"
    "k8s.io/client-go/tools/leaderelection/resourcelock"
)

type KubernetesLeaderElection struct {
    lockName      string
    lockNamespace string
    podName       string
    isLeader      bool
    ctx           context.Context
    client        *kubernetes.Clientset
    listeners     []Listener
}

type Listener interface {
    // should be idempotent
    Start()
    // should be idempotent
    Stop()
    IsRunning() bool
}
// podName: provide the name of the pod or any random string. It is only necessary
// that the podName differs between all deployed instances.
// lockName: any string. Inside a program, multiple leaders can be defined by using a different lock name for each.
// Nevertheless, in most situations a single leader is enough.
// namespace: name of the namespace the pod is running in. The used service account needs to have access to the lease resource
// in this namespace, see rbac.yaml.
func NewKubernetesLeaderElection(ctx context.Context, podName, lockName, namespace string) *KubernetesLeaderElection {
    config, err := rest.InClusterConfig()
    if err != nil {
        log.Fatal(err)
    }
    client := kubernetes.NewForConfigOrDie(config)
    return &KubernetesLeaderElection{
        lockName:      lockName,
        lockNamespace: namespace,
        ctx:           ctx,
        client:        client,
        podName:       podName,
    }
}
// RunElection starts the election. This method blocks; start it on a goroutine.
func (k *KubernetesLeaderElection) RunElection() {
    log.Println("running leader election")
    lock := k.getNewLock()
    leaderelection.RunOrDie(k.ctx, leaderelection.LeaderElectionConfig{
        Lock:            lock,
        ReleaseOnCancel: true,
        LeaseDuration:   15 * time.Second,
        RenewDeadline:   10 * time.Second,
        RetryPeriod:     2 * time.Second,
        Callbacks: leaderelection.LeaderCallbacks{
            OnStartedLeading: func(c context.Context) {
                log.Println("elected as the leader")
                k.isLeader = true
                // call Start on each listener
                for i, lst := range k.listeners {
                    log.Println("starting listener:", i)
                    // each listener runs in its own goroutine
                    go lst.Start()
                }
            },
            OnNewLeader: func(identity string) {
                if k.podName != identity {
                    log.Println("new leader:", identity)
                }
            },
            OnStoppedLeading: func() {
                log.Println("stopped leading")
                k.isLeader = false
                // call Stop on each listener
                for i, lst := range k.listeners {
                    log.Println("stopping listener:", i)
                    lst.Stop()
                }
            },
        },
    })
}
func (k *KubernetesLeaderElection) IsLeader() bool {
    return k.isLeader
}

// AddListener adds a new listener to the election object.
// The listener gets notified when the current pod becomes the leader or stops being the leader.
func (k *KubernetesLeaderElection) AddListener(listener Listener) {
    k.listeners = append(k.listeners, listener)
    if k.isLeader {
        go listener.Start()
    }
}

func (k *KubernetesLeaderElection) getNewLock() *resourcelock.LeaseLock {
    return &resourcelock.LeaseLock{
        LeaseMeta: metav1.ObjectMeta{
            Name:      k.lockName,
            Namespace: k.lockNamespace,
        },
        Client: k.client.CoordinationV1(),
        LockConfig: resourcelock.ResourceLockConfig{
            Identity: k.podName,
        },
    }
}
A new leader election object gets created using the following variables:
podName: the name of the pod or any random string. It is only necessary that the podName differs between all deployed instances.
lockName: the name of the lock the program is attempting to acquire leadership over.
namespace: the namespace the pod is running in.
The listener is a wrapper struct around a single function which should only get executed by the leader. A simple Listener implementation might look like the following. Its purpose is to encapsulate the isRunning state, which simplifies the process of defining idempotent functions.
package leaderelection

type SimpleListener struct {
    do         func(cancelChan <-chan struct{})
    isRunning  bool
    cancelChan chan struct{}
}

func NewListener(do func(cancelChan <-chan struct{})) Listener {
    return &SimpleListener{
        do:         do,
        isRunning:  false,
        cancelChan: make(chan struct{}),
    }
}

func (s *SimpleListener) Start() {
    if !s.isRunning {
        s.isRunning = true
        s.do(s.cancelChan)
    }
}

func (s *SimpleListener) Stop() {
    if s.isRunning {
        s.isRunning = false
        // signal the running function to clean up and return
        s.cancelChan <- struct{}{}
    }
}

func (s *SimpleListener) IsRunning() bool {
    return s.isRunning
}
Finally, using the defined structs and methods is as simple as writing a single function which listens to the cancel channel and executes the program logic.
package main

import (
    "context"
    "os"
    // the leaderelection package from above gets imported here
)

func main() {
    podName := os.Getenv("POD_NAME")    // gets defined in deployment.yaml
    namespace := os.Getenv("NAMESPACE") // gets defined in deployment.yaml
    leaderElection := leaderelection.NewKubernetesLeaderElection(context.Background(), podName, "leaderelection", namespace)
    listener := leaderelection.NewListener(func(cancelChan <-chan struct{}) {
        // init cronjob
        <-cancelChan // block till the leader changes - might be used inside a select statement if listening on two channels
        // run any cleanups, like closing the websocket connection or canceling the cronjob
    })
    leaderElection.AddListener(listener)
    leaderElection.RunElection() // blocks
}
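To give the listener body a bit more shape, here is a hedged sketch of a periodic, leader-only task built on a time.Ticker that shuts down cleanly when leadership is lost. It assumes the time package is imported and that doPeriodicWork is a hypothetical placeholder for the actual work; the five-minute interval is an assumption as well:

// a possible listener body for the example above
listener := leaderelection.NewListener(func(cancelChan <-chan struct{}) {
    ticker := time.NewTicker(5 * time.Minute) // interval is an assumption
    defer ticker.Stop()
    for {
        select {
        case <-ticker.C:
            doPeriodicWork() // hypothetical leader-only task
        case <-cancelChan:
            // leadership lost: stop the ticker and return
            return
        }
    }
})

Because SimpleListener.Stop sends on the cancel channel, the select statement guarantees that the signal is picked up and the function returns promptly.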
The last step is the deployment of the application. Since the application uses Kubernetes resources to handle the leader election process, the pod requires a new set of permissions. Kubernetes uses role-based access control (RBAC); therefore, a new role and a new service account need to be created, which give access to the leases resource.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: your-name
  namespace: your-namespace
spec:
  replicas: 3
  selector:
    matchLabels:
      app: your-app
  template:
    metadata:
      labels:
        app: your-app
    spec:
      automountServiceAccountToken: true
      serviceAccountName: leaderelection-sa
      containers:
        - name: your-app
          image: your-container-image
          # INTERESTING PART
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: leaderelection-sa
  namespace: your-namespace
automountServiceAccountToken: true
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: leaderelection-role
  namespace: your-namespace
rules:
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs:
      - '*'
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: leaderelection-rolebinding
  namespace: your-namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: leaderelection-role
subjects:
  - kind: ServiceAccount
    name: leaderelection-sa
    namespace: your-namespace
Conclusion
Kubernetes provides StatefulSets, which make it easy to implement tasks that should only be executed in a single instance of a multi-instance deployment. This approach is sufficient for a wide range of applications and use cases. If a more error-resistant approach or a faster deployment is needed, the leader election pattern might be more suitable. Some Kubernetes knowledge is required to create and set up either synchronization approach.
Resources
[1] https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
[2] https://kubernetes.io/blog/2016/01/simple-leader-election-with-kubernetes/