Kubernetes: Perform a task only in one instance of a multi-instance microservice deployment
During application development, it often happens that a routine of a program has to be executed only once, even in a multi-instance environment. A few examples of such functions are:
Initialization routines
Database Migrations
Scheduling cron jobs (tasks that are rerun according to a defined schedule, but should only run in a single instance of the deployment)
Establishing a persistent connection to an external service and listening for events
Implementing this behavior in a single-instance deployment is dead simple: the regular program flow executes the necessary steps of initialization, database migration, connection setup and cron-job scheduling.
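To make that baseline concrete, a minimal sketch of such a single-instance startup flow could look like the following; the helper functions runMigrations, scheduleCronJobs and connectEventStream are hypothetical placeholders, not part of a real project:

package main

import "log"

// Hypothetical placeholders for the one-time routines mentioned above.
func runMigrations() error      { log.Println("running database migrations"); return nil }
func scheduleCronJobs()         { log.Println("scheduling cron jobs") }
func connectEventStream() error { log.Println("connecting to the event stream"); return nil }

func main() {
    // In a single-instance deployment, the startup code can simply run
    // every one-time routine unconditionally.
    if err := runMigrations(); err != nil {
        log.Fatal(err)
    }
    scheduleCronJobs()
    if err := connectEventStream(); err != nil {
        log.Fatal(err)
    }
    // ... continue with the regular program flow, e.g. start the HTTP server ...
}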
The task at hand is much harder to solve in a multi-instance deployment environment like Kubernetes. Deploying multiple replicas of the same software results in all instances running the initialization routines, scheduling the same cron jobs and establishing their own connection to the external service. The consequences depend on the routine itself, ranging from a mere performance hit caused by unnecessary work, to database inconsistencies or hitting rate limits of third-party integrations. With multiple replicas, those functions now need to be executed conditionally: if, and only if, no other instance has already run them. The execution needs to be synchronized across all replicas.
Therefore, the answer to the question "How can I perform a task only in a single instance of a multi-instance microservice deployment?" usually involves developing and deploying at least one additional service, which keeps track of the synchronization and is responsible for invoking the mentioned routines in only a single instance of the deployment.
If the application is running inside a Kubernetes environment, this behavior can be achieved without deploying additional software to the infrastructure. Kubernetes itself is capable of handling the synchronization of the replicas.
In this article, I will present two different approaches to utilizing Kubernetes to synchronize replicas.
StatefulSets
The StatefulSet approach is the simpler one and preferable for most applications that only require basic synchronization. Using Kubernetes StatefulSets, each instance gets an ordinal index which persists across application restarts [1]. For example, defining a StatefulSet with the application name web, the instances will be named web-0, web-1, web-2, and so on. The name of the instance can be injected into the pod as an environment variable, and the index can be extracted from it.
# The headless service needs to be defined!
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
    - port: 80
      name: web
  clusterIP: None # mark the service as headless
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx # has to match .spec.template.metadata.labels
  serviceName: "nginx"
  replicas: 3 # by default is 1
  template:
    metadata:
      labels:
        app: nginx # has to match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - name: nginx
          image: k8s.gcr.io/nginx-slim:0.8
          # INTERESTING PART
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
          ports:
            - containerPort: 80
              name: web
The application can use the POD_NAME environment variable to determine whether it is the leader.
package main

import (
    "os"
    "strconv"
    "strings"
)

func main() {
    podName := os.Getenv("POD_NAME")
    if podName == "" {
        // fallback if the environment variable is not set,
        // e.g. during local development.
        // Treat the instance as the leader, or fail fast.
    }
    // not doing error checking here for brevity
    parts := strings.Split(podName, "-")
    // the ordinal index is the part after the last dash, e.g. "web-0" -> 0
    index, _ := strconv.ParseInt(parts[len(parts)-1], 10, 64)
    isLeader := index == 0
    if isLeader {
        // do something special
    }
}
Advantages
Simple to set up
The execution flow is easy to follow
Disadvantages
Slower deployment process, since Kubernetes waits for the previous replica to be running before starting the next one.
There can only be a single leader, even if there are numerous tasks which should each be executed once but in parallel. (Leader election might not be the appropriate pattern here anyway; a message broker should be preferred for those tasks.)
Error handling for a failing leader has to be implemented manually, or concepts like health checks have to be utilized (see the sketch after this list).
A headless service needs to be created
StatefulSets do not provide any guarantees on the termination of pods when a StatefulSet is deleted. To achieve ordered and graceful termination of the pods in the StatefulSet, it is possible to scale the StatefulSet down to 0 prior to deletion.
See further limitations in the Kubernetes documentation on StatefulSets [1].
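Regarding the error-handling disadvantage above: a common way to recover a failed leader without writing re-election logic is to let Kubernetes restart the pod through a liveness probe. The following is a minimal, hypothetical sketch (not part of the manifests above); it assumes the leader sets a taskHealthy flag while its exclusive task is running and exposes a /healthz endpoint on port 8080 for the probe to call:

package main

import (
    "log"
    "net/http"
    "sync/atomic"
)

// taskHealthy is a hypothetical flag: the leader sets it to true while its
// exclusive task (cron scheduler, event listener, ...) is running fine
// and to false when the task has died.
var taskHealthy atomic.Bool

func main() {
    taskHealthy.Store(true)

    // A liveness probe pointing at /healthz restarts the pod when the
    // leader's task has failed, so the work is picked up again.
    http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
        if taskHealthy.Load() {
            w.WriteHeader(http.StatusOK)
            return
        }
        w.WriteHeader(http.StatusInternalServerError)
    })
    log.Fatal(http.ListenAndServe(":8080", nil))
}

A livenessProbe on the container pointing at this endpoint would restart web-0 whenever its exclusive task dies; since the restarted pod keeps its ordinal index, it remains the leader.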
Leader Election
Leader election describes the process of selecting a single instance of an application as the leader inside a multi-instance deployment environment.
Coordinate the actions performed by a collection of collaborating instances in a distributed application by electing one instance as the leader that assumes responsibility for managing the others. This can help to ensure that instances don’t conflict with each other, cause contention for shared resources, or inadvertently interfere with the work that other instances are performing.
- Microsoft Azure Documentation (https://docs.microsoft.com/en-us/azure/architecture/patterns/leader-election)
The Kubernetes API provides a simple leader election based upon a distributed lock resource. All replicas are candidates in a race to become the leader by attempting to mark a shared resource with their identifier. Kubernetes guarantees that only a single instance wins this race and becomes the leader. Once the election has finished, the leader periodically pings the Kubernetes API to renew its leader position, while all other candidates periodically try to acquire the lock themselves. This ensures that, if the leader fails, a new leader is elected quickly [2].
Advantages
The programming pattern is not limited to Kubernetes, but can be used in all environments. Nevertheless, the provided sample implementation uses the Kubernetes API and does not easily translate to other environments.
The approach is more error-resistant since failover is built in.
Multiple locks can be defined, so there can be multiple leader instances, each responsible for a subset of the tasks.
Disadvantages
The leadership state has to be handled with reactive programming patterns like event emitters, so that the program reacts to any change. This increases the complexity of the program.
Increases the load on the Kubernetes API server.
Implementation in Golang
The implementation of a leader election in Golang with Kubernetes can be done using the k8s.io packages. It consists of the LeaderElection struct (an event emitter) and multiple listeners. This section includes example source code showing how to implement a leader election using the Kubernetes API.
package leaderelection

import (
    "context"
    "log"
    "time"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/rest"
    "k8s.io/client-go/tools/leaderelection"
    "k8s.io/client-go/tools/leaderelection/resourcelock"
)

type KubernetesLeaderElection struct {
    lockName      string
    lockNamespace string
    podName       string
    isLeader      bool
    ctx           context.Context
    client        *kubernetes.Clientset
    listeners     []Listener
}

type Listener interface {
    // should be idempotent
    Start()
    // should be idempotent
    Stop()
    IsRunning() bool
}
// podName: provide the name of the pod or any random string. It is only necessary
// that the podName differs between all deployed instances.
// lockName: any string. Inside a program, multiple leaders can be defined by using a different lock name for each.
// Nevertheless, in most situations a single leader is enough.
// namespace: name of the namespace the pod is running in. The used service account needs to have access to the lease resource
// in this namespace, see rbac.yaml.
func NewKubernetesLeaderElection(ctx context.Context, podName, lockName, namespace string) *KubernetesLeaderElection {
    config, err := rest.InClusterConfig()
    if err != nil {
        log.Fatal(err)
    }
    client := kubernetes.NewForConfigOrDie(config)
    return &KubernetesLeaderElection{
        lockName:      lockName,
        lockNamespace: namespace,
        ctx:           ctx,
        client:        client,
        podName:       podName,
    }
}
// RunElection starts the election. This method blocks; start it on a goroutine.
func (k *KubernetesLeaderElection) RunElection() {
    log.Println("running leader election")
    lock := k.getNewLock()
    leaderelection.RunOrDie(k.ctx, leaderelection.LeaderElectionConfig{
        Lock:            lock,
        ReleaseOnCancel: true,
        LeaseDuration:   15 * time.Second,
        RenewDeadline:   10 * time.Second,
        RetryPeriod:     2 * time.Second,
        Callbacks: leaderelection.LeaderCallbacks{
            OnStartedLeading: func(c context.Context) {
                log.Println("elected as the leader")
                k.isLeader = true
                // call Start on each listener
                for i, lst := range k.listeners {
                    log.Println("starting listener:", i)
                    // each listener runs in its own goroutine
                    go lst.Start()
                }
            },
            OnNewLeader: func(identity string) {
                if k.podName != identity {
                    log.Println("new leader:", identity)
                }
            },
            OnStoppedLeading: func() {
                log.Println("stopped leading")
                k.isLeader = false
                // call Stop on each listener
                for i, lst := range k.listeners {
                    log.Println("stopping listener:", i)
                    lst.Stop()
                }
            },
        },
    })
}
func (k *KubernetesLeaderElection) IsLeader() bool {
    return k.isLeader
}

// AddListener adds a new listener to the election object.
// The listener gets notified when the current pod becomes the leader or stops being the leader.
func (k *KubernetesLeaderElection) AddListener(listener Listener) {
    k.listeners = append(k.listeners, listener)
    if k.isLeader {
        go listener.Start()
    }
}

func (k *KubernetesLeaderElection) getNewLock() *resourcelock.LeaseLock {
    return &resourcelock.LeaseLock{
        LeaseMeta: metav1.ObjectMeta{
            Name:      k.lockName,
            Namespace: k.lockNamespace,
        },
        Client: k.client.CoordinationV1(),
        LockConfig: resourcelock.ResourceLockConfig{
            Identity: k.podName,
        },
    }
}
A new leader election object gets created using the following variables:
podName: the name of the pod or any random string. It is only necessary that the podName differs between all deployed instances.
lockName: the name of the lock the program is attempting to acquire leadership over.
namespace: the namespace the pod is running in.
The listener is a wrapper struct around a single function which should only get executed by the leader. A simple Listener implementation might look like the following. Its purpose is to encapsulate the isRunning state, which simplifies the process of defining idempotent functions.
package leaderelection

type SimpleListener struct {
    do         func(cancelChan <-chan struct{})
    isRunning  bool
    cancelChan chan struct{}
}

func NewListener(do func(cancelChan <-chan struct{})) Listener {
    return &SimpleListener{
        do:         do,
        isRunning:  false,
        cancelChan: make(chan struct{}),
    }
}

func (s *SimpleListener) Start() {
    if !s.isRunning {
        s.isRunning = true
        s.do(s.cancelChan)
    }
}

func (s *SimpleListener) Stop() {
    if s.isRunning {
        s.isRunning = false
        // signal the running function to clean up and return
        s.cancelChan <- struct{}{}
    }
}

func (s *SimpleListener) IsRunning() bool {
    return s.isRunning
}
Finally, using the defined structs and methods is as simple as writing a single function which listens to the cancel channel and executes the program logic.
package main

import (
    "context"
    "os"
    // the leaderelection package from above gets imported here
)

func main() {
    podName := os.Getenv("POD_NAME")    // gets defined in deployment.yaml
    namespace := os.Getenv("NAMESPACE") // gets defined in deployment.yaml
    leaderElection := leaderelection.NewKubernetesLeaderElection(context.Background(), podName, "leaderelection", namespace)
    listener := leaderelection.NewListener(func(cancelChan <-chan struct{}) {
        // init cronjob
        <-cancelChan // block till the leader changes - might be used inside a select statement if listening on two channels
        // run any cleanups, like closing the websocket connection or canceling the cronjob
    })
    leaderElection.AddListener(listener)
    leaderElection.RunElection() // blocks
}
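To give the listener body a bit more shape, here is a hedged sketch of a periodic, leader-only task built on a time.Ticker that shuts down cleanly when leadership is lost. It assumes the time package is imported and that doPeriodicWork is a hypothetical placeholder for the actual work; the five-minute interval is an assumption as well:

// a possible listener body for the example above
listener := leaderelection.NewListener(func(cancelChan <-chan struct{}) {
    ticker := time.NewTicker(5 * time.Minute) // interval is an assumption
    defer ticker.Stop()
    for {
        select {
        case <-ticker.C:
            doPeriodicWork() // hypothetical leader-only task
        case <-cancelChan:
            // leadership lost: stop the ticker and return
            return
        }
    }
})

Because SimpleListener.Stop sends on the cancel channel, the select statement guarantees that the signal is picked up and the function returns promptly.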
The last step is the deployment of the application. Since the application uses Kubernetes resources to handle the leader election process, the pod requires a new set of permissions. Kubernetes uses role-based access control (RBAC); therefore, a new role and a new service account need to be created, which give access to the leases resource.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: your-name
  namespace: your-namespace
spec:
  replicas: 3
  selector:
    matchLabels:
      app: your-app
  template:
    metadata:
      labels:
        app: your-app
    spec:
      automountServiceAccountToken: true
      serviceAccountName: leaderelection-sa
      containers:
        - name: your-app
          image: your-container-image
          # INTERESTING PART
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: leaderelection-sa
  namespace: your-namespace
automountServiceAccountToken: true
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: leaderelection-role
  namespace: your-namespace
rules:
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs:
      - '*'
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: leaderelection-rolebinding
  namespace: your-namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: leaderelection-role
subjects:
  - kind: ServiceAccount
    name: leaderelection-sa
    namespace: your-namespace
Conclusion
Kubernetes provides StatefulSets, which make it easy to implement tasks that should only be executed in a single instance of a multi-instance deployment. This approach is sufficient for a wide range of applications and use cases. If a more error-resistant approach or a faster deployment is needed, the leader election pattern might be more suitable. Some Kubernetes knowledge is required to create and set up either synchronization approach.
Resources
[1] https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
[2] https://kubernetes.io/blog/2016/01/simple-leader-election-with-kubernetes/