Kubernetes Informers: Introduction and Deep Dive

By Mario Macías Lloret, Farhan

This article shows you the tool that the Kubernetes Go client library provides to keep an updated in-memory snapshot of your cluster resources.

In the code examples, we use the following package aliases:

import (
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    appsv1 "k8s.io/api/apps/v1"
    corev1 "k8s.io/api/core/v1"
)

Motivation

If your Go program requires fetching information about K8s resources (Services, ReplicaSets, Pods...), you can use the Kubernetes REST API with the official K8s Go client instance:

// gets the information of a given pod in the default namespace
pod, err := client.CoreV1().Pods("default").
    Get(context.Background(), "pod-name", metav1.GetOptions{})

// gets the information of all the currently existing pods in all the
// namespaces
pods, err := client.CoreV1().Pods(corev1.NamespaceAll).
    List(context.Background(), metav1.ListOptions{})

(Client configuration/instantiation details are omitted for the sake of brevity).

However, you might want to minimize the number of connections as well as the latency of fetching the resources' data. To do so, you can use the Watch interface to listen for changes in the Kubernetes resources and keep an in-memory copy of them:

// ignoring returned error on purpose
watcher, _ := client.CoreV1().Pods(corev1.NamespaceAll).
    Watch(context.Background(), metav1.ListOptions{})
for event := range watcher.ResultChan() {
    // event.Object is not a Pod for some event types (e.g. Error events)
    pod, ok := event.Object.(*corev1.Pod)
    if !ok {
        continue
    }
    fmt.Printf("%v pod with name %s\n", event.Type, pod.Name)
}

The above code would print something similar to this:

ADDED pod with name openshift-controller-manager-operator-6b4884d944-gbj2n
ADDED pod with name installer-3-ip-10-0-131-5.ec2.internal
ADDED pod with name kube-storage-version-migrator-operator-684c8fbd9-fw6p8
ADDED pod with name apiserver-86b697ffcb-424gl
ADDED pod with name kube-controller-manager-ip-10-0-131-5.ec2.internal
ADDED pod with name cluster-autoscaler-operator-558c76fc6-l4xwk
ADDED pod with name node-exporter-wjks6

To keep an in-memory copy of all the pods in your cluster, you would need to check the event.Type value (Added, Deleted, Modified...) and then accordingly update a map that stores the pods data, indexed by a unique field (e.g. pod namespace+name).

You would need to repeat the same code again and again for every resource you want to keep in memory, including:

  • Managing the in-memory storage and the indexing, as well as the concurrent access to it, if needed.
  • Establishing the connection to watch the different resources, as well as managing reconnections.
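To make that bookkeeping concrete, here is a minimal sketch of the manual approach described above. It uses only the standard library, and the Pod, Event, and PodStore types are hypothetical, simplified stand-ins for corev1.Pod, watch.Event, and your own storage (they are not part of client-go). Reconnection handling is left out entirely.

```go
package main

import (
	"fmt"
	"sync"
)

// EventType mirrors the watch event types printed above (simplified).
type EventType string

const (
	Added    EventType = "ADDED"
	Modified EventType = "MODIFIED"
	Deleted  EventType = "DELETED"
)

// Pod is a hypothetical, simplified stand-in for corev1.Pod.
type Pod struct {
	Namespace, Name, IP string
}

// Event is a hypothetical, simplified stand-in for watch.Event.
type Event struct {
	Type EventType
	Pod  Pod
}

// PodStore is a concurrency-safe in-memory copy of the pods,
// keyed by "namespace/name".
type PodStore struct {
	mu   sync.RWMutex
	pods map[string]Pod
}

func NewPodStore() *PodStore {
	return &PodStore{pods: map[string]Pod{}}
}

// Apply updates the store according to the event type.
func (s *PodStore) Apply(ev Event) {
	key := ev.Pod.Namespace + "/" + ev.Pod.Name
	s.mu.Lock()
	defer s.mu.Unlock()
	switch ev.Type {
	case Added, Modified:
		s.pods[key] = ev.Pod
	case Deleted:
		delete(s.pods, key)
	}
}

// Get returns the stored pod for a "namespace/name" key, if any.
func (s *PodStore) Get(key string) (Pod, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	p, ok := s.pods[key]
	return p, ok
}

func main() {
	store := NewPodStore()
	events := []Event{
		{Type: Added, Pod: Pod{Namespace: "default", Name: "web", IP: "10.0.0.1"}},
		{Type: Modified, Pod: Pod{Namespace: "default", Name: "web", IP: "10.0.0.2"}},
		{Type: Added, Pod: Pod{Namespace: "default", Name: "db", IP: "10.0.0.3"}},
		{Type: Deleted, Pod: Pod{Namespace: "default", Name: "db", IP: "10.0.0.3"}},
	}
	for _, ev := range events {
		store.Apply(ev)
	}
	p, ok := store.Get("default/web")
	fmt.Println(p.IP, ok)
}
```

Even this toy version needs a lock, a key convention, and a switch over event types, and all of it would have to be duplicated for every resource type.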

Informers to the rescue

To minimize boilerplate and repetitive code, the Kubernetes Go library provides entities named Informers, which continuously watch for updates to your Kubernetes resources (additions, deletions, modifications...) and keep an in-memory copy of them, which can be retrieved by a given index.

You can create informers for each resource type by means of an informer factory. For example, the following code would create a pods' informer:

// resync period: every 10 minutes, the cached objects are re-delivered to the event handlers
factory := informers.NewSharedInformerFactory(client, 10*time.Minute)
podsInformer := factory.Core().V1().Pods().Informer()

The following code would start the above informer (and any other informer created by the factory) and wait until it gets a complete in-memory copy of your Pods:

stopCh := make(chan struct{})
factory.Start(stopCh) // runs in background
factory.WaitForCacheSync(stopCh)

Even after waiting for the cache synchronization, all the informers created by the factory would stay in background, updating the memory with any change in the cluster pods. You can close the stopCh to interrupt the background execution of all the informers.
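The stop-channel pattern used by the factory can be sketched with the standard library alone. The snippet below is only an illustration of the idea, not client-go code: runWorker and the updates channel are hypothetical names standing in for an informer's background loop and the stream of cluster changes.

```go
package main

import "fmt"

// runWorker counts the updates it receives until stopCh is closed,
// then reports the count on done and returns.
func runWorker(updates <-chan string, stopCh <-chan struct{}, done chan<- int) {
	count := 0
	for {
		select {
		case <-stopCh:
			done <- count
			return
		case <-updates:
			count++
		}
	}
}

func main() {
	updates := make(chan string) // unbuffered: a send returns once it is received
	stopCh := make(chan struct{})
	done := make(chan int, 1)

	go runWorker(updates, stopCh, done) // runs in background

	for _, u := range []string{"pod-a", "pod-b", "pod-c"} {
		updates <- u
	}
	close(stopCh) // interrupts the background worker
	fmt.Println("updates processed before stop:", <-done)
}
```

Closing the channel, rather than sending a value on it, lets any number of background goroutines observe the stop signal at once.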

By default, the informers for namespaced resources store them using the namespace/name string as key. You can retrieve any pod by its namespace and name in the following way:

// ignoring returned ok and err for brevity
podItem, _, _ := podsInformer.GetIndexer().GetByKey(namespace + "/" + name)
pod := podItem.(*corev1.Pod)
fmt.Println("The Pod IP is", pod.Status.PodIP)

Observe that, because the informer API is not generic, you still need to deal with some interface{} types and type assertions.

Now imagine that, in addition to accessing your pods by name, you would like to get them indexed by IP address. You can add a new indexer before you start the Informer factory. A new Pod indexer receives a *Pod instance and returns a list of string values that can be used as index keys for that Pod; in our case, the list of IPs of the pod.

// arbitrary unique name for the new indexer
const ByIP = "IndexByIP"
podsInformer.AddIndexers(map[string]cache.IndexFunc{
    ByIP: func(obj interface{}) ([]string, error) {
        var ips []string
        for _, ip := range obj.(*corev1.Pod).Status.PodIPs {
            ips = append(ips, ip.IP)
        }
        return ips, nil
    },
})

When the informer is started, any new pod will be indexed in two ways: by its namespace/name and by any of its IP addresses.

Now, to retrieve a Pod by its IP, we need to query the IndexByIP index that we added previously:

items, err := podsInformer.GetIndexer().ByIndex(ByIP, ip)

Usually, items would be a zero-length slice if there is no pod with the given IP, or a single-item slice for most existing Pods.

However, in special cases like host-networked pods, which share the same host IP, it would contain all the Pods sharing that IP.
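To see why one index value can map to several objects, here is a simplified, standard-library-only model of a secondary index. IndexedStore, its methods, and the Pod type are hypothetical and much simpler than the real client-go indexer. The sketch assumes only that an index function returns zero or more string values per object, as described above.

```go
package main

import (
	"fmt"
	"sort"
)

// Pod is a hypothetical, simplified stand-in for corev1.Pod.
type Pod struct {
	Namespace, Name string
	IPs             []string
}

// IndexFunc returns the index values for a pod (same idea as cache.IndexFunc).
type IndexFunc func(p Pod) []string

// IndexedStore keeps pods by primary key plus one secondary index.
type IndexedStore struct {
	items   map[string]Pod
	indexFn IndexFunc
	index   map[string]map[string]struct{} // index value -> set of primary keys
}

func NewIndexedStore(fn IndexFunc) *IndexedStore {
	return &IndexedStore{
		items:   map[string]Pod{},
		indexFn: fn,
		index:   map[string]map[string]struct{}{},
	}
}

// Add stores or replaces a pod and refreshes its secondary index entries.
func (s *IndexedStore) Add(p Pod) {
	key := p.Namespace + "/" + p.Name
	if old, ok := s.items[key]; ok {
		// drop stale index entries of the previous version
		for _, v := range s.indexFn(old) {
			delete(s.index[v], key)
		}
	}
	s.items[key] = p
	for _, v := range s.indexFn(p) {
		if s.index[v] == nil {
			s.index[v] = map[string]struct{}{}
		}
		s.index[v][key] = struct{}{}
	}
}

// ByIndex returns the sorted primary keys of all pods with the given index value.
func (s *IndexedStore) ByIndex(value string) []string {
	keys := make([]string, 0, len(s.index[value]))
	for k := range s.index[value] {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	store := NewIndexedStore(func(p Pod) []string { return p.IPs })
	// two host-networked pods share the node IP
	store.Add(Pod{Namespace: "kube-system", Name: "proxy-1", IPs: []string{"10.0.0.5"}})
	store.Add(Pod{Namespace: "kube-system", Name: "proxy-2", IPs: []string{"10.0.0.5"}})
	store.Add(Pod{Namespace: "default", Name: "web", IPs: []string{"172.16.0.9"}})

	fmt.Println(store.ByIndex("10.0.0.5"))   // two keys
	fmt.Println(store.ByIndex("172.16.0.9")) // one key
	fmt.Println(store.ByIndex("192.0.2.1"))  // no keys
}
```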

In addition, you can tell the factory to create Informers for other resources, which work analogously to Pod informers:

replicaSetInformer := factory.Apps().V1().ReplicaSets().Informer()
servicesInformer := factory.Core().V1().Services().Informer()

Conclusions

  • The Informers from the Go Kubernetes library help us with the boilerplate of having to keep an in-memory copy of our Kubernetes resources.
  • The Informers library is flexible enough to extend it for our own use cases, such as indexing by arbitrary fields, and even providing a different storage layer (not shown in this article).
  • The Informers API predates Go generics and relies on interface{} values; a generic API could provide even cleaner code with type safety.

Deep Dive

Kubernetes Informers

I wanted to understand more about how Kubernetes controllers are implemented. Building controllers with controller-runtime is pretty easy but it masks many details on how the event-oriented architecture of client-go works underneath.

What are informers?

The vital role of a Kubernetes controller is to watch objects for the desired state and the actual state, then send instructions to make the actual state be more like the desired state. But how does the controller retrieve the object's information?

In order to retrieve an object's information, the controller sends a request to Kubernetes API server.

However, continuously polling the API server for resource information can degrade its performance. To stay informed about changes without polling, client-go provides Informers. Informers query the resource data and store it in a local cache. Once the data is stored, an event is generated only when a change in the object (or resource) state is detected.

If you are confused on how this glues in to a controller, here is a diagram to explain the flow.

[Figure: Client-go Controller Interaction]

How Does It Work?

A single informer creates a local cache for itself. In reality, though, a single resource could be watched by multiple controllers. If each controller created a cache for itself, there would be synchronisation issues, as every controller would be watching its own cache. To avoid this, client-go provides a Shared Informer, so that the cache is shared amongst all controllers. Every built-in Kubernetes resource has an Informer.
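The idea of one cache serving several controllers can be modelled in a few lines. This is a simplified sketch with hypothetical names (SharedCache, Handler, Set); it is not how client-go implements shared informers, and it is not concurrency-safe. It only shows a single store notifying every registered handler, and only when a value actually changes.

```go
package main

import "fmt"

// Handler is called whenever an object is added or changed.
type Handler func(key, value string)

// SharedCache models a shared informer: one store, many handlers.
type SharedCache struct {
	store    map[string]string
	handlers []Handler
}

func NewSharedCache() *SharedCache {
	return &SharedCache{store: map[string]string{}}
}

// AddHandler registers one more consumer of the same cache.
func (c *SharedCache) AddHandler(h Handler) {
	c.handlers = append(c.handlers, h)
}

// Set stores the value and notifies all handlers, but only if the
// value differs from what is already cached. It reports whether
// handlers were notified.
func (c *SharedCache) Set(key, value string) bool {
	if old, ok := c.store[key]; ok && old == value {
		return false
	}
	c.store[key] = value
	for _, h := range c.handlers {
		h(key, value)
	}
	return true
}

func main() {
	cache := NewSharedCache()
	cache.AddHandler(func(k, v string) { fmt.Println("controller A saw", k, "=", v) })
	cache.AddHandler(func(k, v string) { fmt.Println("controller B saw", k, "=", v) })

	cache.Set("default/web", "rev-1") // both handlers are notified
	cache.Set("default/web", "rev-1") // unchanged: nobody is notified
	cache.Set("default/web", "rev-2") // both handlers are notified again
}
```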

The informer mechanism has three components:

  • Reflector: Watches a specific resource type (for example, a given CRD) and puts events, such as Added, Updated, and Deleted, into the DeltaFIFO queue.
  • DeltaFIFO: A FIFO queue that stores the related resource events.
  • Indexer: The local storage implemented by client-go. It is kept consistent with etcd, which reduces the pressure on the API server and etcd.
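The flow between these three components can be sketched as a small pipeline. This is a deliberately simplified model built on the standard library: a buffered channel stands in for the DeltaFIFO, a plain map stands in for the Indexer, and the reflector and process functions are hypothetical names. The real client-go types are considerably more involved.

```go
package main

import "fmt"

// DeltaType names the kind of change, as in the list above.
type DeltaType string

const (
	Added   DeltaType = "ADDED"
	Updated DeltaType = "UPDATED"
	Deleted DeltaType = "DELETED"
)

// Delta is one change to one object (hypothetical, simplified).
type Delta struct {
	Type DeltaType
	Key  string // "namespace/name"
	Obj  string // stand-in for the object itself
}

// reflector plays the role of the Reflector: it pushes the changes it
// observes into the queue, in order, and closes the queue when done.
func reflector(observed []Delta, queue chan<- Delta) {
	for _, d := range observed {
		queue <- d
	}
	close(queue)
}

// process pops deltas in FIFO order, applies them to the indexer
// (a plain map here), and returns a log of the handled events.
func process(queue <-chan Delta, indexer map[string]string) []string {
	var handled []string
	for d := range queue {
		switch d.Type {
		case Added, Updated:
			indexer[d.Key] = d.Obj
		case Deleted:
			delete(indexer, d.Key)
		}
		handled = append(handled, string(d.Type)+" "+d.Key)
	}
	return handled
}

func main() {
	queue := make(chan Delta, 16) // stands in for the DeltaFIFO
	indexer := map[string]string{}

	go reflector([]Delta{
		{Type: Added, Key: "default/web", Obj: "rev-1"},
		{Type: Updated, Key: "default/web", Obj: "rev-2"},
		{Type: Added, Key: "default/db", Obj: "rev-1"},
		{Type: Deleted, Key: "default/db"},
	}, queue)

	for _, line := range process(queue, indexer) {
		fmt.Println(line)
	}
	fmt.Println("cached objects:", len(indexer))
}
```

Decoupling the watcher from the consumer through a queue means a slow event handler delays processing but does not block the component that receives changes from the API server, at least until the queue fills up in this toy model.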

To create a shared informer for arbitrary resources, you can use the NewDynamicSharedInformerFactory function (or its filtered variant, NewFilteredDynamicSharedInformerFactory), available in the k8s.io/client-go/dynamic/dynamicinformer package, which returns a factory of dynamic informers. A DynamicSharedInformerFactory provides access to a shared informer and lister for a dynamic client.

stopCh := make(chan struct{})
defer close(stopCh)

factory := dynamicinformer.NewFilteredDynamicSharedInformerFactory(
	dynamicClient,
	resyncPeriod,
	namespace,
	nil, // no extra list options
)
// gvr is the schema.GroupVersionResource of the resource to watch
informer := factory.ForResource(gvr).Informer()

Let's break this code down. The factory is created for a particular namespace. If you'd like to watch all namespaces, you can set namespace to an empty string. The factory then gives you an informer for the resource you want to watch. Once the informer is set, you need to add an event handler to inject the logic that you'd like to execute when an object is Added/Updated/Deleted. You can do that easily:

var handler cache.ResourceEventHandlerFuncs
handler.AddFunc = func(obj interface{}) {
	log.Info("add event")
}
handler.UpdateFunc = func(old, new interface{}) {
	log.Info("update event")
}
handler.DeleteFunc = func(obj interface{}) {
	log.Info("delete event")
}

informer.AddEventHandler(handler)

// starts, in the background, every informer requested from the factory
factory.Start(stopCh)

Pretty simple, right? Once this is done, you are ready to watch for events in the specified namespace. Understanding the working layers of client-go gives you the means to interact with the API server and build custom tools of your own. One such tool that I've written is called kubediff.

Writing Your Own Informers

kubediff is a Kubernetes resource diff watcher, with the ability to send event notifications to Slack or webhooks. Using the same informer logic, kubediff watches all resources (including CRDs) and generates events, which you can send to a webhook or a Slack channel. It logs the events in JSON format, which makes it easy to send the logs directly to your preferred logging stack and then view the diff in objects whenever they are updated.
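As an illustration of what an update handler could do with the old and new versions of an object, here is a minimal sketch that diffs two flat field maps and prints the result as JSON. This is not kubediff's actual implementation: the Change type, the Diff function, and the flat map representation are assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// Change describes one field that differs between two versions of an object.
type Change struct {
	Field string `json:"field"`
	Old   string `json:"old,omitempty"`
	New   string `json:"new,omitempty"`
}

// Diff compares two flat field maps and returns the changed, removed,
// and added fields, sorted by field name.
func Diff(oldObj, newObj map[string]string) []Change {
	var changes []Change
	for field, oldVal := range oldObj {
		newVal, ok := newObj[field]
		if !ok || newVal != oldVal {
			// changed or removed (newVal is "" when removed)
			changes = append(changes, Change{Field: field, Old: oldVal, New: newVal})
		}
	}
	for field, newVal := range newObj {
		if _, ok := oldObj[field]; !ok {
			// added
			changes = append(changes, Change{Field: field, New: newVal})
		}
	}
	sort.Slice(changes, func(i, j int) bool { return changes[i].Field < changes[j].Field })
	return changes
}

func main() {
	oldObj := map[string]string{"image": "nginx:1.24", "replicas": "2"}
	newObj := map[string]string{"image": "nginx:1.25", "replicas": "2", "paused": "true"}

	out, err := json.Marshal(Diff(oldObj, newObj))
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(string(out))
}
```

In an informer, a function like this would be called from UpdateFunc with representations of the old and new objects.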

You can also create a watch on a single namespace or on multiple namespaces. If you want updates on all events, you can simply run kubediff in watch mode and it will notify you whenever an object is Created/Deleted/Updated.

If you'd like to understand more about informers and play with them, do check out kubediff: clone or fork it and tweak the informer settings to explore the event-driven mechanism implemented in client-go.

GitHub - arriqaaq/kubediff: A Kubernetes Resource Diff