Making Sense Out of Native Sidecar Containers in Kubernetes

Kubernetes 1.28 introduced a new type of container - "native" sidecars. Why was this addition needed? How do the new sidecar containers compare to regular and init containers? What use cases does this new type of container enable? And most importantly, how can I get my hands dirty with the hot new feature? Let's find out!

First, a potentially ~~disturbing~~ surprising fact: the 1.28 release didn't add a new Sidecar type or even a sidecarContainers field to the Pod spec - the so-called "native sidecar" containers are just the same old init containers but with some new properties, which make them behave unlike any other type of container. At first glance, this may seem like a bad design decision, but once you wrap your head around it, you'll see that it's actually a clever and future-proof solution to a rather delicate architectural problem.

Before diving into the details of the new sidecar containers, let's quickly recap the original difference between the regular and init containers in Kubernetes.

Regular vs. init containers

Every Pod must have at least one regular container defined in its .spec.containers list. When a Pod has multiple regular containers, they all start and run concurrently, and if some of the Pod's containers terminate, they become subject to the Pod-wide restart policy:

pod-01.yaml

apiVersion: v1
kind: Pod
metadata:
  name: pod-01
spec:
  containers:
  - name: app
    image: app:latest
  - name: cache
    image: cache:latest
  restartPolicy: Always  # or OnFailure | Never

Multiple concurrent and restartable containers deployed together as a single "unit" is a powerful abstraction, but there are also situations when some of the containers in such a group may need to:

Start before others (i.e., the startup order becomes important).
Run to completion (i.e., restart upon termination is undesirable).

Enforcing the startup order of regular containers is tricky but doable - a bunch of ugly-looking shell scripts usually do the trick. However, running some of the containers to completion while keeping others restartable is a much harder problem to solve. Simply setting the restart policy to OnFailure is not good enough because when one of the regular containers terminates, the Pod stops being ready, meaning that it's no longer able to serve traffic even if its primary container is still up and running.
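To make the "shell script" workaround more concrete, here is a minimal sketch of one common flavor of it: the dependent container polls for a marker file on a shared volume before starting its real work. The Pod name, the /shared/ready path, and the sleep-based stand-ins for the actual workloads are all made up for illustration.

```yaml
# Illustrative only: names, paths, and commands are assumptions, not a recommended pattern.
apiVersion: v1
kind: Pod
metadata:
  name: startup-order-demo
spec:
  volumes:
  - name: shared
    emptyDir: {}
  containers:
  - name: cache
    image: alpine:3
    # Stand-in for a dependency that needs ~10s to warm up, then signals readiness via a file.
    command: ["sh", "-c", "sleep 10; touch /shared/ready; sleep 3600"]
    volumeMounts:
    - name: shared
      mountPath: /shared
  - name: app
    image: alpine:3
    # Both containers start concurrently, so the app has to wait on its own.
    command: ["sh", "-c", "until [ -f /shared/ready ]; do sleep 1; done; echo cache is up; sleep 3600"]
    volumeMounts:
    - name: shared
      mountPath: /shared
```

Note that Kubernetes itself knows nothing about this ordering: it lives entirely in the containers' entrypoints, which is exactly why it tends to get ugly.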

A proper solution was needed to address the above problems, and that's how the init containers were born.

For quite some time now, the Kubernetes Pod spec has had another list - .spec.initContainers:

pod-02.yaml

apiVersion: v1
kind: Pod
metadata:
  name: pod-02
spec:
  initContainers:
  - name: config-loader
    image: config-loader:latest
  containers:
  - name: app
    image: app:latest
  - name: cache
    image: cache:latest
  restartPolicy: Always  # or OnFailure | Never

Even though the elements of this new list have literally the same set of attributes as the regular containers, the init containers behave differently.

Init containers:

Triggered before regular containers.
Started one by one in the order defined by their list.
Run to completion before the next init container starts.

...and because of the above design, init containers:

Can't respect the Pod's restart policy if it's set to Always - init containers then fall back to OnFailure.
After their (successful) termination, don't affect the Pod's readiness - otherwise, the Pod with init containers would never become ready.
Don't allow specifying the startup, liveness, and readiness probes - because they would be useless anyway.

Sidecar containers

Historically, init containers were used to perform some auxiliary "one-off" tasks before the main application container startup:

Retrieving secrets and establishing the Pod's identity.
Fetching application configuration from a remote source.
Running scripts using tools and privileges that are too insecure for a long-running container.
etc.

But there is more auxiliary functionality that is beneficial to keep outside of the main application container but still in the same Pod:

Network proxies, adapters, and data transformers.
Logging, metrics, and tracing collectors.
Various monitoring agents.

And such functionality doesn't fit the init container model because the above containers:

Should start before the main container(s).
Have to live as long as the main application runs.
May or may not need to affect the Pod's readiness.
Should not block the Pod's termination.

Expectedly, over time, another architectural pattern, perfectly describing the above requirements, emerged - Sidecar Containers. However, up until the 1.28 release, sidecars in Kubernetes were implemented using regular containers. But, since there is no startup order guarantee for regular containers, and there is just one restart policy to rule them all, engineers had to come up with various workarounds to make sidecar containers behave as expected.
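One such workaround, sketched below under a few assumptions, deals with the "should not block the Pod's termination" requirement: the main container leaves a marker file on a shared volume when it's done, and the sidecar (a regular container here) keeps checking for it and exits on its own. The Pod name, the /signal/done path, and the sleep-based commands are hypothetical stand-ins.

```yaml
# Illustrative pre-1.28 workaround: names, paths, and commands are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: legacy-sidecar-demo
spec:
  restartPolicy: Never
  volumes:
  - name: signal
    emptyDir: {}
  containers:
  - name: app
    image: alpine:3
    # Do the "real" work, then tell the sidecar it may stop.
    command: ["sh", "-c", "sleep 30; touch /signal/done"]
    volumeMounts:
    - name: signal
      mountPath: /signal
  - name: sidecar
    image: alpine:3
    # Stand-in for a proxy or log shipper: runs until the marker file appears.
    command: ["sh", "-c", "until [ -f /signal/done ]; do sleep 1; done"]
    volumeMounts:
    - name: signal
      mountPath: /signal
```

Without the marker file, the sidecar would run forever and the Pod would never complete, even though the main container finished long ago.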

So, how has Kubernetes 1.28 changed the situation?

Native sidecar containers

The 1.28 release didn't add a new Sidecar type or even a .spec.sidecarContainers list. Instead, it introduced a new restartPolicy attribute for... containers!

In addition to the Pod-wide restart policy, now containers can have their own restart policy, but only if:

It's an init container.
The value is set to Always.

pod-03.yaml

apiVersion: v1
kind: Pod
metadata:
  name: pod-03
spec:
  initContainers:
  - name: proxy
    image: envoy:latest
    restartPolicy: Always                           # Required
    startupProbe | readinessProbe | livenessProbe:  # Optional
      ...
  - name: config-loader
    image: config-loader:latest
  containers:
  - name: app
    image: app:latest
  - name: cache
    image: cache:latest
  restartPolicy: Always  # or OnFailure | Never

So, what's the difference between traditional init containers and init containers with the restartPolicy: Always attribute?

The new type of init containers:

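Don't block the startup of the next (init or regular) container: once a sidecar has started (and passed its startup probe, if it defines one), the kubelet moves on.
Are restarted upon termination.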
Support startup, readiness, and liveness probes!
Terminate after all regular containers are done.

In other words, the only "true init" thing about the new type of containers is that they still respect the startup order. The rest of the behavior seems very different (if not opposite) to the traditional init logic. However, the new behavior is exactly what one would expect from a Sidecar container.
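For a more down-to-earth illustration, here is a minimal sketch of a log-shipping sidecar written the "native" way. The Pod name, the log path, and the tail-based "shipper" are assumptions made up for this example, and it needs a cluster where the SidecarContainers feature is available (alpha and behind a feature gate in 1.28, enabled by default starting from 1.29).

```yaml
# Illustrative only: names, paths, and commands are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: native-sidecar-demo
spec:
  restartPolicy: Never
  volumes:
  - name: logs
    emptyDir: {}
  initContainers:
  - name: log-shipper
    image: alpine:3
    restartPolicy: Always  # This line turns the init container into a sidecar.
    # Stand-in for a real log shipper: create the log file, then follow it.
    command: ["sh", "-c", "touch /var/log/app/app.log; tail -F /var/log/app/app.log"]
    startupProbe:
      exec:
        # The app container is started only after this check succeeds.
        command: ["sh", "-c", "test -f /var/log/app/app.log"]
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  containers:
  - name: app
    image: alpine:3
    # Write a few log lines and exit.
    command: ["sh", "-c", "for i in 1 2 3 4 5; do echo tick $i >> /var/log/app/app.log; sleep 5; done"]
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
```

The sidecar is expected to start first, stay up while the app writes its lines, and then be stopped by the kubelet once the app container exits, with no marker files or entrypoint tricks involved.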

Yes, it's not the most obvious way to achieve the desired behavior. But this design paves the way for a more advanced type of container. KEP-753: Sidecar containers explains the motivation for reusing the initContainers list and even mentions a new type of containers called infrastructureContainers, which might be used to unify the behavior of the old init and new Sidecar containers in the future:

For this KEP it is important to have sidecar containers be defined among other init containers to be able to express the initialization order of containers. The name initContainers is not a good fit for sidecar containers as they typically do more than initialization. The better name can be “infrastructure” containers. The current idea is to implement sidecars as a part of initContainers and if this introduces too much trouble, the new collection name may replace the old collection name in future.

Here's how the new behavior can be visualized using a single Pod with carefully crafted containers:

Example Pod with native "sidecar" containers 📝

apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  restartPolicy: OnFailure
  initContainers:
  - name: ic1
    image: alpine:3
    command: [ "sh", "-c", "sleep 5" ]
  - name: sc1
    restartPolicy: Always
    image: alpine:3
    command: [ "sh", "-c", "sleep 20 && exit 1" ]
  - name: ic2
    image: alpine:3
    command: [ "sh", "-c", "sleep 5" ]
  - name: sc2
    restartPolicy: Always
    image: alpine:3
    command: [ "sh", "-c", "while true; do sleep 1; done" ]
    startupProbe:
      exec:
        command: ["/bin/sh", "-c", "sleep 5"]
      initialDelaySeconds: 0
      timeoutSeconds: 999
    readinessProbe:
      exec:
        command: ["/bin/sh", "-c", "sleep 5"]
      initialDelaySeconds: 0
      timeoutSeconds: 999
  containers:
  - name: rc1
    image: alpine:3
    command: [ "sh", "-c", "sleep 55" ]
    startupProbe:
      exec:
        command: ["/bin/sh", "-c", "sleep 5"]
      initialDelaySeconds: 0
      periodSeconds: 30
      timeoutSeconds: 999
    readinessProbe:
      exec:
        command: ["/bin/sh", "-c", "sleep 5"]
      initialDelaySeconds: 0
      periodSeconds: 30
      timeoutSeconds: 999
  - name: rc2
    image: alpine:3
    command: [ "sh", "-c", "sleep 60" ]
    startupProbe:
      exec:
        command: ["/bin/sh", "-c", "sleep 5"]
      initialDelaySeconds: 0
      periodSeconds: 30
      timeoutSeconds: 999
    readinessProbe:
      exec:
        command: ["/bin/sh", "-c", "sleep 5"]
      initialDelaySeconds: 0
      periodSeconds: 30
      timeoutSeconds: 999