Making Sense Out of Native Sidecar Containers in Kubernetes

Kubernetes 1.28 introduced a new type of container - "native" sidecars. Why was this addition needed? How do the new sidecar containers compare to regular and init containers? What use cases does this new type of container enable? And most importantly, how can I get my hands dirty with the hot new feature? Let's find out!

First, a potentially surprising fact: the 1.28 release didn't add a new Sidecar type or even a sidecarContainers field to the Pod spec - the so-called "native sidecar" containers are just the same old init containers but with some new properties, which make them behave unlike any other type of container. At first glance, this may seem like a bad design decision, but when you wrap your head around it, you'll see that it's actually a clever and future-proof solution to a rather delicate architectural problem.

Before diving into the details of the new sidecar containers, let's quickly recap the original difference between the regular and init containers in Kubernetes.


    Regular vs. init containers

    Every Pod must have at least one regular container defined in its .spec.containers list. When a Pod has multiple regular containers, they all start and run concurrently, and if some of the Pod's containers terminate, they become subject to the Pod-wide restart policy:

    pod-01.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: pod-01
    spec:
      containers:
      - name: app
        image: app:latest
      - name: cache
        image: cache:latest
      restartPolicy: Always  # or OnFailure | Never
    

    Multiple concurrent and restartable containers deployed together as a single "unit" is a powerful abstraction, but there are also situations when some of the containers in such a group may need to:

    • Start before others (i.e., the startup order becomes important).
    • Run to completion (i.e., restart upon termination is undesirable).

    Enforcing the startup order of regular containers is tricky but doable - a bunch of ugly-looking shell scripts usually do the trick. However, running some of the containers to completion while keeping others restartable is a much harder problem to solve. Simply setting the restart policy to OnFailure is not good enough because when one of the regular containers terminates, the Pod stops being ready, meaning that it's no longer able to serve traffic even if its primary container is still up and running.
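
    As a rough sketch of the first workaround, the app container below wraps its entrypoint in a shell loop that waits for the cache to accept connections. The port 6379, the /app/server path, and the availability of sh and nc in the image are assumptions made for illustration, not details of the manifest above:

    pod-01-ordered.yaml (illustrative)
    apiVersion: v1
    kind: Pod
    metadata:
      name: pod-01-ordered            # hypothetical name
    spec:
      containers:
      - name: cache
        image: cache:latest
      - name: app
        image: app:latest
        # Assumed: the image ships sh and an nc that supports -z,
        # the cache listens on 6379, and /app/server is the real entrypoint.
        command: [ "sh", "-c", "until nc -z 127.0.0.1 6379; do sleep 1; done; exec /app/server" ]
      restartPolicy: Always

    A wrapper like this only orders the startup. It does nothing for the run-to-completion problem, and it ties the manifest to the tools baked into the image.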

    A proper solution was needed to address the above problems, and that's how the init containers were born.

    At some point, the Kubernetes Pod spec got another list - .spec.initContainers:

    pod-02.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: pod-02
    spec:
      initContainers:
      - name: config-loader
        image: config-loader:latest
      containers:
      - name: app
        image: app:latest
      - name: cache
        image: cache:latest
      restartPolicy: Always  # or OnFailure | Never
    

    Even though the elements of this list have exactly the same set of attributes as regular containers, init containers behave differently.

    Init containers:

    • Triggered before regular containers.
    • Started one by one in the order defined by their list.
    • Run to completion before the next init container starts.

    ...and because of the above design, init containers:

    • Can't respect the Pod's restart policy if it's set to Always - init containers then fall back to OnFailure.
    • After their (successful) termination, don't affect the Pod's readiness - otherwise, the Pod with init containers would never become ready.
    • Don't allow specifying the startup, liveness, and readiness probes - because they would be useless anyway.

    Sidecar containers

    Historically, init containers were used to perform some auxiliary "one-off" tasks before the main application container startup:

    • Retrieving secrets and establishing the Pod's identity.
    • Fetching application configuration from a remote source.
    • Running scripts using tools and privileges that are too insecure for a long-running container.
    • etc.

    But there is more auxiliary functionality that is beneficial to keep outside of the main application container but still in the same Pod:

    • Network proxies, adapters, and data transformers.
    • Logging, metrics, and tracing collectors.
    • Various monitoring agents.

    And such functionality doesn't fit the init container model because the above containers:

    • Should start before the main container(s).
    • Have to live as long as the main application runs.
    • May or may not need to affect the Pod's readiness.
    • Should not block the Pod's termination.

    Unsurprisingly, an architectural pattern matching these requirements emerged over time - sidecar containers. However, up until the 1.28 release, sidecars in Kubernetes were implemented using regular containers. And since there is no startup order guarantee for regular containers, and there is just one restart policy to rule them all, engineers had to come up with various workarounds to make sidecar containers behave as expected.
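
    One such workaround, sketched here under assumptions, dealt with Jobs whose sidecar would otherwise run forever: the main container drops a marker file into a shared emptyDir volume, and the sidecar exits once it sees the file. The Job name, the volume name, and the commands are hypothetical stand-ins:

    job-legacy-sidecar.yaml (illustrative)
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: job-legacy-sidecar        # hypothetical name
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: task
            image: alpine:3
            # Stand-in workload: signal completion through a shared file.
            command: [ "sh", "-c", "echo working; sleep 10; touch /signal/done" ]
            volumeMounts:
            - name: signal
              mountPath: /signal
          - name: sidecar
            image: alpine:3
            # Stand-in sidecar: poll for the marker file and exit with 0 once it appears.
            command: [ "sh", "-c", "until [ -f /signal/done ]; do sleep 1; done" ]
            volumeMounts:
            - name: signal
              mountPath: /signal
          volumes:
          - name: signal
            emptyDir: {}

    The Job completes only because the sidecar cooperates. A third-party sidecar image that knows nothing about the marker file would keep the Pod running indefinitely.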

    So, how has Kubernetes 1.28 changed the situation?

    Native sidecar containers

    The 1.28 release didn't add a new Sidecar type or even a .spec.sidecarContainers list. Instead, it introduced a new restartPolicy attribute for... containers!

    In addition to the Pod-wide restart policy, now containers can have their own restart policy, but only if:

    • It's an init container.
    • The value is set to Always.

    pod-03.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: pod-03
    spec:
      initContainers:
      - name: proxy
        image: envoy:latest
        restartPolicy: Always                           # Required
        startupProbe | readinessProbe | livenessProbe:  # Optional
          ...
      - name: config-loader
        image: config-loader:latest
      containers:
      - name: app
        image: app:latest
      - name: cache
        image: cache:latest
      restartPolicy: Always  # or OnFailure | Never
    

    So, what's the difference between traditional init containers and init containers with the restartPolicy: Always attribute?

    The new type of init containers:

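    • Don't block the next (init or regular) container startup: the next container starts as soon as the sidecar has started (and passed its startup probe, if one is defined).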
    • Restarted upon termination.
    • Support startup, readiness, and liveness probes!
    • Terminate after all regular containers are done.

    In other words, the only "true init" thing about the new type of containers is that they still respect the startup order. The rest of the behavior seems very different (if not opposite) to the traditional init logic. However, the new behavior is exactly what one would expect from a Sidecar container.
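
    To see the termination behavior in practice, consider a Job with a log-tailing sidecar. This is a minimal sketch with hypothetical names (job-with-sidecar, log-tailer, task, and the /logs path), and it assumes a cluster where the SidecarContainers feature is enabled:

    job-with-sidecar.yaml (illustrative)
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: job-with-sidecar          # hypothetical name
    spec:
      template:
        spec:
          restartPolicy: Never
          initContainers:
          - name: log-tailer          # sidecar: starts first, never exits on its own
            image: alpine:3
            restartPolicy: Always
            command: [ "sh", "-c", "tail -F /logs/task.log" ]
            volumeMounts:
            - name: logs
              mountPath: /logs
          containers:
          - name: task                # main container: runs to completion
            image: alpine:3
            command: [ "sh", "-c", "echo 'work done' > /logs/task.log" ]
            volumeMounts:
            - name: logs
              mountPath: /logs
          volumes:
          - name: logs
            emptyDir: {}

    Because log-tailer sits in initContainers with restartPolicy: Always, it starts before task and keeps running alongside it. Once task exits, the kubelet stops the sidecar and the Job can complete. The same tail -F command placed in .spec.containers would keep the Pod running indefinitely.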

    Yes, it's not the most obvious way to achieve the desired behavior. But this design paves the way to a more advanced type of containers - KEP-753: Sidecar containers explains the motivation for reusing the initContainers list and even mentions a new type of containers called infrastructureContainers, which might be used to unify the behavior of the old init and new Sidecar containers in the future:

    For this KEP it is important to have sidecar containers be defined among other init containers to be able to express the initialization order of containers. The name initContainers is not a good fit for sidecar containers as they typically do more than initialization. The better name can be “infrastructure” containers. The current idea is to implement sidecars as a part of initContainers and if this introduces too much trouble, the new collection name may replace the old collection name in future.

    Here's how the new behavior can be visualized using a single Pod with carefully crafted containers:

    Example Pod with native "sidecar" containers
    apiVersion: v1
    kind: Pod
    metadata:
      name: example-pod
    spec:
      restartPolicy: OnFailure
      initContainers:
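      - name: ic1                     # traditional init container: runs for 5s, then sc1 starts
        image: alpine:3
        command: [ "sh", "-c", "sleep 5" ]
      - name: sc1                     # sidecar: exits with code 1 after 20s and gets restarted
        restartPolicy: Always
        image: alpine:3
        command: [ "sh", "-c", "sleep 20 && exit 1" ]
      - name: ic2                     # traditional init container: starts once sc1 has started
        image: alpine:3
        command: [ "sh", "-c", "sleep 5" ]
      - name: sc2                     # sidecar: runs until the regular containers are done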
        restartPolicy: Always
        image: alpine:3
        command: [ "sh", "-c", "while true; do sleep 1; done" ]
        startupProbe:
          exec:
            command: ["/bin/sh", "-c", "sleep 5"]
          initialDelaySeconds: 0
          timeoutSeconds: 999
        readinessProbe:
          exec:
            command: ["/bin/sh", "-c", "sleep 5"]
          initialDelaySeconds: 0
          timeoutSeconds: 999
      containers:
      - name: rc1                     # regular container: exits with code 0 after 55s
        image: alpine:3
        command: [ "sh", "-c", "sleep 55" ]
        startupProbe:
          exec:
            command: ["/bin/sh", "-c", "sleep 5"]
          initialDelaySeconds: 0
          periodSeconds: 30
          timeoutSeconds: 999
        readinessProbe:
          exec:
            command: ["/bin/sh", "-c", "sleep 5"]
          initialDelaySeconds: 0
          periodSeconds: 30
          timeoutSeconds: 999
      - name: rc2                     # regular container: exits with code 0 after 60s
        image: alpine:3
        command: [ "sh", "-c", "sleep 60" ]
        startupProbe:
          exec:
            command: ["/bin/sh", "-c", "sleep 5"]
          initialDelaySeconds: 0
          periodSeconds: 30
          timeoutSeconds: 999
        readinessProbe:
          exec:
            command: ["/bin/sh", "-c", "sleep 5"]
          initialDelaySeconds: 0
          periodSeconds: 30
          timeoutSeconds: 999
    

    Get your hands dirty

    And now, the most interesting part! How else can you internalize the new concept if not by solving a few practical problems?

    Exercise 1: Faulty init sequence

    There is a faulty Pod in the exercise-01 namespace. It's desperately trying to start up but can't get far enough in its lifecycle. Apparently, a new container was added to the Pod recently, and after that, the Pod stopped working. Can you identify what causes the Pod to fail and fix it? Note that you are not allowed to change the Pod images or their contents, but you can freely change the rest of the Pod spec.

    💡 Hint 1

    It's a good idea to start by reviewing the Pod spec. Take a close look at the .spec.containers and .spec.initContainers lists. Make sure you understand the .status.containerStatuses and .status.initContainerStatuses fields. You can use kubectl get pod faulty -n exercise-01 -o yaml for that.

    💡 Hint 2

    Try looking at container logs. You can use kubectl logs -n exercise-01 faulty -c <container-name> for that. It may shed some light on why some of the containers are failing.

    💡 Hint 3

    While it's not always possible to sneak a peek into a containerized application, in this case you can do that. All containers are simple Python scripts, and you can see their source code by running kubectl exec -n exercise-01 faulty -c <container-name> -- cat /app/<script-name>.py.

    💡 Hint 4

    Remember that init containers are started before the regular containers. Does it look like the new init container may depend on a regular container? Should you try moving that regular container to the init list?

    💡 Hint 5

    Traditional init containers don't just start in order; they also always run to completion. If the second init container expects the first one to be running, maybe it's time to try the new restartPolicy: Always attribute? 😉
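
    In general terms, such a change could look like the fragment below. The container names and images (dependency, client) are hypothetical placeholders and not the ones used in the exercise:

    init-order-fragment.yaml (illustrative)
    spec:
      initContainers:
      - name: dependency              # hypothetical: must keep running for the next container
        image: dependency:latest      # hypothetical image
        restartPolicy: Always         # turns this init container into a sidecar
      - name: client                  # hypothetical: starts once dependency has started
        image: client:latest          # hypothetical image

    Keep in mind that most fields of a running Pod are immutable, so a change like this usually means deleting the Pod and creating it again from the updated manifest.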

    Can you make all checks below pass?

    Exercise 2: Sleepy sidecar

    There is another Pod named sleepy that lives in the exercise-02 namespace. Much like the previous one, it experiences some trouble starting up. However, this time, the problem seems to be slightly different. Can you figure out what's wrong with sleepy and help it achieve the Ready condition?

    💡 Hint 1

    As usual, it's a good idea to start by reviewing the Pod spec and the container logs. Take a close look at the .spec.containers and .spec.initContainers lists. Make sure you understand the .status.containerStatuses and .status.initContainerStatuses fields. Examine the logs of the sleepy-sidecar and app containers.

    💡 Hint 2

    The sleepy-sidecar container is supposed not just to start before the app container but also to be fully functional by the time the app starts.

    💡 Hint 3

    Sidecar containers have introduced a new container-level restartPolicy attribute. What other new attributes do they bring?

    💡 Hint 4

    Sometimes, an exec probe with curl (or any other HTTP client) can be used instead of an httpGet probe to check the container's readiness endpoint. And for containers listening on localhost, it might be the only option.
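
    A probe of that kind might look like the fragment below. The port 8080 and the /healthz path are placeholders, and the sketch assumes that curl is present in the container image:

    probe-fragment.yaml (illustrative)
    startupProbe:
      exec:
        # Runs inside the container, so 127.0.0.1 is the Pod's own loopback interface.
        # -f makes curl exit with a non-zero code on HTTP errors, which fails the probe.
        command: [ "curl", "-fsS", "http://127.0.0.1:8080/healthz" ]
      periodSeconds: 2
      failureThreshold: 15

    An httpGet probe, by contrast, is sent by the kubelet to the Pod's IP address, so it can't reach a server that is bound only to 127.0.0.1.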

    Can you make these checks pass too?
