kube-scheduler

Overview

In this lesson, you'll explore kube-scheduler, the control plane component responsible for assigning newly created Pods to Nodes in a Kubernetes cluster.

Objectives:

  • Understand kube-scheduler's role within a Kubernetes cluster
  • Explore how Pods are assigned to Nodes (manually and via Binding)
  • Install and configure kube-scheduler from scratch
  • Observe automated Pod scheduling in action

By the end of this lesson, you'll understand how kube-scheduler watches for unscheduled Pods and assigns them to the most suitable Nodes.

What is kube-scheduler?

What's the difference between Kubernetes and Docker?

This is one of the most common questions asked by newcomers to Kubernetes.

Kubernetes is a container orchestration platform that manages the deployment, scaling, and operation of containerized applications across a cluster of machines.

This multi-machine capability (among other things) is what sets Kubernetes apart from single-host container runtimes like Docker. Kubernetes can distribute workloads across multiple nodes, ensuring applications are deployed and scaled efficiently and reliably.

However, Kubernetes cannot randomly decide where to place workloads. It needs to make informed decisions, otherwise its actions could lead to resource contention, performance degradation, or application failure.

In Kubernetes, scheduling refers to making sure that Pods are matched to Nodes so that the kubelet can run them.

A scheduler watches for newly created Pods that have no assigned Node. For every unscheduled Pod, the scheduler becomes responsible for finding the best node for that Pod to run on.

💡 How the best node is determined depends on what factors the scheduler considers.

Typically, the scheduler considers factors such as resource availability, node affinity, pod priorities, and more to make optimal scheduling decisions.
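
To make the idea concrete, here is a deliberately simplified sketch of a filter-then-score decision: nodes that cannot fit the Pod's CPU request are discarded, and the remaining node with the most free CPU wins. The node names, the numbers, and the pick_node helper are all hypothetical, and kube-scheduler itself weighs far more than CPU.

```shell
# Toy illustration only: this is NOT how kube-scheduler is implemented.
# Input on stdin: one node per line as "<name> <free-cpu-millicores>".
# Argument: the Pod's CPU request in millicores.
pick_node() {
  awk -v request="$1" '
    # Filter: keep only nodes with enough free CPU for the request.
    $2 + 0 >= request + 0 {
      # Score: among the nodes that fit, prefer the one with the most free CPU.
      if (best == "" || $2 + 0 > free) { best = $1; free = $2 + 0 }
    }
    END { if (best != "") print best; else exit 1 }
  '
}

# Hypothetical cluster state: three nodes with different amounts of free CPU.
printf '%s\n' 'worker-01 500' 'worker-02 2000' 'worker-03 1200' | pick_node 1000
# prints "worker-02"
```

If no node passes the filter, pick_node prints nothing and returns a non-zero status, which loosely mirrors a Pod that stays unscheduled because nothing fits.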

kube-scheduler is the default scheduler for Kubernetes and runs as part of the control plane.

💡 Notice that kube-scheduler is the default scheduler.

Kubernetes allows users to implement their own schedulers if they want to customize the scheduling process.
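
A Pod selects its scheduler through the spec.schedulerName field. The sketch below prints a manifest that requests a scheduler other than the default. The pod_for_scheduler helper and the my-custom-scheduler name are made up for illustration, and no such scheduler exists in this course.

```shell
# Prints a Pod manifest that asks for a specific scheduler.
# $1 = Pod name, $2 = scheduler name (both chosen by the caller).
pod_for_scheduler() {
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: $1
spec:
  schedulerName: $2
  automountServiceAccountToken: false
  containers:
    - name: podinfo
      image: ghcr.io/stefanprodan/podinfo:latest
EOF
}

# Review the manifest; pipe it to "kubectl apply -f -" to actually create the Pod.
# "my-custom-scheduler" is a hypothetical name.
pod_for_scheduler podinfo-custom my-custom-scheduler
```

When the field is omitted, it defaults to default-scheduler, the name kube-scheduler responds to unless configured otherwise. A Pod that names a scheduler nobody runs simply stays unscheduled.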

kube-scheduler in a Kubernetes cluster

Assigning Pods to Nodes

Before installing kube-scheduler, let's explore how Kubernetes assigns Pods to Nodes.

Create a new Pod to see what happens when there is no scheduler:

kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: podinfo-unscheduled
spec:
  automountServiceAccountToken: false
  containers:
    - name: podinfo
      image: ghcr.io/stefanprodan/podinfo:latest
EOF

At this point, the Pod exists but remains unscheduled (it has no assigned Node):

kubectl get pod podinfo-unscheduled -o jsonpath='{.spec.nodeName}'

Since there is no scheduler installed, this isn't going to change: the Pod will remain unscheduled.
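
The jsonpath query above prints nothing for an unscheduled Pod, which is easy to miss. As an alternative, a field selector on spec.nodeName can list every Pod that has no Node yet. The list_unscheduled_pods helper name is hypothetical; the kubectl flags it wraps are standard.

```shell
# Lists Pods in the current namespace whose spec.nodeName is empty,
# i.e. Pods that nothing has assigned to a Node yet.
list_unscheduled_pods() {
  kubectl get pods --field-selector spec.nodeName= -o name
}

# Usage (requires access to the cluster from this lesson):
#   list_unscheduled_pods
```

At this point in the lesson, the output should include pod/podinfo-unscheduled.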

It is possible to assign a Pod to a specific Node by setting the nodeName field directly in the Pod specification, essentially bypassing the scheduler.

But in order to do so, you need an existing Node in the cluster.

Create a (fake) Node object that will serve as the scheduling target:

kubectl apply -f - <<EOF
apiVersion: v1
kind: Node
metadata:
  name: worker-01
status:
  capacity:
    cpu: "4"
    memory: 8Gi
    pods: "110"
  allocatable:
    cpu: "4"
    memory: 8Gi
    pods: "110"
  conditions:
    - type: Ready
      status: "True"
      reason: KubeletReady
      message: Node is ready
EOF

💡 This is further proof that the kube-apiserver is a plain old REST API server: it doesn't care whether a real machine backs the Node object, it just stores the data.

Create a new Pod and assign it to the Node:

kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: podinfo-node-name
spec:
  nodeName: worker-01
  automountServiceAccountToken: false
  containers:
    - name: podinfo
      image: ghcr.io/stefanprodan/podinfo:latest
EOF

Verify that the Pod was assigned to the selected Node:

kubectl get pod podinfo-node-name -o jsonpath='{.spec.nodeName}'

While this approach works when you need to place a Pod on a specific node, manually assigning Pods to Nodes isn't scalable, may not be possible in all scenarios, and largely defeats the purpose of container orchestration.

However, there's a more fundamental problem with regard to scheduling: Pods are (mostly) immutable objects. Once a Pod is created, most of its specification cannot be modified, and that includes the nodeName field.

This immutability raises an important question: if a Pod is created without a nodeName (unscheduled) and Pods cannot be modified after creation, how does a scheduler assign it to a Node?

The answer lies in a special Kubernetes resource called Binding. It is an internal Pod subresource that provides a dedicated API path for assigning Pods to Nodes. Instead of requiring a full Pod update, it allows a targeted operation on the Pod's /binding subresource, which sets the Pod's spec.nodeName field.

Since kubectl has no dedicated command for creating Bindings (it's an internal operation), call the Kubernetes API directly to create a Binding object for the previously created Pod:

curl -f https://127.0.0.1:6443/api/v1/namespaces/default/pods/podinfo-unscheduled/binding \
    --cacert /etc/kubernetes/pki/ca.crt \
    --cert /etc/kubernetes/pki/admin.crt \
    --key /etc/kubernetes/pki/admin.key \
    -X POST \
    -H "Content-Type: application/json" \
    -d '{
      "apiVersion": "v1",
      "kind": "Binding",
      "metadata": {
        "name": "podinfo-unscheduled"
      },
      "target": {
        "apiVersion": "v1",
        "kind": "Node",
        "name": "worker-01"
      }
    }'

💡 Reminder: the API server uses TLS for secure communication AND authentication (as explained in the previous lesson).

Verify that the Binding successfully assigned the Pod to the Node:

kubectl get pod podinfo-unscheduled -o jsonpath='{.spec.nodeName}'
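
If you want to bind other Pods the same way, the request body is the only part that changes. Here is a small sketch that generates it from a Pod name and a Node name. bind_body is a hypothetical helper, not something kubectl or Kubernetes provides.

```shell
# Prints the JSON body for a POST to .../pods/<pod>/binding.
# $1 = Pod name, $2 = target Node name.
bind_body() {
  cat <<EOF
{
  "apiVersion": "v1",
  "kind": "Binding",
  "metadata": { "name": "$1" },
  "target": { "apiVersion": "v1", "kind": "Node", "name": "$2" }
}
EOF
}

# Same payload as the request above, generated from two arguments.
bind_body podinfo-unscheduled worker-01
```

The output can be piped into the curl command above by replacing the inline body with -d @-. Note that a Pod can only be bound once: repeating the request for a Pod that already has a Node is expected to be rejected by the API server with a conflict error.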

Check the status of both Pods to see the final result:

kubectl get pods -o wide

💡 Even though both Pods are assigned to a Node, they remain in the Pending state because there is no real machine (kubelet) behind the Node object.

Installing kube-scheduler

Follow these steps to install kube-scheduler:

Set the version to install:

KUBE_VERSION=v1.34.0

Download and install kube-scheduler from the official Kubernetes releases:

curl -fsSLO https://dl.k8s.io/${KUBE_VERSION?}/bin/linux/amd64/kube-scheduler
sudo install -m 755 kube-scheduler /usr/local/bin

Download the systemd unit file to configure kube-scheduler as a systemd service:

sudo wget -O /etc/systemd/system/kube-scheduler.service 'https://labs.iximiuz.com/content/files/courses/kubernetes-the-very-hard-way-0cbfd997/03-control-plane/03-kube-scheduler/__static__/kube-scheduler.service?v=1772116343'

Before you can start the kube-scheduler service, you need to configure authentication, so it can communicate with the kube-apiserver.

💡 Configuring authentication was covered in the previous lesson.

The steps are exactly the same as configuring kubectl.

Generate a certificate and key for kube-scheduler:

(
cd /etc/kubernetes/pki

sudo openssl genrsa -out scheduler.key 2048
sudo openssl req -new -key scheduler.key -out scheduler.csr -subj "/CN=system:kube-scheduler"
sudo openssl x509 -req -in scheduler.csr -out scheduler.crt \
  -CA ca.crt -CAkey ca.key \
  -days 365
)
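
The subject matters here: with client certificate authentication, the API server takes the username from the certificate's CN, and system:kube-scheduler is the user that the built-in RBAC rules for the scheduler refer to (assuming RBAC is enabled in your setup). A quick sanity check is to print the subject and confirm the certificate chains to the cluster CA. check_client_cert is a hypothetical helper name.

```shell
# Prints the subject of a certificate and verifies it against a CA.
# $1 = path to the CA certificate, $2 = path to the client certificate.
check_client_cert() {
  openssl x509 -in "$2" -noout -subject &&
    openssl verify -CAfile "$1" "$2"
}

# Usage on the control plane host:
#   check_client_cert /etc/kubernetes/pki/ca.crt /etc/kubernetes/pki/scheduler.crt
```

The subject line should mention system:kube-scheduler, and openssl verify should report OK for the certificate.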

Create a kubeconfig file for kube-scheduler:

sudo kubectl config set-cluster default \
    --kubeconfig=/etc/kubernetes/scheduler.conf \
    --certificate-authority=/etc/kubernetes/pki/ca.crt \
    --embed-certs=true \
    --server=https://127.0.0.1:6443

sudo kubectl config set-credentials default \
    --kubeconfig=/etc/kubernetes/scheduler.conf \
    --client-certificate=/etc/kubernetes/pki/scheduler.crt \
    --client-key=/etc/kubernetes/pki/scheduler.key \
    --embed-certs=true

sudo kubectl config set-context default \
    --kubeconfig=/etc/kubernetes/scheduler.conf \
    --cluster=default \
    --user=default

sudo kubectl config use-context default \
    --kubeconfig=/etc/kubernetes/scheduler.conf

Reload the systemd daemon and start the kube-scheduler service:

sudo systemctl daemon-reload
sudo systemctl enable --now kube-scheduler
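
A service can take a moment to start, or fail right after systemd reports it as started. Here is a minimal sketch of a check that polls the unit state and gives up after a timeout. wait_for_unit is a hypothetical helper, and the 30-second default is an arbitrary choice.

```shell
# Polls "systemctl is-active" once per second until the unit is active
# or the timeout (in seconds, default 30) runs out.
wait_for_unit() {
  unit="$1"
  remaining="${2:-30}"
  while [ "$remaining" -gt 0 ]; do
    if systemctl is-active --quiet "$unit"; then
      echo "$unit is active"
      return 0
    fi
    sleep 1
    remaining=$((remaining - 1))
  done
  echo "$unit did not become active in time" >&2
  return 1
}

# Usage: fall back to the recent logs if the service is not running.
#   wait_for_unit kube-scheduler || sudo journalctl -u kube-scheduler --no-pager -n 50
```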

Congratulations! You have successfully installed kube-scheduler. 🎉

Scheduling Pods

With kube-scheduler installed, Pods can now be scheduled automatically onto Nodes in the cluster. Behind the scenes, kube-scheduler detects unscheduled Pods, evaluates available Nodes, and creates a Binding to assign each Pod to the best candidate.

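Before testing, there's one thing to take care of. The fake Node you created earlier automatically got a node.kubernetes.io/not-ready taint with the NoSchedule effect at creation time: the kube-apiserver's TaintNodesByCondition admission plugin, which is enabled by default, adds it to every new Node. In a complete cluster, the taint is removed once the Node is observed to be ready. Nothing in this cluster does that yet, so the taint stays.

The scheduler won't place a Pod on a Node that carries a NoSchedule taint the Pod doesn't tolerate, so remove the taint first: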

kubectl taint node worker-01 node.kubernetes.io/not-ready-
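
The trailing dash in the command above tells kubectl to remove every taint with that key. To confirm the result, you can print the taints that remain on the Node. node_taints is a hypothetical helper that wraps a standard jsonpath query.

```shell
# Prints one "<key>:<effect>" line per taint on the given Node.
# Empty output means the Node has no taints.
node_taints() {
  kubectl get node "$1" -o jsonpath='{range .spec.taints[*]}{.key}{":"}{.effect}{"\n"}{end}'
}

# Usage:
#   node_taints worker-01
```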

Create a new Pod to test scheduling:

kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: podinfo-scheduler
spec:
  automountServiceAccountToken: false
  containers:
    - name: podinfo
      image: ghcr.io/stefanprodan/podinfo:latest
EOF

Verify that the Pod was assigned to a Node:

kubectl get pod podinfo-scheduler -o jsonpath='{.spec.nodeName}'
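
Scheduling happens asynchronously, so the command above can come back empty if it runs too soon. The sketch below waits for the Pod's PodScheduled condition and then looks for the Scheduled event that kube-scheduler normally records, assuming it is allowed to write events in your setup. show_scheduling is a hypothetical helper, and the 30-second timeout is an arbitrary choice.

```shell
# Waits until the Pod reports PodScheduled, then lists "Scheduled" events for it.
# $1 = Pod name in the current namespace.
show_scheduling() {
  kubectl wait --for=condition=PodScheduled "pod/$1" --timeout=30s &&
    kubectl get events --field-selector "involvedObject.name=$1,reason=Scheduled"
}

# Usage:
#   show_scheduling podinfo-scheduler
```

If the wait times out, the Pod's conditions (kubectl describe pod) and the kube-scheduler logs are the first places to look.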

Summary

In this lesson, you learned about kube-scheduler, the default scheduler component that assigns Pods to Nodes in a Kubernetes cluster.

Key takeaways:

  • Pod placement: kube-scheduler watches for newly created Pods without assigned Nodes and makes informed decisions about optimal Node placement based on resource availability, node affinity, pod priorities, and other factors
  • Binding mechanism: Since Pods are mostly immutable, kube-scheduler uses the internal Binding subresource to set a Pod's spec.nodeName field through a dedicated API path, rather than requiring a full Pod update
  • Extensible design: As the default scheduler, kube-scheduler can be replaced with custom schedulers when specialized scheduling logic is required

With kube-scheduler now running alongside kube-apiserver, you have established the core scheduling capability that enables Kubernetes to automatically distribute workloads across your cluster nodes.

Kubernetes scheduling is much more involved than what's covered here. Check out the references for a deeper look at the scheduling framework.

Related Content

💡 To learn more about the concepts covered in this lesson, check out the resources below.
