Challenge, Hard, on Kubernetes

Kubernetes Advanced Pod Scheduling: Selectors, Affinity & Taints

Welcome to this challenge on advanced Kubernetes Pod scheduling. You will gain hands-on experience controlling workload placement using nodeSelector, various forms of affinity and anti-affinity, and taints with tolerations. Mastering these mechanisms is crucial for optimizing resource utilization, enhancing high availability, and fulfilling specific application deployment requirements.

Advanced Scheduling

Core Scheduling Concepts

Before we begin, let's briefly touch upon the mechanisms you'll be using:

  • nodeSelector: The simplest way to constrain Pods to nodes with specific labels. You define key-value pairs in the Pod spec, and the Pod will only be scheduled on nodes that have all those labels.
  • Node Affinity/Anti-Affinity: A more expressive way to select nodes. It allows for "preferred" vs. "required" rules and more complex matching logic (e.g., In, NotIn, Exists); a short sketch follows this overview.
  • Pod Affinity/Anti-Affinity: Allows you to make scheduling decisions for a Pod based on other Pods already running on nodes. For example, co-locating application tiers or spreading replicas across different failure domains.
  • Taints and Tolerations: Taints are applied to nodes to "repel" Pods. A Pod must have a matching "toleration" to be scheduled on a tainted node. This is often used for dedicating nodes to specific workloads or for cordoning nodes.

All these mechanisms influence the Kubernetes scheduler's decision-making process.
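
To make the node affinity bullet above more concrete, here is a minimal sketch of a Pod spec combining a required and a preferred rule. The label keys (disktype, zone), their values, and the image are illustrative assumptions only, not part of this challenge:

    apiVersion: v1
    kind: Pod
    metadata:
      name: node-affinity-demo           # illustrative name, not one of the task manifests
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: disktype        # assumed label key
                    operator: In
                    values: ["ssd"]
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              preference:
                matchExpressions:
                  - key: zone            # assumed label key
                    operator: NotIn
                    values: ["zone-a"]
      containers:
        - name: app
          image: nginx                   # placeholder image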

For tasks requiring Pod creation/modification, partially completed YAML manifest files are available in the /home/laborant/challenge/ directory (e.g., task1-pod.yaml). You will need to edit these files to add the required scheduling specifications and then apply them using kubectl apply -f <filename.yaml>.

Let's start placing our Pods strategically!

Our first task is to ensure a Pod lands on a node with specific hardware characteristics - identified by a label.

Hint 1.1 💡

How do you instruct a Pod to only consider nodes that possess certain labels? Look for a field in the Pod specification designed for simple key-value label matching for node selection.

Hint 1.2 💡

The scheduling constraint you need requires specifying the target label as a key-value pair within the Pod's specification. Consult the Kubernetes documentation for Pod specs if you're unsure where simple node label constraints are defined. Pay attention to YAML structure.
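
If the YAML structure is the sticking point, here is a minimal sketch of the field in question. The disktype: ssd pair is only an assumption for illustration; use the actual label from the task (kubectl get nodes --show-labels lists what each node carries):

    spec:
      nodeSelector:
        disktype: ssd              # assumed label; replace with the label named in the task
      containers:
        - name: app                # keep whatever container task1-pod.yaml already defines
          image: nginx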

You successfully scheduled task1-pod based on node labels. Now, confirm its location.

Hint 2.1 💡

Which kubectl get pod output format shows extra information, including the node name where the pod is running?

Hint 2.2 💡

Besides the standard get command output options, which other kubectl command provides very detailed, multi-section information about a specific Pod, including its placement?

Hint 2.3 💡

For more targeted retrieval, kubectl get allows extracting specific data fields using template formats like JSONPath. How could you query just the field related to the assigned node name?
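
For reference, each hint maps onto one of these commands (using the task1-pod name from the previous step):

    kubectl get pod task1-pod -o wide                           # -o wide adds a NODE column
    kubectl describe pod task1-pod                              # look for the Node: line near the top
    kubectl get pod task1-pod -o jsonpath='{.spec.nodeName}'    # prints only the assigned node name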

Now, let's prepare node-02 to be dedicated for a specific type of workload, like a database. We need to prevent regular application pods (those without special configuration) from being scheduled onto this node.

Hint 3.1 💡

There's a specific kubectl command used to apply or remove scheduling restrictions from nodes. What is it?

Hint 3.2 💡

The restriction needs to be specified in the format key=value:effect. Ensure you use the correct key (role), value (database), and effect (NoSchedule) required by the task to mark the node appropriately.
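
If the syntax is unfamiliar, applying the taint described above (and removing it again, should you need to) looks like this:

    kubectl taint nodes node-02 role=database:NoSchedule     # add the taint
    kubectl taint nodes node-02 role=database:NoSchedule-    # the trailing '-' removes it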

You've marked node-02 to repel general pods using a taint (role=database:NoSchedule). Now, deploy the task4-pod, ensuring it can run on this specially designated node.

Hint 4.1 💡

When a node has a taint applied (like role=database:NoSchedule), pods normally avoid it. How can a Pod declare that it is "willing" to ignore or tolerate a specific node taint?
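
As a sketch, a tolerations block matching the role=database:NoSchedule taint could look like the following inside task4-pod.yaml; the container shown is a placeholder for whatever the manifest already defines:

    spec:
      tolerations:
        - key: "role"
          operator: "Equal"
          value: "database"
          effect: "NoSchedule"
      containers:
        - name: app                # placeholder; keep the container from task4-pod.yaml
          image: nginx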

A web-pod application has been pre-deployed for you. It should be running in the cluster. Your task is to inspect this pod and identify the label attached to it.

Hint 5.1 💡

To find a Pod's labels, you'll need to inspect its configuration or metadata. Which kubectl commands allow you to view these details for a running Pod?

Hint 5.2 💡

Some kubectl commands provide a very comprehensive overview of a Pod, including all its labels and other details.
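
For example, any of the following will surface the Pod's labels (assuming web-pod is in your current namespace):

    kubectl get pod web-pod --show-labels                        # labels appear in a LABELS column
    kubectl describe pod web-pod                                  # labels are listed under Labels:
    kubectl get pod web-pod -o jsonpath='{.metadata.labels}'      # prints only the labels map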

Hint 5.3 💡

Remember to provide your answer in the format key=value. For example, environment=production.

Your web-pod is running and correctly labeled. Now, deploy a cache-pod and ensure it always runs on the same node as the web-pod. This is a common pattern for applications where network latency between components is critical.

Hint 6.1 💡

When you need to control a Pod's placement based on other Pods already running (e.g., placing them together or apart), what general scheduling feature in Kubernetes allows this?

Hint 6.2 💡

For a strict requirement that the cache-pod must run with the web-pod, you'll need to define a "required" rule within the relevant specification. How do you specify which Pods this one should be attracted to?

Hint 6.3 💡

These rules also specify the "scope" or "domain" of placement (e.g., "same node," "same zone"). What field defines this scope, and what value represents a single host machine?
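
Putting the three hints together, the relevant part of the cache-pod manifest could look like the sketch below. The app: web selector is only an assumed label; substitute the label you actually found on web-pod:

    spec:
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: web                             # assumed; use the label found on web-pod
              topologyKey: kubernetes.io/hostname      # "same node" scope
      containers:
        - name: cache                                  # placeholder container
          image: redis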
