I. The Importance of Pod Scheduling in Kubernetes
In the Kubernetes world, Pod scheduling is like a busy traffic controller, responsible for guiding the cars (i.e., our Pods) into the most appropriate parking spaces (nodes). Scheduling is not only about resource utilization but also about the survival of the application, so let's take a look at why it matters so much.
- Resource optimization: Imagine the parking-lot disaster if every car parked at random! The scheduler ensures that each Pod finds the right parking space through precise "parking navigation", maximizing resource usage and avoiding "overcrowding".
- Fault recovery: Say a node fails, like a car suddenly breaking down. The scheduler reacts quickly and sends the affected Pods to another healthy node, ensuring that your application doesn't "crash".
- Load balancing: Imagine a traffic jam where all the cars park on one side while the other side sits empty. The scheduler smartly spreads Pods across the nodes, keeping the load even, like a harmonious dance.
- Policy enforcement: The Kubernetes scheduler is more than a simple commander; it has its own set of "scheduling rules". Through mechanisms such as affinity, anti-affinity, taints, and tolerations, the scheduler ensures that each Pod finds its ideal home according to its own "preferences", and that nothing goes wrong.
- Scalability: When your application expands rapidly like a balloon, the flexibility of the scheduler becomes especially important. It can easily respond to changes in load, scaling out and in dynamically to keep everything running smoothly.
In short, Pod scheduling in Kubernetes is like a silent hero working in the background to ensure that applications are efficient, secure, and stable. Understanding the scheduling mechanism allows you to navigate this containerized world and is simply a must-have skill!
II. Node Selector
Definitions and Usage
The Node Selector is like a careful, "picky" friend who specializes in helping you choose the most appropriate party venue. In Kubernetes, nodeSelector tells the scheduler that a Pod may only run on nodes carrying specific labels. This way, you can make sure your application is "shining" in the most suitable environment.
Imagine your application is a superstar that wants to perform on a node with a high-performance graphics card, not on a lesser-configured machine. Node Selector is designed to fulfill exactly that need, letting your Pods perform on the right stage for them.
Typical Example
Let's say you have an application that needs a lot of disk I/O performance, and you want to put it on a "disk guru" node.
Labeling of nodes
Let's start by setting a disktype=ssd label on node01:
kubectl label nodes node01 disktype=ssd
Check the label:
kubectl get nodes -l disktype=ssd
NAME STATUS ROLES AGE VERSION
node01 Ready <none> 161d v1.30.0
Then use the Node Selector to specify the label of the node. Example:
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
labels:
app: nginx
spec:
replicas: 2 # run 2 replicas of the Pod
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx-container
image: nginx:latest
ports:
- containerPort: 80
nodeSelector:
disktype: ssd
Creating a Deployment
kubectl apply -f
deployment.apps/nginx-deployment created
kubectl get po -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-deployment-56f59878c8-cgfsx 1/1 Running 0 8s 10.244.1.32 node01 <none> <none>
nginx-deployment-56f59878c8-z64rz 1/1 Running 0 8s 10.244.1.31 node01 <none> <none>
In this example, the Pods are scheduled onto the node labeled disktype: ssd. This way, your superstar can shine in the best possible environment instead of struggling on an average node!
So, Node Selector is your "picky friend", helping you find the right "stage" for your pods and making sure they reach their full potential.
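A quick caveat before moving on: nodeSelector is a hard requirement, so if no node carries the disktype=ssd label, the Pods simply sit in Pending until a matching node appears. As a small sketch (reusing the node01 name from this walkthrough), the label can be removed again at any time with a trailing minus sign:
kubectl label nodes node01 disktype-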
III. Affinity and Anti-Affinity
1. Affinity
Definition
Affinity sounds like a young person in love, but it is actually a little helper in Kubernetes that decides which node a Pod will "date". With affinity rules, Pods can be scheduled onto nodes with specific labels, as if they were choosing a suitable place for a date!
Types
1. Node Affinity:
Like finding the right house in a big city, node affinity lets you schedule Pods to nodes with specific labels. For example, you might want to schedule a Pod on a node with SSD drives because performance is better there.
2. Pod Affinity:
Pod affinity is useful if you want some Pods to live together and look after each other. It lets you schedule new Pods onto nodes where a matching Pod already exists, creating a "big family" (a minimal sketch follows right after this list).
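Since the walkthrough below focuses on node affinity, here is a minimal, hypothetical Pod-affinity sketch for the "live together" case. It assumes a cache Pod labeled app: redis-cache is already running in the cluster; the names are illustrative only:
apiVersion: v1
kind: Pod
metadata:
  name: web-with-cache
  labels:
    app: web
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: app
                operator: In
                values:
                  - redis-cache   # co-locate with Pods carrying this label
          topologyKey: "kubernetes.io/hostname"   # "same node" is the unit of co-location
  containers:
    - name: web
      image: nginx:latest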
Typical Example
Here's a YAML file that deploys Nginx and uses node affinity to ensure the Pods run only on nodes labeled disktype=ssd:
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
spec:
replicas: 2
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: disktype
operator: In
values:
- ssd
containers:
- name: nginx-container
image: nginx:latest
ports:
- containerPort: 80
Creating a Deployment
kubectl apply -f
deployment.apps/nginx-deployment created
kubectl get pods -l app=nginx -o wide
You can see that they all run on node01.
2. Anti-Affinity
Definition
Anti-affinity is another scheduling rule in Kubernetes that aims to avoid scheduling Pods to the same nodes as some specific Pods. It's like when choosing friends, there are certain people you just don't want to live with, even if they have a beautiful house.
Usage
Anti-affinity is often used to improve the availability and fault tolerance of an application. For example, if you have multiple replicas of a Pod and they all run on the same node, every replica is affected when that node goes down. Anti-affinity ensures they are spread across different nodes, like putting your eggs in different baskets so that dropping one basket doesn't cost you all your eggs.
Typical Example
Here is a YAML file that uses an anti-affinity rule to ensure that an Nginx Pod does not run on the same node as an already existing Nginx Pod:
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: app
operator: In
values:
- nginx
topologyKey: "kubernetes.io/hostname"
containers:
- name: nginx-container
image: nginx:latest
ports:
- containerPort: 80
Creating a Deployment
kubectl apply -f
The result after creation is shown below
kubectl get po -l app=nginx -o wide
You can see that two of the Pods are running, on node01 and node02 respectively, while the third one is stuck in Pending.
Find out why it's Pending:
kubectl describe pod nginx-deployment-5675f7647f-vgkrl
Name: nginx-deployment-5675f7647f-vgkrl
Namespace: default
Priority: 0
Service Account: default
Node: <none>
Labels: app=nginx
pod-template-hash=5675f7647f
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/nginx-deployment-5675f7647f
Containers:
nginx-container:
Image: nginx:latest
Port: 80/TCP
Host Port: 0/TCP
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-qwjc2 (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
kube-api-access-qwjc2:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName:
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 2m7s default-scheduler
0/3 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: },
2 node(s) didn't match pod anti-affinity rules.
preemption: 0/3 nodes are available: 1 Preemption is not helpful for scheduling,
2 No preemption victims found for incoming pod.
You can see from the Events at the bottom that no node is available: two nodes already host a Pod that matches the anti-affinity rule, and the control-plane node carries an untolerated taint.
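If you would rather not have a replica stuck in Pending when there are more replicas than nodes, a common alternative (a sketch, not part of the original example) is the soft form preferredDuringSchedulingIgnoredDuringExecution, which tells the scheduler to spread the Pods when it can but still place them when it cannot. The fragment below would replace the affinity block in the Pod template above:
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100                    # higher weight = stronger preference
        podAffinityTerm:
          labelSelector:
            matchExpressions:
              - key: app
                operator: In
                values:
                  - nginx
          topologyKey: "kubernetes.io/hostname"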
Summary
With affinity and anti-affinity, Kubernetes can schedule Pods precisely according to your needs, acting like a good party planner and making sure each Pod "socializes" on the right node. This not only improves resource utilization but also enhances application stability. Next time you deploy, don't forget these little secrets!
IV. Taints and Tolerations: How to Make Your Pods Less "Fussy"
Taints
Definitions and uses
A taint, like a "no entry" sign, tells certain Pods, "Hey, stay away, I don't want to play with you!" What it does is allow nodes in Kubernetes to mark special requirements so that only the "right" Pods can run there.
For example, if you have a powerful node that needs to do some intense computation tasks, you can add a "taint" to the node so that ordinary Pods won't accidentally come in and take up resources.
Typical Example
kubectl taint nodes node01 key=value:NoSchedule
In this example, we add a taint key=value to node01 and set its effect to NoSchedule. This means Kubernetes will not schedule any Pods onto this node unless a Pod has the "superpower" to tolerate the taint.
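To double-check that the taint actually landed on node01, you can inspect the node (a quick verification step; it assumes the node01 name used throughout this article):
kubectl describe node node01 | grep -i taints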
Let's create a deployment and test to see if it still dispatches to node01.
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
labels:
app: nginx
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:latest
ports:
- containerPort: 80
Creating a Deployment
kubectl apply -f
Viewing pod information
kubectl get pods -l app=nginx -o wide
You can see that the Pods are all running on node02, as expected, indicating that the taint is in effect and Pods are no longer scheduled onto node01 by default.
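And if you later want node01 to accept ordinary Pods again, the taint can be removed by repeating the same taint specification with a trailing minus sign:
kubectl taint nodes node01 key=value:NoSchedule-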
Tolerations
Definitions and uses
A toleration is the Pod's "pass": it lets the Pod ignore the "no entry" signs (taints) on nodes. If a Pod wants to move onto a node marked with a taint, it must bring this "pass" along. It's like this: some nodes are very picky, but some Pods are very easy-going and say, "It's OK, I can take it."
How tolerations work with taints
Tolerance works by allowing Pods to still schedule properly on tainted nodes. It's like the relationship between a doorman and a VIP card - the doorman won't let just anyone in, but the Pod with the VIP card can say, "I've got tolerance, I can get through!"
Example:
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
labels:
app: nginx
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
tolerations:
- key: "key"
operator: "Equal"
value: "value"
effect: "NoSchedule"
containers:
- name: nginx
image: nginx:latest
ports:
- containerPort: 80
Explanation:
- apiVersion: apps/v1 indicates that we are creating a Deployment resource in Kubernetes. A Deployment manages the replica set (i.e., multiple Pods) of an application.
- replicas: defines how many nginx Pod replicas we want; here it is set to 3, so there will be three nginx Pods.
- selector: matchLabels selects the Pods to be managed via the app: nginx label.
- template: defines the Pod template to be deployed, including the toleration settings and the container specification.
- tolerations: configures the Pods of this nginx Deployment to tolerate the key=value:NoSchedule taint, allowing them to be scheduled onto nodes carrying that taint.
- containers: the containers inside the Pod; here we use the nginx:latest image and expose port 80.
This configuration deploys three nginx Pods and allows them to be scheduled onto nodes with the specific taint (based on the configured tolerations).
After creating the Deployment, check the Pods:
kubectl get po -l app=nginx -o wide
In this example, the Pod carries the tolerations configuration above, which lets it ignore the key=value:NoSchedule taint and run happily on the "tainted" node.
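As a side note (a sketch that goes beyond the original example), a toleration can also use operator: "Exists" to match a taint key regardless of its value; an empty key with operator: "Exists" matches every taint, which is roughly how some system DaemonSets behave:
tolerations:
  - key: "key"
    operator: "Exists"   # matches any value of this taint key
    effect: "NoSchedule"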
Taint vs. tolerance, who's the boss of scheduling?
To summarize, taints are the gatekeepers on a node, preventing unrelated pods from messing with it, while tolerance is the VIP card for a pod, allowing it to ignore the gatekeepers and move in without any problems. The two complement each other and are an integral part of the Kubernetes scheduling mechanism.
With these settings, you can effectively control which Pods can run on which nodes. Remember: With "passes", you won't have to worry about nodes being "nasty"!
V. Prioritization and Preemption: Kubernetes' "Big Brother" Scheduling Strategy
In the Kubernetes world, Pods are no longer equal. Yes, there are "big boys" and "little boys"! It all depends on Priority and Preemption. Let's take a look at how these two mysterious forces affect the fate of Pods!
Priority: Who's the Big Guy?
Definition:
Prioritization is the mechanism by which pods are ranked, and Kubernetes allows you to give each pod a level of importance that determines whether it will be prioritized when resources are tight, or whether it will sit in the queue in obscurity.
How to set up:
We first define a PriorityClass and then reference it from the Deployment via priorityClassName. It's like handing Nginx a "VIP card".
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
name: high-priority
value: 1000
globalDefault: false
description: "High priority for important Nginx pods."
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
spec:
replicas: 2
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
priorityClassName: high-priority
containers:
- name: nginx
image: nginx:latest
ports:
- containerPort: 80
Explanation: We created a PriorityClass named high-priority with a value of 1000 and assigned it to the Deployment through priorityClassName. This means these Nginx Pods get special treatment during scheduling and are served first when resources are tight.
Creating a Deployment
kubectl apply -f
Viewing pod information
kubectl get po -l app=nginx -o wide
kubectl describe pod nginx-deployment-646cb6c499-6k864
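In the describe output, the PriorityClass should be reflected on the Pod itself; depending on your kubectl version you should see lines roughly like these:
Priority:             1000
Priority Class Name:  high-priority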
Preemption: Can Nginx "kick people out" too?
Definition: Preemption is like giving Nginx a privilege: if resources are tight, Kubernetes may evict a lower-priority Pod to make room for a higher-priority one. In this way, Nginx can gracefully "kick out" someone else at a critical moment and take the spot itself.
Typical Example:
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
name: vip-priority
value: 2000
preemptionPolicy: PreemptLowerPriority
globalDefault: false
description: "VIP Nginx pods with preemption power."
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment-vip
spec:
replicas: 2
selector:
matchLabels:
app: nginx-vip
template:
metadata:
labels:
app: nginx-vip
spec:
priorityClassName: vip-priority
containers:
- name: nginx
image: nginx:latest
ports:
- containerPort: 80
Explanation: Here we give the Nginx Pods a new vip-priority class with "preemption" power. If resources are insufficient, the system will force lower-priority Pods to "move out" so that the Nginx VIPs can take their place.
Create the Deployment
kubectl apply -f
Viewing pod information
kubectl get po -l app=nginx-vip -o wide
Summary:
By setting priorities, we can give important Nginx Pods scheduling privileges, and with preemption they can even evict lower-priority Pods and claim valuable resource slots. In this way, Nginx not only gets onto the nodes faster, it also wins the scheduling game!
VI. Scheduling Policy: Who decides where Nginx Pods live?
The Kubernetes scheduler is the "real estate agent" responsible for finding homes for Pods: it has to see where there is room and comfort while also keeping the overall balance, and there is actually a strategy behind which node a Pod ends up running on. Here's a closer look at some common scheduling strategies to see how Nginx Pods find their new homes.
Polling scheduling (Round Robin): everyone takes turns, fairness first!
Layman's explanation:
It's like eating hot pot: the ingredients go into the pot one by one, and nobody gets left out. Round-robin scheduling walks through the nodes in order and distributes Pods evenly across them. Nodes A, B, and C share the work, so no node is run ragged while the others sit idle and gather dust.
For example:
Nginx Pod 1 goes to node A, Pod 2 goes to node B, Pod 3 goes to node C, and then it's A's turn again.
Advantages: Simple and fair, particularly suitable when node resources are fairly balanced; it avoids piling load onto a single node.
Drawbacks: Taking turns is not the same as being smart; it may keep handing work to nodes that are already close to full, adding to their stress.
Volume Affinity: The "Fatal Connection" between Pods and Storage Volumes!
Definitions and uses:
Do you know the relationship between a Pod and its Volume? They are like "soul mates". In the Kubernetes world, Volume Affinity is all about making sure a Pod stays close to the volumes it needs most, with no separation and no lost connection. This mechanism keeps volumes and Pods near each other, like installing a "GPS" on your Pods so their stored data is always within reach.
To give a layman's example:
Imagine your Pod is a barista and the storage volume is its bean warehouse. Volume Affinity makes sure the barista doesn't set up shop where there are no beans; the warehouse comes before the shop! This way Pods never have to fetch data from far-away storage, and everything is served quickly and efficiently close to home.
Example: Nginx and its very own "storage silo"
Let's use the familiar Nginx as an example. Let's say we want the Nginx Pod to be close to its storage volumes, and Volume Affinity helps arrange this "closeness".
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
spec:
replicas: 2
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:latest
volumeMounts:
- name: nginx-storage
mountPath: /usr/share/nginx/html
volumes:
- name: nginx-storage
persistentVolumeClaim:
claimName: nginx-pvc
affinity:
podAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: app
operator: In
values:
- nginx
topologyKey: "kubernetes.io/hostname"
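The Deployment above mounts a PersistentVolumeClaim named nginx-pvc that the article does not show. A minimal sketch of such a claim might look like the following; the storage class name and size are assumptions, so adjust them to whatever your cluster provides:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nginx-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard   # assumed; use a StorageClass that exists in your cluster
  resources:
    requests:
      storage: 1Gi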
In summary: the scheduling mechanism is the "master resource scheduler" of Kubernetes!