Use Taints and Tolerations with Oracle Cloud Native Environment
Introduction
The ability to influence the way Pods are scheduled to provide the best performance, reduce running costs, and make Kubernetes cluster management easier is an essential skill for an administrator to master. Node affinity attracts Pods to a set of nodes; taints and tolerations have the opposite effect, allowing nodes to repel Pods. Frequent use cases for taints and tolerations include:
- Identifying nodes with special hardware.
- The ability to dedicate nodes to specific application Pods.
- The ability to define custom conditions to evict a Pod from a node.
Taints allow the Kubernetes administrator to prevent unwanted Pods from executing on a predefined set of nodes. Tolerations allow Pods to schedule onto nodes that carry a matching taint. Together, these allow the administrator to fine-tune how Pods schedule to nodes.
Important: Taints and tolerations cannot ensure that a Pod schedules to a specific node. The Kubernetes scheduler can deploy a Pod onto any node without a taint that repels it. Instead, use node affinity when you need to control where a Pod is scheduled.
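For reference, a minimal sketch of the node affinity approach, expressed as a fragment of a PodSpec and assuming a hypothetical node label disktype=ssd (this label is not used elsewhere in this tutorial):
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: disktype
          operator: In
          values:
          - ssd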
Objectives
In this tutorial, you will learn:
- The difference between a taint and a toleration
- How to use taints and tolerations to influence application deployment on Oracle Cloud Native Environment
Prerequisites
- Installation of Oracle Cloud Native Environment
- a single control node and two worker nodes
Deploy Oracle Cloud Native Environment
Note: If running in your own tenancy, read the linux-virt-labs GitHub project README.md and complete the prerequisites before deploying the lab environment.
Open a terminal on the Luna Desktop.
Clone the linux-virt-labs GitHub project.
git clone https://github.com/oracle-devrel/linux-virt-labs.git
Change into the working directory.
cd linux-virt-labs/ocne2
Install the required collections.
ansible-galaxy collection install -r requirements.yml
Deploy the lab environment.
ansible-playbook create_instance.yml -e localhost_python_interpreter="/usr/bin/python3.6" -e install_ocne_rpm=true -e create_ocne_cluster=true -e "ocne_cluster_node_options='-n 1 -w 2'"
The free lab environment requires the extra variable localhost_python_interpreter, which sets ansible_python_interpreter for plays running on localhost. This variable is needed because the environment installs the RPM package for the Oracle Cloud Infrastructure SDK for Python, located under the python3.6 modules.
The default deployment shape uses the AMD CPU and Oracle Linux 8. To use an Intel CPU or Oracle Linux 9, add -e instance_shape="VM.Standard3.Flex" or -e os_version="9" to the deployment command.
Important: Wait for the playbook to run successfully and reach the pause task. At this stage of the playbook, the installation of Oracle Cloud Native Environment is complete, and the instances are ready. Take note of the previous play, which prints the public and private IP addresses of the nodes it deploys and any other deployment information needed while running the lab.
Confirm the Number of Nodes
It helps to know the number and names of nodes in your Kubernetes cluster.
Open a terminal and connect via SSH to the ocne instance.
ssh oracle@<ip_address_of_node>
List the nodes in the cluster.
kubectl get nodes
The output confirms that the control plane node and both worker nodes are in the Ready state.
Define and Apply a Taint
Taints are a node-specific property that prevents Pods without a matching toleration from being scheduled to run on that node. So, how do taints and tolerations differ?
Taints
Apply a taint to a node with kubectl taint using this syntax (see the example after this list):
kubectl taint nodes <node name> <taint key>[=<taint value>]:<taint effect>
where:
- <node name> is the node receiving the taint.
- <taint key> is a user-defined label used to identify the applied taint, optionally paired with a <taint value>.
- <taint effect> is one of three possible values:
  - NoSchedule - The Kubernetes scheduler only schedules Pods with a matching toleration onto this node.
  - PreferNoSchedule - The Kubernetes scheduler tries to avoid scheduling Pods without a matching toleration onto this node.
  - NoExecute - The Kubernetes scheduler evicts all running Pods from this node that do not have a matching toleration.
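For instance, a minimal sketch of this syntax, assuming a hypothetical worker node named worker-1 that hosts special hardware (neither the node name nor the hardware key is used later in this tutorial):
kubectl taint nodes worker-1 hardware=fpga:NoSchedule
After this command, only Pods that tolerate the hardware taint can schedule onto worker-1.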
Tolerations
Define tolerations in the application Pod's PodSpec using one of two formats, the Equal operator or the Exists operator, depending on requirements. See below for an example of the PodSpec syntax to use:
Equal Operator
tolerations:
- key: "<taint key>"
operator: "Equal"
value: "<taint value>"
effect: "<taint effect>"
Exists Operator
tolerations:
- key: "<taint key>"
operator: "Exists"
effect: "<taint effect>"
The Equal operator matches only when the toleration's key, value, and effect exactly match the taint on the node. The Exists operator, on the other hand, ignores the taint's value; it only requires that a taint with the specified key exists on the node for the Pod to be allowed to schedule there. You can apply multiple taints to a node and multiple tolerations to a Pod. This approach acts like a filter, providing more flexibility over how Pods schedule to a cluster's nodes.
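To illustrate this filtering, here is a minimal sketch that assumes a hypothetical node named worker-1 and made-up taint keys env and gpu (none of these are used elsewhere in this tutorial):
kubectl taint nodes worker-1 env=prod:NoSchedule
kubectl taint nodes worker-1 gpu=true:NoSchedule
A Pod whose PodSpec tolerates only the env taint is still repelled by the untolerated gpu taint, so it does not schedule onto worker-1:
tolerations:
- key: "env"
  operator: "Equal"
  value: "prod"
  effect: "NoSchedule"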
Have you ever wondered why Pods do not deploy to the control plane node? This behavior occurs because Kubernetes applies a taint to it by default.
Display any taints applied to the Kubernetes cluster.
kubectl describe nodes | grep -i taints
Example Output:
Taints:             node-role.kubernetes.io/control-plane:NoSchedule
Taints:             <none>
Taints:             <none>
Display a detailed view of the taints applied to the Kubernetes cluster nodes.
kubectl get nodes -o=custom-columns=NodeName:.metadata.name,TaintKey:.spec.taints[*].key,TaintValue:.spec.taints[*].value,TaintEffect:.spec.taints[*].effect
Example Output:
NodeName         TaintKey                                TaintValue   TaintEffect
ocne-control-1   node-role.kubernetes.io/control-plane   <none>       NoSchedule
ocne-worker-1    <none>                                  <none>       <none>
ocne-worker-2    <none>                                  <none>       <none>
Notice that the control plane node has a NoSchedule taint applied with the key node-role.kubernetes.io/control-plane; NoSchedule is one of the three possible taint effects that can be assigned to a node. This taint prevents application Pods from being scheduled to run on the control plane node unless the Pod has a matching toleration. The worker nodes, on the other hand, have no taints applied.
Apply a NoSchedule taint to the worker nodes.
kubectl taint node ocne-worker-1 dedicated=test-taint:NoSchedule
kubectl taint node ocne-worker-2 dedicated=test-taint:NoSchedule
Where the taint's Key, Value, and Effect are:
- Key = dedicated
- Value = test-taint
- Effect = NoSchedule
The kubectl taint command used above targets a single named node at a time. However, by selecting nodes with a label, you can taint multiple nodes with a single command, as shown in the sketch below.
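For example, a minimal sketch of label-based tainting, assuming a hypothetical label workload=test (this label is not set in this lab, and the taint is already applied, so the commands are shown for illustration only):
kubectl label nodes ocne-worker-1 ocne-worker-2 workload=test
kubectl taint nodes -l workload=test dedicated=test-taint:NoSchedule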
Confirm the taint applied to both worker nodes.
kubectl describe nodes | grep -i taints
Example Output:
Taints:             node-role.kubernetes.io/control-plane:NoSchedule
Taints:             dedicated=test-taint:NoSchedule
Taints:             dedicated=test-taint:NoSchedule
Applying a NoSchedule taint is not retrospective, which means that any Pods without a matching toleration that were already running on the node continue to do so unless the administrator evicts them.
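As a sketch only (not part of this lab's steps), applying the NoExecute effect instead would evict any already-running Pods that do not tolerate the taint:
kubectl taint node ocne-worker-1 dedicated=test-taint:NoExecute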
Apply a Toleration to a Pod
Next, you will create a Pod with a toleration for the taint you just applied to the worker nodes.
Create a Pod template to use.
kubectl run nginx-test --image=ghcr.io/oracle/oraclelinux9-nginx:1.20 --dry-run=client -o yaml > nginx-test.yaml
Remove the status: {} line from the generated deployment manifest file.
sed -i '$ d' nginx-test.yaml
Add the toleration to the template.
The generated template needs to include a toleration definition so the scheduler can deploy it onto a node with a matching taint.
cat << EOF | tee -a nginx-test.yaml > /dev/null
  tolerations:
  - key: "dedicated"
    value: "test-taint"
    effect: "NoSchedule"
EOF
Review the template file just created.
cat nginx-test.yaml
Example Output:
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: nginx-test
  name: nginx-test
spec:
  containers:
  - image: ghcr.io/oracle/oraclelinux9-nginx:1.20
    name: nginx-test
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
  tolerations:
  - key: "dedicated"
    value: "test-taint"
    effect: "NoSchedule"
Run the deployment manifest file.
kubectl create -f nginx-test.yaml
Check which node Nginx deployed to.
kubectl get pods -n default -o wide
Since we only specify one Nginx Pod in the YAML file, the scheduler deploys it to either ocne-worker-1 or ocne-worker-2. If the status does not display as Running the first time, retry the query a couple of times until it does.
Deploy an Application Without Configuring a Toleration
Applying the NoSchedule taint across the Kubernetes cluster means no nodes are available for new deployments without a matching toleration.
Deploy Nginx.
kubectl run nginx --image=ghcr.io/oracle/oraclelinux9-nginx:1.20
Confirm the Pod status.
kubectl get pods
Example Output:
NAME         READY   STATUS    RESTARTS   AGE
nginx        0/1     Pending   0          16s
nginx-test   1/1     Running   0          15m
Notice that the STATUS shows this deployment as having a Pending status.
Investigate why Nginx shows a status of Pending.
kubectl get events | grep nginx
Example Output:
6m26s   Normal    Scheduled          pod/nginx-test   Successfully assigned default/nginx-test to ocne-worker-2
6m26s   Normal    Pulling            pod/nginx-test   Pulling image "ghcr.io/oracle/oraclelinux9-nginx:1.20"
6m19s   Normal    Pulled             pod/nginx-test   Successfully pulled image "ghcr.io/oracle/oraclelinux9-nginx:1.20" in 6.268s (6.268s including waiting)
6m19s   Normal    Created            pod/nginx-test   Created container nginx-test
6m19s   Normal    Started            pod/nginx-test   Started container nginx-test
5m38s   Warning   FailedScheduling   pod/nginx        0/3 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 2 node(s) had untolerated taint {dedicated: test-taint}. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.
The output shows the successful deployment of the nginx-test Pod, followed by the nginx Pod stuck in the Pending status. The FailedScheduling event for the nginx Pod confirms that Kubernetes cannot schedule it because every node has an untolerated taint.
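You can also see the same scheduling failure in the Events section at the end of the Pod's describe output:
kubectl describe pod nginx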
Remove the NoSchedule Taint From a Node
You can remove taints by specifying the taint key and its taint effect with a minus (-) sign appended using the following syntax:
kubectl taint nodes <node name> <taint key>:<taint effect>-
Taints also have to be removed individually.
Remove the previously applied taint from one of the available nodes. This example removes the taint from the node that currently has no Pods deployed on it.
kubectl taint nodes <choose-a-node-with-no-deployments> dedicated=test-taint:NoSchedule-
It does not matter which node has the taint removed, but removing the taint from a currently unused node is more illustrative. Therefore, replace <choose-a-node-with-no-deployments> with the name of the worker node that has no Pods deployed on it. From here on, the example output represents a scenario where the ocne-worker-1 node has the taint removed.
Confirm the removal of the taint.
kubectl get nodes -o=custom-columns=NodeName:.metadata.name,TaintKey:.spec.taints[*].key,TaintValue:.spec.taints[*].value,TaintEffect:.spec.taints[*].effect
Example Output:
NodeName         TaintKey                                TaintValue   TaintEffect
ocne-control-1   node-role.kubernetes.io/control-plane   <none>       NoSchedule
ocne-worker-1    <none>                                  <none>       <none>
ocne-worker-2    dedicated                               test-taint   NoSchedule
Confirm that Nginx deploys.
kubectl get pods
Example Output:
NAME         READY   STATUS    RESTARTS   AGE
nginx        1/1     Running   0          3m55s
nginx-test   1/1     Running   0          8m29s
Check to which node Nginx deploys.
kubectl get pods -n default -o wide
Example Output:
NAME         READY   STATUS    RESTARTS   AGE   IP           NODE            NOMINATED NODE   READINESS GATES
nginx        1/1     Running   0          22m   10.244.1.4   ocne-worker-1   <none>           <none>
nginx-test   1/1     Running   0          36m   10.244.2.3   ocne-worker-2   <none>           <none>
Recheck the event log.
kubectl get events | grep nginx
Example Output:
37m    Normal    Scheduled          pod/nginx-test   Successfully assigned default/nginx-test to ocne-worker-2
37m    Normal    Pulling            pod/nginx-test   Pulling image "ghcr.io/oracle/oraclelinux9-nginx:1.20"
36m    Normal    Pulled             pod/nginx-test   Successfully pulled image "ghcr.io/oracle/oraclelinux9-nginx:1.20" in 6.332s (6.332s including waiting)
36m    Normal    Created            pod/nginx-test   Created container nginx-test
36m    Normal    Started            pod/nginx-test   Started container nginx-test
117s   Warning   FailedScheduling   pod/nginx        0/3 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 2 node(s) had untolerated taint {dedicated: test-taint}. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.
34s    Normal    Scheduled          pod/nginx        Successfully assigned default/nginx to ocne-worker-1
34s    Normal    Pulling            pod/nginx        Pulling image "ghcr.io/oracle/oraclelinux9-nginx:1.20"
27s    Normal    Pulled             pod/nginx        Successfully pulled image "ghcr.io/oracle/oraclelinux9-nginx:1.20" in 6.661s (6.661s including waiting)
27s    Normal    Created            pod/nginx        Created container nginx
27s    Normal    Started            pod/nginx        Started container nginx
This output confirms that removing the taint from ocne-worker-1 allowed the Kubernetes scheduler to deploy the Nginx Pod onto the ocne-worker-1 node.
Next Steps
Once mastered, taints and tolerations are flexible methods for administrators to repel Pods from specific nodes. They are especially powerful when combined with node affinity to fine-tune the Kubernetes scheduler. The most effective way to use them is to keep them both short and straightforward to avoid overcomplicating their effect. Also, remember that taints and tolerations are not the correct way to ensure Pods get scheduled to a specific node. Instead, they should be used together with node affinity to achieve this behavior.
That concludes our walkthrough of taints and tolerations and how to use them to provide flexibility in managing deployments.