Use Kubectl to Manage Kubernetes Clusters and Nodes on Oracle Cloud Native Environment
Introduction
Although graphical tools can manage Kubernetes, many administrators prefer to use command-line tools. The command-line tool provided within the Kubernetes ecosystem is called kubectl. Kubectl is a versatile tool used to deploy cluster resources and applications and to inspect their configurations and logs. Kubectl achieves this by using the Kubernetes API to authenticate with the control plane node of the Kubernetes cluster and complete any management actions requested by the administrator.
Most kubectl operations and commands allow administrators to deploy and manage applications on the Kubernetes cluster and to inspect and manage the cluster's resources.
Note: Many kubectl commands accept the --all-namespaces option. Because this option is used so often, the -A flag is available as a shorthand. This tutorial uses kubectl -A in preference to kubectl --all-namespaces.
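For example, both of the following standard commands list the pods in every namespace and produce identical output:

kubectl get pods --all-namespaces
kubectl get pods -A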
Objectives
This tutorial builds on the basic commands introduced in Introducing Kubectl with Oracle Cloud Native Environment. If this is your first encounter with kubectl, you may find it beneficial to start there. This tutorial introduces how kubectl can manage individual Kubernetes nodes and any applications deployed onto them. The specific areas of node management introduced in this tutorial are:
- Querying Cluster Information
- Querying Node information
- Deploying an example application (Nginx)
- Introducing new concepts such as:
- Cordoning/Uncordoning and Draining Nodes
- Taints and Tolerations
This tutorial uses only kubectl to perform these operations.
Prerequisites
Minimum of a 5-node Oracle Cloud Native Environment cluster:
- Operator node
- Kubernetes control plane node
- 3 Kubernetes worker nodes
Each system should have Oracle Linux installed and configured with:
- An Oracle user account (used during the installation) with sudo access
- Key-based SSH, also known as password-less SSH, between the hosts
- Installation of Oracle Cloud Native Environment
Deploy Oracle Cloud Native Environment
Note: If running in your own tenancy, read the linux-virt-labs GitHub project README.md and complete the prerequisites before deploying the lab environment.
Open a terminal on the Luna Desktop.
Clone the linux-virt-labs GitHub project.

git clone https://github.com/oracle-devrel/linux-virt-labs.git
Change into the working directory.
cd linux-virt-labs/ocne
Install the required collections.
ansible-galaxy collection install -r requirements.yaml
Update the Oracle Cloud Native Environment configuration.
cat << EOF | tee instances.yaml > /dev/null
compute_instances:
  1:
    instance_name: "ocne-operator"
    type: "operator"
  2:
    instance_name: "ocne-control-01"
    type: "controlplane"
  3:
    instance_name: "ocne-worker-01"
    type: "worker"
  4:
    instance_name: "ocne-worker-02"
    type: "worker"
  5:
    instance_name: "ocne-worker-03"
    type: "worker"
EOF
Deploy the lab environment.
ansible-playbook create_instance.yaml -e ansible_python_interpreter="/usr/bin/python3.6" -e "@instances.yaml"
The free lab environment requires the extra variable ansible_python_interpreter because it installs the RPM package for the Oracle Cloud Infrastructure SDK for Python, which places its modules under python3.6.

Important: Wait for the playbook to run successfully and reach the pause task. The Oracle Cloud Native Environment installation is complete at this stage of the playbook, and the instances are ready. Take note of the previous play, which prints the public and private IP addresses of the nodes it deploys.
Review Existing Cluster and Node Information
An essential precursor to administering any Kubernetes cluster is discovering which nodes are present, which pods are executing on those nodes, and so on. This knowledge allows you to plan for temporarily disabling pod scheduling on nodes while they undergo any required maintenance or troubleshooting.
Open a terminal and connect via ssh to the ocne-control node.
ssh oracle@<ip_address_of_ol_node>
Query a complete list of existing nodes.
kubectl get nodes
Note that the output returns a list of all the deployed nodes, along with their status and the Kubernetes version each is running.
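For a broader view before digging into individual nodes, a couple of additional read-only queries can help. These are standard kubectl commands; the -o wide output adds each node's IP addresses, OS image, kernel version, and container runtime:

# Show the control plane and cluster service endpoints
kubectl cluster-info
# List the nodes with extended details
kubectl get nodes -o wide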
Request more details about one of the nodes.
kubectl describe node <your-preferred-node-name>
This command returns a wealth of information related to the Kubernetes node, starting with the following:
- Name: confirms the Kubernetes node name
- Labels: key/value pairs used to identify object attributes relevant to end-users
- Annotations: key/value pairs used to store extra information about a Kubernetes node
- Unschedulable: false indicates the node accepts any deployed pod
Example Output:
[oracle@ocne-control-01 ~]$ kubectl describe node ocne-worker-01
Name:               ocne-worker-01
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/zone=EU-FRANKFURT-1-AD-1
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ocne-worker-01
                    kubernetes.io/os=linux
                    oci.oraclecloud.com/fault-domain=FAULT-DOMAIN-2
                    topology.kubernetes.io/zone=EU-FRANKFURT-1-AD-1
Annotations:        alpha.kubernetes.io/provided-node-ip: 10.0.0.160
                    csi.volume.kubernetes.io/nodeid: {"blockvolume.csi.oraclecloud.com":"ocne-worker-01","fss.csi.oraclecloud.com":"ocne-worker-01"}
                    flannel.alpha.coreos.com/backend-data: {"VNI":1,"VtepMAC":"3a:c1:fe:56:38:93"}
                    flannel.alpha.coreos.com/backend-type: vxlan
                    flannel.alpha.coreos.com/kube-subnet-manager: true
                    flannel.alpha.coreos.com/public-ip: 10.0.0.160
                    kubeadm.alpha.kubernetes.io/cri-socket: unix:///var/run/crio/crio.sock
                    node.alpha.kubernetes.io/ttl: 0
                    oci.oraclecloud.com/compartment-id: ocid1.compartment.oc1..aaaaaaaau2g2k23u6mp3t43ky3i4ky7jpyeiqcdcobpbcb7z6vjjlrdnuufq
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Mon, 14 Aug 2023 11:05:34 +0000
Taints:             <none>
Unschedulable:      false
...
...
Events:
  Type     Reason                   Age                From             Message
  ----     ------                   ----               ----             -------
  Normal   Starting                 12m                kube-proxy
  Normal   NodeHasSufficientMemory  12m (x8 over 12m)  kubelet          Node ocne-worker-01 status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    12m (x8 over 12m)  kubelet          Node ocne-worker-01 status is now: NodeHasNoDiskPressure
  Normal   RegisteredNode           12m                node-controller  Node ocne-worker-01 event: Registered Node ocne-worker-01 in Controller
  Normal   Starting                 5m18s              kubelet          Starting kubelet.
  Normal   NodeHasSufficientMemory  5m18s              kubelet          Node ocne-worker-01 status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    5m18s              kubelet          Node ocne-worker-01 status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     5m18s              kubelet          Node ocne-worker-01 status is now: NodeHasSufficientPID
  Normal   NodeNotReady             5m18s              kubelet          Node ocne-worker-01 status is now: NodeNotReady
  Normal   NodeAllocatableEnforced  5m18s              kubelet          Updated Node Allocatable limit across pods
  Normal   NodeReady                5m18s              kubelet          Node ocne-worker-01 status is now: NodeReady
  Normal   RegisteredNode           5m18s              node-controller  Node ocne-worker-01 event: Registered Node ocne-worker-01 in Controller
NOTE: Don't clear the output from your terminal because the next steps highlight some areas of interest in this output.
This output excerpt shows the internal and external IP addresses and the hostname assigned to the node, along with its capacity.
Example Output (excerpt):
Addresses:
  InternalIP:  10.0.0.160
  ExternalIP:  130.61.232.251
  Hostname:    ocne-worker-01
Capacity:
  cpu:                2
  ephemeral-storage:  37177616Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             32568508Ki
  pods:               110
A little further down the output, the Non-terminated Pods: section shows which pods are running on the node, with details of the CPU and memory requests and any limits per pod.

Example Output (excerpt):
Non-terminated Pods:          (3 in total)
  Namespace                   Name                   CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                   ------------  ----------  ---------------  -------------  ---
  kube-system                 csi-oci-node-87qf5     0 (0%)        0 (0%)      0 (0%)           0 (0%)         6m48s
  kube-system                 kube-flannel-ds-cpgnk  100m (5%)     100m (5%)   50Mi (0%)        50Mi (0%)      12m
  kube-system                 kube-proxy-jfw2w       0 (0%)        0 (0%)      0 (0%)           0 (0%)         12m
Finally, the Allocated resources: section summarizes the resource requests and limits currently allocated on the node.

Example Output (excerpt):
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                200m (10%)  200m (10%)
  memory             70Mi (0%)   80Mi (0%)
  ephemeral-storage  0 (0%)      0 (0%)
  hugepages-1Gi      0 (0%)      0 (0%)
  hugepages-2Mi      0 (0%)      0 (0%)
In summary, the kubectl describe node command provides the administrator with a wealth of information about a Kubernetes node that can assist with planning or troubleshooting deployments.
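When only one or two of these fields are needed, kubectl can print them directly instead of the full describe output. A minimal sketch using standard output options (the JSONPath expression and column choices below are illustrative assumptions):

# Print just the node's internal IP address
kubectl get node ocne-worker-01 -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}'
# Show the name and unschedulable flag for every node
kubectl get nodes -o custom-columns=NAME:.metadata.name,UNSCHEDULABLE:.spec.unschedulable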
Deploy Nginx
Currently, no applications are deployed to the three worker nodes, which makes demonstrating the effects of any Node management commands more difficult. The following steps will deploy an Nginx pod onto each worker node.
Generate a deployment file to create Nginx Deployments onto each of the three worker nodes.
cat << EOF | tee ~/deployment.yaml > /dev/null
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 3 # tells deployment to run 3 pods matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.24.0
        ports:
        - containerPort: 80
EOF
Where:

- name: nginx-deployment represents the deployment's name
- replicas: 3 represents the number of pods to deploy
- image: nginx:1.24.0 represents the Nginx version the Kubernetes pod will deploy
Note: There are numerous ways of describing a deployment in its associated YAML file. For more detail, refer to the upstream documentation.
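If you prefer not to write the manifest by hand, kubectl can generate an equivalent starting point. This is a sketch that assumes a kubectl release recent enough to support the --replicas flag on create deployment; the generated file name is arbitrary:

# Generate a Deployment manifest locally without creating anything in the cluster
kubectl create deployment nginx-deployment --image=nginx:1.24.0 --replicas=3 --dry-run=client -o yaml > ~/generated-deployment.yaml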
Deploy Nginx onto the three OCNE worker Nodes.
kubectl create -f ~/deployment.yaml
The -f switch indicates which YAML file to use and its location.

Confirm Nginx deployed across all three worker nodes.
kubectl get pods --namespace default -o wide
Example Output:
[oracle@ocne-control-01 ~]$ kubectl get pods --namespace default -o wide
NAME                                READY   STATUS    RESTARTS   AGE   IP           NODE             NOMINATED NODE   READINESS GATES
nginx-deployment-5c46dbdf89-5qnsw   1/1     Running   0          86m   10.244.3.2   ocne-worker-01   <none>           <none>
nginx-deployment-5c46dbdf89-j6dcl   1/1     Running   0          86m   10.244.3.3   ocne-worker-02   <none>           <none>
nginx-deployment-5c46dbdf89-lcjv6   1/1     Running   0          86m   10.244.2.4   ocne-worker-03   <none>           <none>
(Optional) It is also possible to review which pods are deployed onto a single node, for example, ocne-worker-01.
kubectl get pods --field-selector spec.nodeName=ocne-worker-01 -o wide
Example Output:
[oracle@ocne-control-01 ~]$ kubectl get pods --field-selector spec.nodeName=ocne-worker-01 -o wide
NAME                                READY   STATUS    RESTARTS   AGE   IP           NODE             NOMINATED NODE   READINESS GATES
nginx-deployment-5c46dbdf89-5qnsw   1/1     Running   0          89m   10.244.3.2   ocne-worker-01   <none>           <none>
These options are helpful in busy Kubernetes environments, where potentially many nodes each host many deployments, when planning maintenance operations.
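Field selectors also combine with label selectors. Because the deployment labels its pods with app: nginx, the following standard query narrows the listing to those pods on a single node (adjust the node name to suit your environment):

kubectl get pods -l app=nginx --field-selector spec.nodeName=ocne-worker-01 -o wide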
Cordoning a Node
Kubernetes nodes occasionally require maintenance, which may involve replacing physical hardware or updating the node's operating system or kernel. Cordons and drains are two mechanisms that help safely prepare the target node so that any applications deployed on the node do not affect the end user's experience while the maintenance takes place.
Note: Manually using the kubectl cordon or kubectl uncordon commands to upgrade the Kubernetes environment with Oracle Cloud Native Environment is not supported. The olcnectl module update command, when used with a new Kubernetes version, traverses the deployed nodes, invoking the kubectl cordon and kubectl drain commands, followed by the required steps to process the in-place upgrade, before finally issuing the kubectl uncordon command, all in sequence to smoothly upgrade the Kubernetes cluster without incurring any application outages.
Identify existing nodes.
kubectl get nodes
Cordon one of the worker nodes.
kubectl cordon ocne-worker-01
Example Output:
[oracle@ocne-control-01 ~]$ kubectl cordon ocne-worker-01
node/ocne-worker-01 cordoned
Confirm the node is 'cordoned'.
kubectl get nodes
Example Output:
[oracle@ocne-control-01 ~]$ kubectl get nodes
NAME              STATUS                     ROLES           AGE   VERSION
ocne-control-01   Ready                      control-plane   21m   v1.28.3+3.el8
ocne-worker-01    Ready,SchedulingDisabled   <none>          20m   v1.28.3+3.el8
ocne-worker-02    Ready                      <none>          21m   v1.28.3+3.el8
ocne-worker-03    Ready                      <none>          20m   v1.28.3+3.el8
Notice that the cordoned node now lists with the SchedulingDisabled status. Consequently, no new applications will deploy to that node, while any existing applications continue to service current sessions.
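Under the hood, cordoning a node simply sets its spec.unschedulable field to true. The following read-only check confirms this on the cordoned node; the commented patch commands are shown only to illustrate the equivalence and are not part of this tutorial's steps:

# Prints "true" while the node is cordoned
kubectl get node ocne-worker-01 -o jsonpath='{.spec.unschedulable}{"\n"}'
# For reference, cordon and uncordon are equivalent to patching that field:
# kubectl patch node ocne-worker-01 -p '{"spec":{"unschedulable":true}}'
# kubectl patch node ocne-worker-01 -p '{"spec":{"unschedulable":false}}'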
Draining a Node
Before undertaking any maintenance on the newly cordoned node, any pods deployed onto that node need to be removed or evicted. The drain command gracefully terminates the pods' containers. Once the drain command completes, it is safe to carry out whatever actions are planned for that node, for example, scheduled maintenance such as upgrading the operating system or the kernel.
Drain the ocne-worker-01 node.

kubectl drain ocne-worker-01 --delete-emptydir-data --ignore-daemonsets --force
Note: Why are the --ignore-daemonsets and --force options needed? The --ignore-daemonsets option lets the drain proceed even though DaemonSet-managed pods cannot be evicted (the DaemonSet controller would immediately recreate them on the node), and --force allows the drain to delete pods that are not managed by a controller.

Example Output:
[oracle@ocne-control-01 ~]$ kubectl drain ocne-worker-01 --delete-emptydir-data --ignore-daemonsets --force
node/ocne-worker-01 already cordoned
Warning: ignoring DaemonSet-managed Pods: kube-system/csi-oci-node-87qf5, kube-system/kube-flannel-ds-cpgnk, kube-system/kube-proxy-jfw2w
evicting pod default/nginx-deployment-5c46dbdf89-pdsp8
pod/nginx-deployment-5c46dbdf89-pdsp8 evicted
node/ocne-worker-01 drained
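For future maintenance windows, you can preview which pods a drain would evict without changing anything. This sketch assumes a kubectl release that supports the --dry-run flag on drain:

kubectl drain ocne-worker-01 --delete-emptydir-data --ignore-daemonsets --force --dry-run=client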
Confirm Nginx is no longer running on ocne-worker-01.
kubectl get pods --namespace default -o wide
Example Output:
[oracle@ocne-control-01 ~]$ kubectl get pods --namespace default -o wide
NAME                                READY   STATUS    RESTARTS   AGE     IP           NODE             NOMINATED NODE   READINESS GATES
nginx-deployment-5c46dbdf89-75kk2   1/1     Running   0          111s    10.244.3.4   ocne-worker-03   <none>           <none>
nginx-deployment-5c46dbdf89-g58dr   1/1     Running   0          4m34s   10.244.3.3   ocne-worker-03   <none>           <none>
nginx-deployment-5c46dbdf89-nwhct   1/1     Running   0          4m34s   10.244.1.6   ocne-worker-02   <none>           <none>
Note: Because the deployment.yaml file instructed Kubernetes to deploy three pods, notice there are still three pods listed in the output. This example shows the third pod has redeployed to ocne-worker-03. However, this may differ in your environment because the Kubernetes scheduler may decide differently.
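To observe this rescheduling as it happens during a future drain, the standard --watch flag streams pod changes until you press Ctrl+C:

kubectl get pods --namespace default -o wide --watch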
Confirm the kubectl cordon Command Works
Delete the existing Nginx deployment.
kubectl delete deployment nginx-deployment
Example Output:
[oracle@ocne-control-01 ~]$ kubectl delete deployment nginx-deployment
deployment.apps "nginx-deployment" deleted
Confirm Nginx is no longer present.
kubectl get pods --namespace default -o wide
Example Output:
[oracle@ocne-control-01 ~]$ kubectl get pods --namespace default -o wide
No resources found in default namespace.
Deploy Nginx again.
kubectl create -f ~/deployment.yaml
Confirm no pods deploy onto a Cordoned worker node.
kubectl get pods --namespace default -o wide
Notice that pods do not deploy onto the 'cordoned' node, which is ocne-worker-01 in this example.
Example Output:
[oracle@ocne-control-01 ~]$ kubectl get pods --namespace default -o wide
NAME                                READY   STATUS    RESTARTS   AGE   IP           NODE             NOMINATED NODE   READINESS GATES
nginx-deployment-5c46dbdf89-bsf54   1/1     Running   0          7s    10.244.2.9   ocne-worker-03   <none>           <none>
nginx-deployment-5c46dbdf89-cm25f   1/1     Running   0          7s    10.244.3.6   ocne-worker-02   <none>           <none>
nginx-deployment-5c46dbdf89-qwtg7   1/1     Running   0          7s    10.244.2.8   ocne-worker-03   <none>           <none>
Note: Which node displays with two pods may differ in your environment.
This output confirms that the kubectl cordon command issued to the ocne-worker-01 node has successfully prevented any pod from deploying onto it.
Uncordon the Node
Once the maintenance is complete, the previously cordoned node is returned to the pool using the kubectl uncordon command.
Uncordon the ocne-worker-01 node.
kubectl uncordon ocne-worker-01
Example Output:
[oracle@ocne-control-01 ~]$ kubectl uncordon ocne-worker-01
node/ocne-worker-01 uncordoned
Confirm the ocne-worker-01 node is available for scheduling again.
kubectl get nodes
Example Output:
[oracle@ocne-control-01 ~]$ kubectl get nodes
NAME              STATUS   ROLES           AGE   VERSION
ocne-control-01   Ready    control-plane   11m   v1.28.3+3.el8
ocne-worker-01    Ready    <none>          10m   v1.28.3+3.el8
ocne-worker-02    Ready    <none>          10m   v1.28.3+3.el8
ocne-worker-03    Ready    <none>          10m   v1.28.3+3.el8
Note: The SchedulingDisabled flag under the STATUS column has been removed, confirming that the ocne-worker-01 node has been returned to the pool.
(Optional) Confirm the Uncordoned Node is Available Again
Note: Please be aware that the following steps are not required on a live system because they would cause an application outage. Under normal circumstances, the Kubernetes scheduler determines where individual pods are deployed based on many factors, such as the overall load across the cluster, and is free to schedule or evict pods as it sees fit. The steps provided here are for illustrative purposes only, demonstrating that once a node is uncordoned, it is available for the Kubernetes scheduler to place pods on whenever it chooses.
Delete the existing Nginx deployment.
kubectl delete deployment nginx-deployment
Deploy Nginx again.
kubectl create -f ~/deployment.yaml
Confirm Nginx has deployed across all three worker Nodes.
kubectl get pods --namespace default -o wide
Example Output:
[oracle@ocne-control-01 ~]$ kubectl get pods --namespace default -o wide
NAME                                READY   STATUS    RESTARTS   AGE   IP            NODE             NOMINATED NODE   READINESS GATES
nginx-deployment-5c46dbdf89-6wrd6   1/1     Running   0          8s    10.244.3.7    ocne-worker-02   <none>           <none>
nginx-deployment-5c46dbdf89-w55wt   1/1     Running   0          8s    10.244.2.10   ocne-worker-03   <none>           <none>
nginx-deployment-5c46dbdf89-x8vwz   1/1     Running   0          8s    10.244.1.7    ocne-worker-01   <none>           <none>
Notice that, as expected, all worker nodes now host an Nginx pod, confirming the node is available for deployments again.
Introducing Taints and Tolerations
Managing where application pods deploy within a Kubernetes cluster is an essential skill for a Kubernetes administrator. Effective scheduling management can help companies use their resources efficiently, control costs, and manage applications at scale across a cluster. This section does not provide in-depth coverage of this complicated area of Kubernetes administration. Instead, it introduces Taints and Tolerations and how they can aid an administrator in their role.
What are Taints and Tolerations?
Taints represent a Kubernetes property assigned to nodes to repel certain pods when they are deployed onto the cluster. Tolerations, on the other hand, represent a property attached to an application as part of its deployment.yaml file, indicating that the pod, or pods, can be scheduled onto any node with a matching taint. Setting these properties is how a Kubernetes administrator can exercise some control over which nodes an application's pods deploy onto.
Are they guaranteed to work?
Taints and tolerations help repel pods from a defined node. However, they cannot ensure that a specific pod deploys to a predetermined node. That requires another advanced scheduling technique, known as node affinity, which administrators use together with taints and tolerations to control where pods execute; node affinity is outside the scope of this tutorial.
Why use Taints and Tolerations?
The most common use cases for which an administrator would choose to use taints and tolerations include the following:
Configure dedicated nodes: Taints and tolerations, combined with a node affinity definition, can help ensure matching pods deploy to these nodes.
Indicate nodes with special hardware: When a pod requires specific hardware to be present to either run or run most efficiently, using taints and tolerations allows administrators to ensure the most relevant pods will be scheduled onto these nodes by the Kubernetes scheduler process.
Eviction of pods based on node conditions: If an administrator assigns a NoExecute taint to a node that already has pods running on it, then any existing pods that do not possess a matching toleration will eventually be evicted from that node automatically.
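As an illustration of the special-hardware use case above, a node with GPUs might carry a descriptive taint so that only pods tolerating it are scheduled there. The node name, key, and value below are hypothetical:

kubectl taint nodes gpu-worker-01 hardware=gpu:NoSchedule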
Review Existing Taints across all Nodes
Before making any changes to the existing nodes, the administrator must establish whether any taints are already applied across the existing nodes.
Confirm what taints exist currently across all of the nodes.
kubectl get nodes -o=custom-columns=NodeName:.metadata.name,TaintKey:.spec.taints[*].key,TaintValue:.spec.taints[*].value,TaintEffect:.spec.taints[*].effect
Example Output:
[oracle@ocne-control-01 ~]$ kubectl get nodes -o=custom-columns=NodeName:.metadata.name,TaintKey:.spec.taints[*].key,TaintValue:.spec.taints[*].value,TaintEffect:.spec.taints[*].effect
NodeName          TaintKey                                TaintValue   TaintEffect
ocne-control-01   node-role.kubernetes.io/control-plane   <none>       NoSchedule
ocne-worker-01    <none>                                  <none>       <none>
ocne-worker-02    <none>                                  <none>       <none>
ocne-worker-03    <none>                                  <none>       <none>
Notice that only the ocne-control-01 node has an existing taint assigned to it, which is node-role.kubernetes.io/control-plane:NoSchedule, and prevents pods from deploying onto the ocne-control-01 node itself.

Note: The node-role.kubernetes.io/control-plane:NoSchedule setting is automatically applied by the kubeadm tool while it bootstraps the Kubernetes control-plane node.
Apply a Taint to a Worker Node
Taints are similar to labels and can be applied to any number of nodes within the Kubernetes cluster. The Kubernetes scheduler only schedules pods onto a tainted node when the deployment.yaml file defines a toleration that allows them to run there.
Taints are applied to a node using a command in the format: kubectl taint nodes nodename key1=value1:taint-effect. The command declares a taint as a key-value pair plus an effect, where key1 is the key, value1 is the value, and taint-effect specifies how the taint behaves. There are three taint effects available:
- The NoSchedule or strong effect taint instructs the Kubernetes scheduler to allow only newly deployed pods possessing tolerations to execute on this node. Any existing pods will continue executing unaffected.
- The PreferNoSchedule or soft effect taint instructs the Kubernetes scheduler to try to avoid scheduling newly deployed pods on this node unless they have a toleration.
- The NoExecute taint instructs the Kubernetes scheduler to evict any running pods from the node unless they have tolerations for the tainted node.
Delete the existing Nginx deployment.
kubectl delete deployment nginx-deployment
Example Output:
[oracle@ocne-control-01 ~]$ kubectl delete deployment nginx-deployment
deployment.apps "nginx-deployment" deleted
Apply a NoSchedule taint to the ocne-worker-01 node.

kubectl taint nodes ocne-worker-01 app=nginx:NoSchedule
Example Output:
[oracle@ocne-control-01 ~]$ kubectl taint nodes ocne-worker-01 app=nginx:NoSchedule
node/ocne-worker-01 tainted
Where:

- The key and value (app=nginx) match the app: nginx label defined in the spec: section of the deployment.yaml file.
- The taint effect to apply is NoSchedule.
Confirm the taint has been applied to the ocne-worker-01 node.
kubectl get nodes -o=custom-columns=NodeName:.metadata.name,TaintKey:.spec.taints[*].key,TaintValue:.spec.taints[*].value,TaintEffect:.spec.taints[*].effect
Example Output:
[oracle@ocne-control-01 ~]$ kubectl get nodes -o=custom-columns=NodeName:.metadata.name,TaintKey:.spec.taints[*].key,TaintValue:.spec.taints[*].value,TaintEffect:.spec.taints[*].effect
NodeName          TaintKey                                TaintValue   TaintEffect
ocne-control-01   node-role.kubernetes.io/control-plane   <none>       NoSchedule
ocne-worker-01    app                                     nginx        NoSchedule
ocne-worker-02    <none>                                  <none>       <none>
ocne-worker-03    <none>                                  <none>       <none>
Where:

- The TaintKey column shows the value app for ocne-worker-01
- The TaintValue column shows the value nginx for ocne-worker-01
Deploy Nginx again.
kubectl create -f ~/deployment.yaml
Confirm no pods have deployed onto the tainted ocne-worker-01 node.
kubectl get pods --namespace default -o wide
Example Output:
[oracle@ocne-control-01 ~]$ kubectl get pods --namespace default -o wide
NAME                                READY   STATUS    RESTARTS   AGE   IP           NODE             NOMINATED NODE   READINESS GATES
nginx-deployment-5c46dbdf89-kt9g6   1/1     Running   0          8s    10.244.2.6   ocne-worker-02   <none>           <none>
nginx-deployment-5c46dbdf89-twh86   1/1     Running   0          8s    10.244.2.5   ocne-worker-02   <none>           <none>
nginx-deployment-5c46dbdf89-xd5jr   1/1     Running   0          8s    10.244.3.6   ocne-worker-03   <none>           <none>
Notice that, as expected, no pods have been deployed onto the ocne-worker-01 node, demonstrating the effect of the NoSchedule taint.
Define a Toleration in a Deployment File
The next step is to use a deployment file containing a toleration that allows deploying new pods to a node containing a taint.
The first step is to delete the existing Nginx deployment.
kubectl delete deployment nginx-deployment
Create a new deployment file to deploy Nginx to the cluster.
cat << EOF | tee ~/deployment-toleration.yaml > /dev/null
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 3 # tells deployment to run 3 pods matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.24.0
        ports:
        - containerPort: 80
      tolerations:
      - key: "app"
        operator: "Equal"
        value: "nginx"
        effect: "NoSchedule"
EOF
Where the toleration is described in the last section of the deployment descriptor, as shown below:
tolerations:
- key: "app"
  operator: "Equal"
  value: "nginx"
  effect: "NoSchedule"
Remember that the taint declared earlier was app=nginx:NoSchedule, where:

- The key used was app
- The value used was nginx

That taint matches the values defined in the toleration section of the deployment file's YAML. The toleration uses the Equal operator to ensure the taint value matches the toleration. If this matches, the Kubernetes scheduler can deploy the pod onto the node. If not, the Kubernetes scheduler will not deploy the pod onto the node.
Deploy Nginx using the deployment-toleration.yaml file.

kubectl create -f ~/deployment-toleration.yaml
Example Output:
[oracle@ocne-control-01 ~]$ kubectl create -f ~/deployment-toleration.yaml
deployment.apps/nginx-deployment created
Confirm the deployment used all three worker nodes.
kubectl get pods --namespace default -o wide
Example Output:
[oracle@ocne-control-01 ~]$ kubectl get pods --namespace default -o wide
NAME                                READY   STATUS    RESTARTS   AGE   IP           NODE             NOMINATED NODE   READINESS GATES
nginx-deployment-86d6d8c585-8jsxq   1/1     Running   0          8s    10.244.3.6   ocne-worker-03   <none>           <none>
nginx-deployment-86d6d8c585-g65xw   1/1     Running   0          8s    10.244.1.3   ocne-worker-01   <none>           <none>
nginx-deployment-86d6d8c585-j7czg   1/1     Running   0          8s    10.244.2.6   ocne-worker-02   <none>           <none>
Notice the output confirms that an nginx-deployment-86d6d8c585-xxxxx pod deployed onto all three ocne-worker nodes, confirming the toleration has enabled the Kubernetes scheduler to deploy the pods across all three available nodes.

For now, this serves only as an introduction to the wide variety of options available to a Kubernetes administrator to fine-tune the deployment of applications and the broader maintenance of the Kubernetes cluster. This ability will eventually become one of several tools the administrator uses to manage and troubleshoot the cluster for which they're responsible.
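Optionally, you can verify that the toleration was recorded on the deployment and then remove the taint once it is no longer needed. The JSONPath expression is an illustrative assumption about which field to print; the trailing hyphen is the standard syntax for removing a taint:

# Print the tolerations defined in the deployment's pod template
kubectl get deployment nginx-deployment -o jsonpath='{.spec.template.spec.tolerations}{"\n"}'
# Remove the taint from the worker node (note the trailing "-")
kubectl taint nodes ocne-worker-01 app=nginx:NoSchedule-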
Summary
This concludes this brief introduction to how an administrator can use the kubectl command-line tool to manage pod scheduling operations on a Kubernetes cluster. The examples introduced here cover the general scheduling principles on a Kubernetes cluster. If you want to learn more, please refer to the official documentation for more details.