Deploy an Internal Load Balancer with Oracle Cloud Native Environment
Introduction
Oracle Cloud Native Environment is a fully integrated suite for the development and management of cloud native applications. The Kubernetes module is the core module. It is used to deploy and manage containers and also automatically installs and configures CRI-O, runC and Kata Containers. CRI-O manages the container runtime for a Kubernetes cluster. The runtime may be either runC or Kata Containers.
Objectives
This tutorial/lab demonstrates how to:
- Configure the Kubernetes cluster with an internal load balancer to enable high availability
- Configure Oracle Cloud Native Environment on a 5-node cluster
- Verify keepalived failover between the control plane nodes completes successfully
Support Note: Using the internal load balancer is NOT recommended for production deployments. Instead, please use a correctly configured (external) load-balancer.
Prerequisites
This section lists the host systems required to perform the steps in this tutorial. You need:
6 Oracle Linux systems to use as:
- Operator node (ocne-operator)
- 3 Kubernetes control plane nodes (ocne-control01, ocne-control02, ocne-control03 )
- 2 Kubernetes worker nodes (ocne-worker01, ocne-worker02)
Note: In a production environment, it is recommended that you have a cluster with at least five control plane nodes and at least three worker nodes.
A virtual IP address for the primary control plane node. This IP address must not be in use on any node, and is assigned dynamically by the load balancer to whichever control plane node is acting as the primary controller.
Oracle Support Disclaimer: If you are deploying to Oracle Cloud Infrastructure, your tenancy requires enabling a new feature introduced in OCI: Layer 2 Networking for VLANs within your virtual cloud networks (VCNs). The OCI Layer 2 Networking feature is not generally available, although the tenancy for the free lab environment does have this feature enabled.
If you have a use case, please work with your technical team to get your tenancy listed to use this feature.

Each system should have a minimum of the following installed:
- Latest Oracle Linux 8 (x86_64) installed and running the Unbreakable Enterprise Kernel Release 6 (UEK R6)
This environment is pre-configured with the following:
- Created an oracle user account (used during the install)
- Granted the oracle account sudo access
- Set up key-based SSH, also known as passwordless SSH, between the instances
- Oracle Cloud Native Environment Release 1.5 installed (but no environment created)
- VLAN created and IPv4 addresses assigned
Set Up Lab Environment
Note: When using the free lab environment, see Oracle Linux Lab Basics for connection and other usage instructions.
This lab involves multiple systems, each of which requires different steps to be performed. It is recommended to start by opening a terminal window or tab and connecting to each node. This avoids you having to repeatedly log in and out. The nodes are:
- ocne-operator
- ocne-control01
- ocne-control02
- ocne-control03
- ocne-worker01
- ocne-worker02
Open a terminal and connect via ssh to each of the nodes.
ssh oracle@<ip_address_of_ol_node>
Set Firewall Rules on Control Plane Nodes
(On all control plane nodes) Set the firewall rules and enable the Virtual Router Redundancy Protocol (VRRP) on all of the control plane nodes.
sudo firewall-cmd --add-port=6444/tcp --zone=public --permanent
sudo firewall-cmd --add-protocol=vrrp --zone=public --permanent
sudo firewall-cmd --reload
Note: This must be completed before proceeding to ensure the load balancer process can communicate between the control plane nodes.
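(Optional) (On any control plane node) If you want to double-check that the rules took effect, you can list the active configuration for the public zone. This is a minimal verification sketch; the output should include port 6444/tcp and the vrrp protocol added above.

# List the ports and protocols currently allowed in the public zone
sudo firewall-cmd --list-ports --zone=public
sudo firewall-cmd --list-protocols --zone=public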
Create a Platform CLI Configuration File
Administrators can use a configuration file to simplify creating and managing environments and modules. The configuration file, written in valid YAML syntax, includes all information about the environments and modules to create. Using a configuration file saves repeated entries of Platform CLI command options.
Note: If more than one control plane node is entered in the myenvironment.yaml file used to configure Oracle Cloud Native Environment, then olcnectl requires that details of the virtual IP address are also entered into the myenvironment.yaml file. Achieve this by adding a new argument (virtual-ip: <enter-your-ip-here>) to the myenvironment.yaml file, for example:
Example Output:
[oracle@ocne-operator ~]$ cat myenvironment.yaml
environments:
  - environment-name: myenvironment
    globals:
      api-server: 127.0.0.1:8091
      secret-manager-type: file
      olcne-ca-path: /etc/olcne/configs/certificates/production/ca.cert
      olcne-node-cert-path: /etc/olcne/configs/certificates/production/node.cert
      olcne-node-key-path: /etc/olcne/configs/certificates/production/node.key
    modules:
      - module: kubernetes
        name: mycluster
        args:
          container-registry: container-registry.oracle.com/olcne
          virtual-ip: 10.0.12.111
          master-nodes: ocne-control01:8090,ocne-control02:8090,ocne-control03:8090
          worker-nodes: ocne-worker01:8090,ocne-worker02:8090
          selinux: enforcing
          restrict-service-externalip: true
          restrict-service-externalip-ca-cert: /etc/olcne/configs/certificates/restrict_external_ip/production/ca.cert
          restrict-service-externalip-tls-cert: /etc/olcne/configs/certificates/restrict_external_ip/production/node.cert
          restrict-service-externalip-tls-key: /etc/olcne/configs/certificates/restrict_external_ip/production/node.key
During lab deployment, a configuration file is automatically generated and ready to use in the exercise. More information on manually creating a configuration file is in the documentation at Using a Configuration File.
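For comparison, the globals shown above could instead be passed directly as Platform CLI options on every command. This is only a minimal sketch to illustrate why a configuration file is more convenient; in practice the certificate path options from the globals section would also need to be supplied.

# Without a configuration file, each olcnectl command needs its options repeated, for example:
olcnectl environment create \
  --api-server 127.0.0.1:8091 \
  --environment-name myenvironment \
  --secret-manager-type file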
Update the Configuration File
(On ocne-operator) View the configuration file contents.
cat ~/myenvironment.yaml
(On ocne-operator) Add the virtual-ip value to the myenvironment.yaml file.
sed -i '14i\ virtual-ip: 10.0.12.111\' ~/myenvironment.yaml
(On ocne-operator) Confirm the virtual-ip value has been added to the myenvironment.yaml file.
cat ~/myenvironment.yaml
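Alternatively, grep can confirm the line was inserted and show where it landed. The value should appear once, indented so that it lines up with the other entries under args.

# Show the inserted line together with its line number
grep -n 'virtual-ip' ~/myenvironment.yaml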
Create the Environment and Kubernetes Module
(On ocne-operator) Create the environment.
cd ~
olcnectl environment create --config-file myenvironment.yaml
Example Output:
[oracle@ocne-operator ~]$ olcnectl environment create --config-file myenvironment.yaml
Environment myenvironment created.
(On ocne-operator) Create the Kubernetes module.
olcnectl module create --config-file myenvironment.yaml
Example Output:
[oracle@ocne-operator ~]$ olcnectl module create --config-file myenvironment.yaml
Modules created successfully.
(On ocne-operator) Validate the Kubernetes module.
olcnectl module validate --config-file myenvironment.yaml
Example Output:
[oracle@ocne-operator ~]$ olcnectl module validate --config-file myenvironment.yaml
Validation of module mycluster succeeded.
In this example, there are no validation errors. If there are any errors, the commands required to fix the nodes are provided as output of this command.
(On ocne-operator) Install the Kubernetes module.
olcnectl module install --config-file myenvironment.yaml
The deployment of Kubernetes to the nodes may take several minutes to complete.
Example Output:
[oracle@ocne-operator ~]$ olcnectl module install --config-file myenvironment.yaml
Modules installed successfully.
(On ocne-operator) Validate the deployment of the Kubernetes module.
olcnectl module instances --config-file myenvironment.yaml
Example Output:
[oracle@ocne-operator ~]$ olcnectl module instances --config-file myenvironment.yaml
INSTANCE          MODULE      STATE
10.0.12.11:8090   node        installed
10.0.12.12:8090   node        installed
10.0.12.13:8090   node        installed
10.0.12.21:8090   node        installed
10.0.12.22:8090   node        installed
mycluster         kubernetes  installed
[oracle@ocne-operator ~]$
Set up kubectl
(On all control plane nodes) Set up the kubectl command.

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
export KUBECONFIG=$HOME/.kube/config
echo 'export KUBECONFIG=$HOME/.kube/config' >> $HOME/.bashrc
(On any control plane node) Verify kubectl works, and that the install completed successfully with all nodes listed as being in the Ready status.

kubectl get nodes
Example Output:
[oracle@ocne-control01 ~]$ kubectl get nodes
NAME             STATUS   ROLES                  AGE     VERSION
ocne-control01   Ready    control-plane,master   9m3s    v1.23.7+1.el8
ocne-control02   Ready    control-plane,master   7m47s   v1.23.7+1.el8
ocne-control03   Ready    control-plane,master   6m41s   v1.23.7+1.el8
ocne-worker01    Ready    <none>                 8m32s   v1.23.7+1.el8
ocne-worker02    Ready    <none>                 8m26s   v1.23.7+1.el8
[oracle@ocne-control01 ~]$
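(Optional) (On any control plane node) You can also check which endpoint kubectl is talking to. This is a quick sanity-check sketch, assuming the kubeconfig copied from /etc/kubernetes/admin.conf points at the cluster through the load-balanced endpoint rather than an individual node.

# Print the control plane endpoint in use; with the internal load balancer in
# place, this is expected to reference the virtual IP rather than a node IP.
kubectl cluster-info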
Confirm Failover Between Control Plane Nodes
The Oracle Cloud Native Environment installation with three control plane nodes behind an internal load balancer is complete.
The following steps confirm that the internal load balancer (using keepalived) detects when the primary control plane node fails and passes control to one of the surviving control plane nodes. Likewise, when the 'missing' node recovers, it can automatically rejoin the cluster.
Locate the Primary Control Plane Node
Determine which control plane node currently holds the virtual IP address.
(On all control plane nodes) List the present network devices and IP addresses.
ip -br a
The -br option passed to the ip command provides a brief summary of the IP address information assigned to each device.

Look at each of the results, and find which host's output contains the virtual IP address associated with the ens5 NIC. In the free lab environment, the virtual IP address used by the keepalived daemon is set to 10.0.12.111.

ens5 UP 10.0.12.11/24 10.0.12.111/32 fe80::41f6:2a0d:9a89:13d0/64
Example Output:
[oracle@ocne-control01 ~]$ ip -br a
lo               UNKNOWN        127.0.0.1/8 ::1/128
ens3             UP             10.0.0.150/24 fe80::200:17ff:fe00:215e/64
ens5             UP             10.0.12.11/24 10.0.12.111/32 fe80::41f6:2a0d:9a89:13d0/64
flannel.1        UNKNOWN        10.244.0.0/32 fe80::78ad:2dff:fef1:259b/64
cni0             UP             10.244.0.1/24 fe80::104e:f2ff:fece:343b/64
veth9f7ab71b@if2 UP             fe80::545c:e3ff:fee9:2809/64
[oracle@ocne-control01 ~]$
This control plane node has the virtual IP address assigned as a secondary IP on the ens5 device.

Example Output:
[oracle@ocne-control02 ~]$ ip -br a
lo               UNKNOWN        127.0.0.1/8 ::1/128
ens3             UP             10.0.0.151/24 fe80::17ff:fe00:e0b9/64
ens5             UP             10.0.12.12/24 fe80::feab:426b:206e:8870/64
flannel.1        UNKNOWN        10.244.1.0/32 fe80::68ba:77ff:fedc:f56f/64
cni0             UP             10.244.1.1/24 fe80::a449:7fff:feb2:9738/64
veth3729db25@if2 UP             fe80::b07d:f8ff:feb4:4830/64
vethd7048746@if2 UP             fe80::ec1b:49ff:fe90:838a/64
This control plane node does not.
Important: Take note of which control plane node currently holds the virtual IP address.
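(Optional) (On ocne-operator) If you prefer to check all three nodes in one pass, the operator node can query each control plane node over SSH. This sketch assumes the lab's passwordless SSH between instances and that the hostnames below resolve from ocne-operator; a count of 1 marks the node holding the virtual IP.

# Report which control plane node holds the virtual IP 10.0.12.111
for node in ocne-control01 ocne-control02 ocne-control03; do
  echo -n "$node: "
  ssh "$node" "ip -br a | grep -c 10.0.12.111"
done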
(Optional) (On all control plane nodes) Confirm which control plane node holds the virtual IP address by querying the keepalived.service logs.

journalctl -u keepalived
Example Output:
...
Aug 10 23:47:26 ocne-control01 Keepalived_vrrp[55605]: (VI_1) Entering MASTER STATE
...
This control plane node has the virtual IP address assigned.
Example Output:
...
Aug 10 23:54:59 ocne-control02 Keepalived_vrrp[59961]: (VI_1) Entering BACKUP STATE (init)
...
This control plane node does not.
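To see just the state transitions rather than the full journal, you can filter the log output. A small sketch:

# Show only the keepalived VRRP state changes (MASTER/BACKUP) on this node
journalctl -u keepalived | grep -i 'STATE'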
Force the keepalived Daemon to Move to a Different Control Plane Node
If using the free lab environment, double-click the Luna Lab icon on the desktop and navigate to the Luna Lab tab. Then click on the OCI Console hyperlink. Sign on using the User Name and Password values provided. After logging on, proceed with the following steps:
Click on the hamburger menu (top-left), then Compute and Instances.
This displays the Instances page.
Select the Compartment used from the drop-down list, as instructed in Oracle Linux Lab Basics.
Click on the Instance name previously identified (for example, ocne-control01).
This shows the details for the Instance.
Click on the Stop button.
In the pop-up dialog, select the 'Force stop the instance by immediately powering off' check-box, and click the 'Force stop instance' button.
Note: Do NOT do this on a production system, as it may cause data loss, corruption, or worse to the entire system.
Important: Once a control plane node has been shut down, it is no longer possible to use the Terminal session associated with that node.
Wait until the Instance details page confirms the instance is 'Stopped'.
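As an aside, the same force stop can be issued from the OCI CLI if it is installed and configured in your environment. This is only a hedged sketch; the instance OCID is a placeholder you would need to look up for the node you identified.

# Force power-off the instance (equivalent to the Console's 'Force stop instance' action)
oci compute instance action --instance-id <instance_ocid> --action STOP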
Switch from the browser, back to the Terminal.
(On one of the surviving control plane nodes) Execute the following command to confirm that the control plane node just shut down is reporting as NotReady.
Note: It may be necessary to repeat this step several times until the status changes.
kubectl get nodes
Example Output:
[oracle@ocne-control03 ~]$ kubectl get nodes
ocne-control01   NotReady   control-plane,master   3h26m   v1.23.7+1.el8
ocne-control02   Ready      control-plane,master   3h12m   v1.23.7+1.el8
ocne-control03   Ready      control-plane,master   3h9m    v1.23.7+1.el8
ocne-worker01    Ready      <none>                 3h9m    v1.23.7+1.el8
ocne-worker02    Ready      <none>                 3h9m    v1.23.7+1.el8
[oracle@ocne-control03 ~]$
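Rather than re-running the command by hand until the status changes, the check can be left running. A sketch using the standard watch utility:

# Re-run 'kubectl get nodes' every 2 seconds; press Ctrl+C to stop
watch kubectl get nodes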
(On each of the surviving control plane nodes) Determine which node has the running keepalived daemon.

ip -br a
Example Output:
[oracle@ocne-control03 ~]$ ip -br a
lo               UNKNOWN        127.0.0.1/8 ::1/128
ens3             UP             10.0.0.152/24 fe80::200:17ff:fe00:6af4/64
ens5             UP             10.0.12.13/24 10.0.12.111/32 fe80::7c90:5770:5da:a00/64
flannel.1        UNKNOWN        10.244.2.0/32 fe80::70a8:6ff:fefd:16f7/64
[oracle@ocne-control03 ~]$
The virtual IP is now associated with ocne-control03.
Switch back to the browser.
In the Cloud Console, start the Instance previously shut down by clicking on the Start button.
Wait until the Status section confirms the Instance is 'Running'.
Switch back to the Terminal.
Reconnect to the control plane node that was previously shut down.
(On any of the control plane nodes) Confirm that kubectl shows the restarted control plane node as being Ready.

kubectl get nodes
Note: It may be necessary to repeat this step several times until the status changes.
Example Output:
[oracle@ocne-control03 ~]$ kubectl get nodes
NAME             STATUS   ROLES                  AGE     VERSION
ocne-control01   Ready    control-plane,master   3h28m   v1.23.7+1.el8
ocne-control02   Ready    control-plane,master   3h26m   v1.23.7+1.el8
ocne-control03   Ready    control-plane,master   3h24m   v1.23.7+1.el8
ocne-worker01    Ready    <none>                 3h24m   v1.23.7+1.el8
ocne-worker02    Ready    <none>                 3h23m   v1.23.7+1.el8
[oracle@ocne-control03 ~]$
(On all control plane nodes) Confirm the location of the active keepalived daemon.

Note: The keepalived daemon remained with the currently active host, despite the original host restarting. This occurs due to how Oracle Cloud Native Environment weights the nodes in the keepalived configuration.

ip -br a
Example Output:
[oracle@ocne-control03 ~]$ ip -br a
lo               UNKNOWN        127.0.0.1/8 ::1/128
ens3             UP             10.0.0.152/24 fe80::200:17ff:fe00:6af4/64
ens5             UP             10.0.12.13/24 10.0.12.111/32 fe80::7c90:5770:5da:a00/64
flannel.1        UNKNOWN        10.244.2.0/32 fe80::70a8:6ff:fefd:16f7/64
Example Output:
[oracle@ocne-control01 ~]$ ip -br a
lo               UNKNOWN        127.0.0.1/8 ::1/128
ens3             UP             10.0.0.150/24 fe80::200:17ff:fe00:215e/64
ens5             UP             10.0.12.11/24 fe80::41f6:2a0d:9a89:13d0/64
flannel.1        UNKNOWN        10.244.0.0/32 fe80::e4a0:25ff:fea3:b636/64
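(Optional) (On each control plane node) If you want to see why the virtual IP stays where it is, the VRRP priorities can be compared across the nodes. This is a sketch, assuming the generated configuration lives in the default /etc/keepalived/keepalived.conf location; the node with the highest priority wins the MASTER role.

# Display the VRRP priority configured for this node
sudo grep -i 'priority' /etc/keepalived/keepalived.conf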