Migrate Oracle Linux Automation Manager to a Clustered Deployment


Introduction

Whether upgrading from a previous release or starting with a single-host installation, both environments can migrate to a clustered deployment. Administrators need to plan their topology before migrating, as the cluster may consist of a combination of control plane and execution plane nodes and a remote database.

The following tutorial provides instructions for migrating a single-host installation of Oracle Linux Automation Manager to a clustered deployment with a remote database.

Objectives

In this lab, you'll learn how to:

  • Set up a remote database
  • Migrate to a clustered deployment

Prerequisites

Verify the Oracle Linux Automation Manager Installation

Note: When using the free lab environment, see Oracle Linux Lab Basics for connection and other usage instructions.

Information: The free lab environment deploys a running Oracle Linux Automation Manager installation. The deployment takes approximately 15-20 minutes to finish after launch, so you might want to step away while it runs and then return to complete the lab.

  1. Open a terminal and configure an SSH tunnel to the deployed Oracle Linux Automation Manager instance.

    In the free lab environment, Oracle Linux Automation Manager deploys to the control-node VM.

    ssh -L 8444:localhost:443 oracle@<hostname or ip address>
  2. Open a web browser and enter the URL.

    https://localhost:8444

    Note: Approve the security warning based on the browser used. For Chrome, click the Advanced button and then the Proceed to localhost (unsafe) link.

  3. Log in to Oracle Linux Automation Manager with the Username admin and the Password admin created during the automated deployment.


  4. After login, the WebUI displays.


Migrate to a Cluster Deployment

While Oracle Linux Automation Manager runs as a single-host deployment, it also supports running as a cluster with a remote database and separate control plane and execution plane nodes. The installation configures the single-host instance as a hybrid node. The first step in migrating to a cluster deployment is converting this instance to a control plane node.

For more information on different installation topologies, see the Planning the Installation chapter of the Oracle Linux Automation Manager Installation Guide.

Prepare the Control Plane Node

  1. Switch to the terminal connected to the control-node instance running Oracle Linux Automation Manager.

    Note: From now on, we'll refer to this instance as the control plane node.

  2. Stop the Oracle Linux Automation Manager service.

    sudo systemctl stop ol-automation-manager
  3. Create a backup of the database.

    sudo su - postgres -c 'pg_dumpall > /tmp/olamv2_db_dump'

Install the Remote Database

  1. Copy the database backup from the control plane node to the new remote database host.

    scp /tmp/olamv2_db_dump oracle@10.0.0.160:/tmp/

    The address 10.0.0.160 is the internal IP address of the remote database host defined in the free lab environment. This connection works because the free lab environment configures passwordless SSH logins between the instances.
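
    Since passwordless SSH is already in place, you can optionally confirm the dump file arrived intact before moving on:

    ssh oracle@10.0.0.160 'ls -lh /tmp/olamv2_db_dump'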

  2. Open a new terminal and connect via ssh to the remote-db instance.

    Use the external IP address referenced on the Luna Lab Resources page. Direct connections using the internal IP address 10.0.0.160 are not possible from the Luna Desktop.

    ssh oracle@<hostname or ip address>
  3. Enable the database module stream.

    Oracle Linux Automation Manager supports PostgreSQL database version 12 or 13. We'll enable the version 13 module stream in this lab environment.

    sudo dnf -y module reset postgresql
    sudo dnf -y module enable postgresql:13
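
    Optionally, confirm which stream is enabled before installing; the enabled stream is marked with an [e] flag in the output:

    sudo dnf module list postgresql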
    
  4. Install the database server.

    sudo dnf -y install postgresql-server
  5. Add the database firewall rule.

    sudo firewall-cmd --add-port=5432/tcp --permanent
    sudo firewall-cmd --reload
    
  6. Initialize the database.

    sudo postgresql-setup --initdb
  7. Set the database default storage algorithm.

    sudo sed -i "s/#password_encryption.*/password_encryption = scram-sha-256/"  /var/lib/pgsql/data/postgresql.conf

    For more details about this database functionality, see Password Authentication in the upstream PostgreSQL documentation.
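
    To verify the change, a quick check of the setting (it should now read password_encryption = scram-sha-256):

    sudo grep password_encryption /var/lib/pgsql/data/postgresql.conf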

  8. Update the database host-based authentication file.

    echo "host  all  all 0.0.0.0/0 scram-sha-256" | sudo tee -a /var/lib/pgsql/data/pg_hba.conf > /dev/null

    This additional rule allows connections from any IP address and uses SCRAM-SHA-256 authentication to verify the user's password.
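
    Outside the free lab environment, you might prefer limiting connections to the control plane's subnet rather than allowing any address; a narrower rule, assuming an internal network of 10.0.0.0/24, would look like this:

    host  all  all 10.0.0.0/24 scram-sha-256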

  9. Update the IP address on which the database listens.

    sudo sed -i "/^#port = 5432/i listen_addresses = '"$(hostname -i)"'" /var/lib/pgsql/data/postgresql.conf
  10. Start and enable the database service.

    sudo systemctl enable --now postgresql
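
    Optionally, confirm the database is listening on the host's internal IP address and port 5432:

    sudo ss -tlnp | grep 5432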
  11. Import the database dump file.

    sudo su - postgres -c 'psql -d postgres -f /tmp/olamv2_db_dump'
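
    Optionally, verify the import by listing the databases; the awx database should appear in the output:

    sudo su - postgres -c 'psql -l'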
  12. Set the password for the Oracle Linux Automation Manager database user account.

    sudo su - postgres -c "psql -U postgres -d postgres -c \"alter user awx with password 'password';\""

    This command sets the awx password to password. Choose a more secure password if running this command outside the free lab environment.

  13. Close the terminal window connected to remote-db, as that completes the necessary steps to set up the remote database.

Add the Remote Database Settings

  1. Switch back to the control plane node terminal running on the control-node instance and reconnect if necessary.

  2. Add the remote database settings to a new custom configuration file.

    cat << EOF | sudo tee /etc/tower/conf.d/db.py > /dev/null
    DATABASES = {
        'default': {
            'ATOMIC_REQUESTS': True,
            'ENGINE': 'awx.main.db.profiled_pg',
            'NAME': 'awx',
            'USER': 'awx',
            'PASSWORD': 'password',
            'HOST': '10.0.0.160',
            'PORT': '5432',
        }
    }
    EOF

    Use the same password set previously for the awx database user account.

  3. Stop and disable the local database on the control plane node.

    sudo systemctl stop postgresql
    sudo systemctl disable postgresql
    
  4. Start Oracle Linux Automation Manager.

    sudo systemctl start ol-automation-manager
  5. Verify connection to the new remote database.

    sudo su -l awx -s /bin/bash -c "awx-manage check_db"

    The output returns the remote database version details if a connection is successful.
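
    If the check fails, a quick way to rule out basic network or firewall issues is testing connectivity to the database port from the control plane node (the nc utility comes from the nmap-ncat package if not already installed):

    nc -zv 10.0.0.160 5432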

Remove the Local Database Instance

Removing the original local database is safe after confirming the connection to the remote database is working.

  1. Remove the database packages.

    sudo dnf -y remove postgresql
  2. Remove the pgsql directory containing the old database data files.

    sudo rm -rf /var/lib/pgsql

Change the Node Type of the Control Plane Node

When converting to a clustered deployment, switch the single-host instance node_type from hybrid to control.

  1. Confirm the current node type of the control plane node.

    sudo su -l awx -s /bin/bash -c "awx-manage list_instances"

    The output shows the node_type set to a value of hybrid.

  2. Remove the default instance group.

    sudo su -l awx -s /bin/bash -c "awx-manage remove_from_queue --queuename default --hostname $(hostname -i)"
  3. Define the new instance and queue.

    sudo su -l awx -s /bin/bash -c "awx-manage provision_instance --hostname=$(hostname -i) --node_type=control"
    sudo su -l awx -s /bin/bash -c "awx-manage register_queue --queuename=controlplane --hostnames=$(hostname -i)"
    
  4. Add the default queue name values in the custom settings file.

    cat << EOF | sudo tee -a /etc/tower/conf.d/olam.py > /dev/null
    DEFAULT_EXECUTION_QUEUE_NAME = 'execution'
    DEFAULT_CONTROL_PLANE_QUEUE_NAME = 'controlplane'
    EOF
  5. Update Receptor settings.

    cat << EOF | sudo tee /etc/receptor/receptor.conf > /dev/null
    ---
    - node:
        id: $(hostname -i)
    
    - log-level: info
    
    - tcp-listener:
        port: 27199
    
    - control-service:
        service: control
        filename: /var/run/receptor/receptor.sock
    
    - work-command:
        worktype: local
        command: /var/lib/ol-automation-manager/venv/awx/bin/ansible-runner
        params: worker
        allowruntimeparams: true
        verifysignature: false
    EOF
  6. Restart Oracle Linux Automation Manager.

    sudo systemctl restart ol-automation-manager
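
    To confirm the conversion, rerun the instance listing; the host should now report a node_type of control and appear under the controlplane group:

    sudo su -l awx -s /bin/bash -c "awx-manage list_instances"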

The conversion of the single-host hybrid node to a control plane node with a remote database is complete. Now we'll add an execution plane node to make this cluster fully functional.

Add an Execution Plane Node to the Cluster

Before the cluster is fully functional, add one or more execution nodes. Execution nodes run standard jobs using ansible-runner, which runs playbooks within an OLAM EE Podman container-based execution environment.

Prepare the Execution Plane Node

  1. Open a new terminal and connect via ssh to the execution-node instance.

    ssh oracle@<hostname or ip address>
  2. Install the Oracle Linux Automation Manager repository package.

    sudo dnf -y install oraclelinux-automation-manager-release-el8
  3. Disable the repository for the older release.

    sudo dnf config-manager --disable ol8_automation
  4. Enable the current release's repository.

    sudo dnf config-manager --enable ol8_automation2
  5. Install the Oracle Linux Automation Manager package.

    sudo dnf -y install ol-automation-manager
  6. Add the Receptor firewall rule.

    sudo firewall-cmd --add-port=27199/tcp --permanent
    sudo firewall-cmd --reload
    
  7. Edit the Redis socket configuration.

    sudo sed -i '/^# unixsocketperm/a unixsocket /var/run/redis/redis.sock\nunixsocketperm 775' /etc/redis.conf
  8. Copy the secret key from the control plane node.

    ssh oracle@10.0.0.150 "sudo cat /etc/tower/SECRET_KEY" | sudo tee /etc/tower/SECRET_KEY > /dev/null

    Important: Every cluster node requires the same secret key.

  9. Create a custom settings file containing the required settings.

    cat << EOF | sudo tee /etc/tower/conf.d/olamv2.py > /dev/null
    CLUSTER_HOST_ID = '$(hostname -i)'
    DEFAULT_EXECUTION_QUEUE_NAME = 'execution'
    DEFAULT_CONTROL_PLANE_QUEUE_NAME = 'controlplane'
    EOF

    The CLUSTER_HOST_ID is a unique identifier of the host within the cluster.

  10. Create a custom settings file containing the remote database configuration.

    cat << EOF | sudo tee /etc/tower/conf.d/db.py > /dev/null
    DATABASES = {
        'default': {
            'ATOMIC_REQUESTS': True,
            'ENGINE': 'awx.main.db.profiled_pg',
            'NAME': 'awx',
            'USER': 'awx',
            'PASSWORD': 'password',
            'HOST': '10.0.0.160',
            'PORT': '5432',
        }
    }
    EOF
  11. Deploy the ansible-runner execution environment.

    1. Open a shell as the awx user.

      sudo su -l awx -s /bin/bash
    2. Migrate any existing containers to the latest podman version while keeping the unprivileged namespaces alive.

      podman system migrate
    3. Pull the Oracle Linux Automation Engine execution environment for Oracle Linux Automation Manager.

      podman pull container-registry.oracle.com/oracle_linux_automation_manager/olam-ee:latest
    4. Exit out of the awx user shell.

      exit
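
    Before continuing, you can optionally confirm the execution environment image is available to the awx user; the repository name should match the image pulled above:

      sudo su -l awx -s /bin/bash -c "podman images"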
  12. Generate the SSL certificates for NGINX.

    sudo openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /etc/tower/tower.key -out /etc/tower/tower.crt

    Enter the requested information, or press the ENTER key at each prompt to accept the defaults.
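
    Alternatively, you can skip the prompts by supplying a certificate subject on the command line; the common name below is only an example value:

    sudo openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /etc/tower/tower.key -out /etc/tower/tower.crt -subj "/CN=$(hostname -f)"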

  13. Replace the default NGINX configuration with the configuration below.

    cat << EOF | sudo tee /etc/nginx/nginx.conf > /dev/null
    user nginx;
    worker_processes auto;
    error_log /var/log/nginx/error.log;
    pid /run/nginx.pid;
    
    # Load dynamic modules. See /usr/share/doc/nginx/README.dynamic.
    include /usr/share/nginx/modules/*.conf;
    
    events {
        worker_connections 1024;
    }
    
    http {
        log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                          '$status $body_bytes_sent "$http_referer" '
                          '"$http_user_agent" "$http_x_forwarded_for"';
    
        access_log  /var/log/nginx/access.log  main;
    
        sendfile            on;
        tcp_nopush          on;
        tcp_nodelay         on;
        keepalive_timeout   65;
        types_hash_max_size 2048;
    
        include             /etc/nginx/mime.types;
        default_type        application/octet-stream;
    
        # Load modular configuration files from the /etc/nginx/conf.d directory.
        # See http://nginx.org/en/docs/ngx_core_module.html#include
        # for more information.
        include /etc/nginx/conf.d/*.conf;
    }
    EOF
  14. Update the Receptor configuration file.

    cat << EOF | sudo tee /etc/receptor/receptor.conf > /dev/null
    ---
    - node:
        id: $(hostname -i)
    
    - log-level: debug
    
    - tcp-listener:
        port: 27199
    
    - tcp-peer:
        address: 10.0.0.150:27199
        redial: true
    
    - control-service:
        service: control
        filename: /var/run/receptor/receptor.sock
    
    - work-command:
        worktype: ansible-runner
        command: /var/lib/ol-automation-manager/venv/awx/bin/ansible-runner
        params: worker
        allowruntimeparams: true
        verifysignature: false
    EOF
    • node:id is the hostname or IP address of the current node.
    • tcp-peer:address is the hostname or IP address and port of the Receptor mesh service on the control plane node.
  15. Start and enable the Oracle Linux Automation Manager service.

    sudo systemctl enable --now ol-automation-manager.service
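
    Optionally, confirm the service started without errors before provisioning the node from the control plane:

    sudo systemctl --no-pager status ol-automation-manager.service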

Provision the Execution Plane Node

  1. Switch to the terminal connected to the control plane node running on the control-node instance.

    The provisioning step must be run on one of the control plane nodes within the cluster and applies to all clustered instances of Oracle Linux Automation Manager.

  2. Define the execution instance and queue.

    sudo su -l awx -s /bin/bash -c "awx-manage provision_instance --hostname=10.0.0.151 --node_type=execution"
    sudo su -l awx -s /bin/bash -c "awx-manage register_default_execution_environments"
    sudo su -l awx -s /bin/bash -c "awx-manage register_queue --queuename=execution --hostnames=10.0.0.151"
    
    • register_queue takes a queuename to create/update and a list of comma-delimited hostnames where jobs run.
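
    If you later add more execution nodes, re-register the same queue with a comma-delimited host list; the second address below is a hypothetical additional node:

    sudo su -l awx -s /bin/bash -c "awx-manage register_queue --queuename=execution --hostnames=10.0.0.151,10.0.0.152"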
  3. Register the service mesh peer relationship.

    sudo su -l awx -s /bin/bash -c "awx-manage register_peers 10.0.0.151 --peers $(hostname -i)"

Verify the Execution Plane Node Registration

  1. Switch to the terminal connected to the execution node running on the execution-node instance.

  2. Verify the Oracle Linux Automation Manager mesh service is running.

    sudo systemctl status receptor-awx
  3. Check the status of the service mesh.

    sudo receptorctl --socket /var/run/receptor/receptor.sock status

    Example Output:

    [oracle@execution-node ~]$ sudo receptorctl  --socket /var/run/receptor/receptor.sock status
    Node ID: 10.0.0.151
    Version: +g
    System CPU Count: 2
    System Memory MiB: 15713
    
    Connection   Cost
    10.0.0.150   1
    
    Known Node   Known Connections
    10.0.0.150   {'10.0.0.151': 1}
    10.0.0.151   {'10.0.0.150': 1}
    
    Route        Via
    10.0.0.150   10.0.0.150
    
    Node         Service   Type       Last Seen             Tags
    10.0.0.151   control   Stream     2022-11-06 19:46:53   {'type': 'Control Service'}
    10.0.0.150   control   Stream     2022-11-06 19:46:06   {'type': 'Control Service'}
    
    Node         Work Types
    10.0.0.151   ansible-runner
    10.0.0.150   local

    For more details about Receptor, see the upstream documentation .

  4. Verify the running cluster instances and show the available capacity.

    sudo su -l awx -s /bin/bash -c "awx-manage list_instances"

    The output appears green once the cluster establishes communication across all instances. If the results appear red, wait 20-30 seconds and try rerunning the command.

    Example Output:

    [oracle@control-node ~]$ sudo su -l awx -s /bin/bash -c "awx-manage list_instances"
    [controlplane capacity=136]
    	10.0.0.150 capacity=136 node_type=control version=19.5.1 heartbeat="2022-11-08 16:24:03"
    
    [default capacity=0]
    
    [execution capacity=136]
    	10.0.0.151 capacity=136 node_type=execution version=19.5.1 heartbeat="2022-11-08 17:16:45"
    

That completes the migration of Oracle Linux Automation Manager to a clustered deployment.

(Optional) Verify the Cluster is Working

  1. Refresh the web browser window used to display the previous WebUI, or open a new web browser window and enter the URL.

    https://localhost:8444

    The port in the URL must match the local port used for the SSH tunnel.

    Note: Approve the security warning based on the browser used. For Chrome, click the Advanced button and then the Proceed to localhost (unsafe) link.

  2. Log in to Oracle Linux Automation Manager again with the Username admin and the Password admin.


  3. After login, the WebUI displays.


  4. Using the navigation menu on the left, click Inventories under the Resources section.


  5. In the main window, click the Add button and then select Add inventory.


  6. On the Create new inventory page, enter the required information.


    For Instance Groups, select the search icon to display the Select Instance Groups pop-up dialog. Click the checkbox next to the execution group and then click the Select button.

  7. Click the Save button.

  8. From the Details summary page, click the Hosts tab.


  9. From the Hosts page, click the Add button.


  10. On the Create new host page, enter the IP address or hostname of an available instance.


    In the free lab environment, we'll use the IP address 10.0.0.160, which is the internal IP address of the remote-db VM.

  11. Click the Save button.

  12. In the navigation menu on the left, click Credentials.


  13. On the Credentials page, click the Add button.


  14. On the Create New Credential page, enter the required information.


    For the Credential Type, click the drop-down menu and select Machine. That displays the credential's Type Details.

    Enter a Username of oracle and browse for the SSH Private Key. Clicking the Browse... button displays an Open File dialog window.


    Right-click on the main window of that dialog and then select Show Hidden Files.


    Then select the .ssh folder and the id_rsa file. Clicking the Open button copies the contents of the private key file into the SSH Private Key text box. Scroll down and click the Save button.

  15. In the navigation menu on the left, click Inventories.


  16. From the Inventories page, click on the Test inventory.


  17. From the Details summary page, click the Hosts tab.


  18. On the Hosts page, click the checkbox next to the 10.0.0.160 host.


    Then click the Run Command button.

  19. From the Run command dialog, select the ping module from the Modules list-of-values and click the Next button.


  20. Select the OLAM EE (latest) execution environment and click the Next button.


  21. Select the remote-db machine credential and click the Next button.


  22. A preview of the command will display.


    After reviewing the details, click the Launch button.

  23. The job will launch and display the job Output page.


    If everything ran successfully, the output shows a SUCCESS message indicating that the Oracle Linux Automation Manager execution plane node contacted the remote-db VM using the Ansible ping module.

For More Information

Oracle Linux Automation Manager Documentation
Oracle Linux Automation Manager Training
Oracle Linux Training Station
