Monitor system resources on Oracle Linux

Introduction

In this tutorial, you work with the Oracle Linux vmstat, mpstat, and top commands to monitor system resource usage. Monitoring the usage of system resources is useful for detecting issues that can adversely affect system performance.

Background

Oracle Linux provides tools for monitoring and analyzing system resource usage, as well as tracing tools for diagnosing performance issues in multiple processes and related threads.

Objectives

Explore the vmstat command
Explore the mpstat command
Explore the top command
Use vmstat, mpstat, and top
Examine command output from vmstat, mpstat, and top

What Do You Need?

A client system with Oracle Linux installed.

Note: When using the free lab environment, see Oracle Linux Lab Basics for connection and other usage instructions.

Explore and Use `vmstat` Command Options

vmstat shows how much virtual memory there is and how much is free. The command also shows paging activity. You can observe page-ins and page-outs as they occur on the system.

Run the vmstat command without any options.
```
vmstat
```
The command generates a single report. The output is broken into six sections: procs, memory, swap, io, system, and cpu.
- The first two columns give information about processes:
  - r is the number of runnable processes, that is, processes that are running or waiting for CPU time.
  - b is the number of processes in uninterruptible sleep, typically blocked waiting for I/O to complete.
- The next four columns give information about memory:
  - swpd is the amount of used virtual memory.
  - free is the amount of idle memory.
  - buff is the amount of memory used as buffers.
  - cache is the amount of memory used as cache.
- The next two columns give information about swap:
  - si is the amount of memory swapped in from disk (per second).
  - so is the amount of memory swapped out to disk (per second).
    Nonzero si and so numbers indicate that there is not enough physical memory, which causes the kernel to swap memory to disk.
- The next two columns report Input/Output:
  - bi is the number of blocks per second received from a block device.
  - bo is the number of blocks per second sent to a block device.
- The next two columns give the following system information:
  - in is the number of interrupts per second, including the clock.
  - cs is the number of context switches per second.
- The last five columns give the percentages of total CPU time:
  - us is the percentage of CPU cycles spent on user processes.
  - sy is the percentage of CPU cycles spent on system (kernel) processes.
  - id is the percentage of CPU cycles spent idle.
  - wa is the percentage of CPU cycles spent waiting for I/O.
  - st is the percentage of CPU cycles stolen from a virtual machine.
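Because the columns are identified by their header names, a small helper can pull a single column out of the report for closer inspection. The following is a minimal sketch, not part of the lab itself; the function name vmstat_col is a hypothetical helper, and it assumes the default two-line vmstat header layout described above.

```shell
# vmstat_col NAME: read vmstat output on stdin and print the values of one
# column, looked up by its header name (for example: r, free, si, us, wa).
# vmstat_col is a hypothetical helper name, not a standard command.
vmstat_col() {
  awk -v col="$1" '
    # Until the column is located, scan each line for the header name.
    idx == 0 {
      for (i = 1; i <= NF; i++) if ($i == col) idx = i
      next
    }
    # Each row that starts with a number is one report.
    $1 ~ /^[0-9]+$/ { print $idx }
  '
}

# Example usage: vmstat 1 5 | vmstat_col wa
```

If the requested name does not appear in the header, the helper prints nothing.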
Run the command vmstat 1 (the numeral one, not the lowercase letter l) to view a continuous report that updates every second.
```
vmstat 1
```
- To terminate, press Ctrl+C.
Run the command vmstat 7 4 to run four reports seven seconds apart.
```
vmstat 7 4
```
The delay option (7) sets the interval in seconds between reports, and the count option (4) sets the number of reports to produce.
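The first report that vmstat prints contains averages since the last boot, so it is often excluded when summarizing an interval run. The sketch below averages one column over the remaining reports. The helper name vmstat_avg is hypothetical, and the code assumes the default vmstat header layout.

```shell
# vmstat_avg NAME: average one vmstat column over the interval reports,
# skipping the first report because it holds averages since the last boot.
# vmstat_avg is a hypothetical helper name, not a standard command.
vmstat_avg() {
  awk -v col="$1" '
    # Locate the requested column in the header row.
    idx == 0 { for (i = 1; i <= NF; i++) if ($i == col) idx = i; next }
    # Count numeric rows and accumulate every row after the first.
    $1 ~ /^[0-9]+$/ {
      if (++rows > 1) { sum += $idx; n++ }
    }
    END { if (n > 0) printf "%.1f\n", sum / n; else print "n/a" }
  '
}

# Example usage: vmstat 7 4 | vmstat_avg id
```

With a count of 4, the average covers the last three reports.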
Run the command vmstat -s to display a table of various event counters and memory statistics.
```
vmstat -s
```
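Each line of the vmstat -s table is a number followed by a label, which makes individual counters easy to pick out in a script. The following sketch assumes that layout (an optional K unit between the number and the label); vmstat_counter is a hypothetical helper name.

```shell
# vmstat_counter LABEL: read "vmstat -s" output on stdin and print the value
# whose label matches LABEL exactly (for example "free memory" or "forks").
# vmstat_counter is a hypothetical helper name, not a standard command.
vmstat_counter() {
  awk -v label="$1" '
    {
      value = $1
      # Drop the leading number and the optional K unit, keeping the label.
      sub(/^[ \t]*[0-9]+[ \t]+(K[ \t]+)?/, "")
      if ($0 == label) print value
    }
  '
}

# Example usage: vmstat -s | vmstat_counter "free memory"
```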
Run the command vmstat -a to display active and inactive memory.
```
vmstat -a
```
The -a option displays the amount of active and inactive (inact) memory in the memory section of the output.
Run the command vmstat -f to display the number of forks since the last boot.
```
vmstat -f
```
Run the command vmstat -t to add a timestamp to the output.
```
vmstat -t
```
Run the command vmstat -d to display disk usage statistics on the system.
```
vmstat -d
```
Run the command vmstat -p sda1 to create a report on a specific disk partition.
```
vmstat -p sda1
```
The output shows a summary for the partition, including the number of reads and writes.

Explore and Use `mpstat` Command Options

The mpstat command is used for collecting and displaying performance statistics for all logical CPUs in the system. When a CPU is occupied by a process, it is unavailable for processing other requests. These other processes must wait until the CPU is free.

Run the command mpstat without any options.
```
mpstat
```
The first line displays the Linux kernel version, host name, current date, architecture, and number of CPUs on your system.
The first column in the next line provides a timestamp, with the remaining columns defined as follows:
- CPU is the processor designated by the number starting at 0 or the keyword all indicating that statistics are calculated as averages among all processors.
- %user is the percentage of CPU used while executing applications at the user level.
- %nice is the percentage of CPU used while executing at the user level with nice priority.
- %sys is the percentage of CPU used while executing at the system (kernel) level.
  This value does not include time spent servicing hardware and software interrupts.
- %iowait is the percentage of time the CPUs were idle while the system had an outstanding disk I/O request.
- %irq is the percentage of time spent by the CPUs to service hardware interrupts.
- %soft is the percentage of time spent by the CPUs to service software interrupts.
- %steal is the percentage of time spent in involuntary wait by the virtual CPU or CPUs while the hypervisor was servicing another virtual processor.
- %guest is the percentage of time spent by the CPUs to run a virtual processor.
- %gnice is the percentage of time spent by the CPUs to run a niced guest.
- %idle is the percentage of time that the CPU was (or the CPUs were) idle, and the system did not have an outstanding disk I/O request.
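Since %idle is the last column of the utilization report, overall utilization can be approximated as 100 minus %idle. The sketch below applies that to each data row. The helper name mpstat_busy is hypothetical, and the code assumes the CPU utilization table layout described above, with %idle as the final column.

```shell
# mpstat_busy: read mpstat output on stdin and print "CPU busy%" for each
# data row, where busy% is taken as 100 - %idle (the last column).
# mpstat_busy is a hypothetical helper name, not a standard command.
mpstat_busy() {
  awk '
    {
      # A header row contains the literal column name CPU.
      hdr = 0
      for (i = 1; i <= NF; i++) if ($i == "CPU") hdr = i
    }
    # Track the CPU column only for tables that end with %idle.
    hdr { cpu = ($NF == "%idle") ? hdr : 0; next }
    # Print interval rows; the Average rows are skipped.
    cpu && NF > cpu && $1 != "Average:" { printf "%s %.2f\n", $cpu, 100 - $NF }
  '
}

# Example usage: mpstat 2 5 | mpstat_busy
```

Looking up the CPU column by name keeps the helper working whether the timestamp is printed in 12-hour or 24-hour format.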
Run the command mpstat 2 5 to view CPU activity every two seconds for a total of five reports on all CPUs.
```
mpstat 2 5
```
The output prints one line of information a total of five times, and also prints an average.
Run the command mpstat -P ALL to report CPU usage on each CPU.
```
mpstat -P ALL
```
The output shows one line for all CPUs combined (all), followed by one line for each CPU. Because no interval is specified, the values are averages since system startup.
Run the command mpstat -P ALL 2 5 to view CPU activity every two seconds on the CPUs.
```
mpstat -P ALL 2 5
```
This output prints CPU utilization statistics for each CPU five times at an interval of two seconds. Note that mpstat also prints the average CPU utilization for the specified period.
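The per-CPU Average rows at the end of an interval run are a convenient place to spot individual CPUs that stayed busy. The following sketch lists the CPUs whose average utilization (100 minus %idle) meets a threshold. The helper name mpstat_hot and the default threshold of 90 are illustrative assumptions, and the code expects mpstat -P ALL output with %idle as the last column.

```shell
# mpstat_hot [THRESHOLD]: from "mpstat -P ALL <delay> <count>" output, print
# the CPUs whose average utilization (100 - %idle) is at or above THRESHOLD.
# mpstat_hot and the default of 90 are hypothetical choices for illustration.
mpstat_hot() {
  awk -v limit="${1:-90}" '
    # Average rows for individual CPUs have a numeric CPU id in column 2.
    $1 == "Average:" && $2 ~ /^[0-9]+$/ {
      busy = 100 - $NF
      if (busy >= limit) printf "cpu%s %.2f\n", $2, busy
    }
  '
}

# Example usage: mpstat -P ALL 2 5 | mpstat_hot 90
```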
Run the command mpstat -A to print CPU utilization and interrupt statistics together in the same command output.
```
mpstat -A
```

Explore the `top` Command

The top command provides an ongoing look at processor activity in real time. top displays a list of the most CPU-intensive processes or tasks on the system and provides a limited interactive interface for manipulating processes.

Run the top command without any options for a real-time view of CPU activity.
```
top
```
The output refreshes every three seconds by default and is divided into two main sections.
The upper section displays general information such as the load averages during the last 1, 5, and 15 minutes, the number of running and sleeping tasks, and overall CPU and memory usage.
- Use the following to toggle the output displayed in the upper section on or off:
  - Enter the lowercase letter l (not the numeral 1) to toggle load average and uptime off and on.
  - Enter m to toggle memory and swap usage off and on.
  - Enter t to cycle through the four CPU display modes.
The lower section displays a sorted list of processes, usually by CPU usage, along with their process ID numbers (PIDs), and the user who owns the process. In addition, the output displays the running time and memory that the processes use.
The following describes the columns in the lower section:
- PID is the task's unique process ID.
- USER is the effective username of the task's owner.
- PR is the actual priority of the task.
- NI is the nice value of the task in the user-space.
  A negative value means a higher priority, and a positive value means a lower priority. A value of zero means that the priority is not adjusted when determining when the task runs.
- VIRT is the total amount of virtual memory used by the task.
  This value includes all code, data, and shared libraries, plus pages that have been swapped out.
- RES is the non-swapped physical memory or resident size a task is using.
- SHR is the amount of shared memory the task is using.
  This memory could potentially be shared with other processes.
- S is the status of the task. There are five states:
  - D Uninterruptible sleep
  - R Running
  - S Sleeping
  - T Traced or stopped
  - Z Zombie
- %CPU is the task's share of the elapsed CPU time or CPU usage since the last screen update, expressed as a percentage of total CPU time.
- %MEM is the task's currently used share of available physical memory or memory usage.
- TIME+ is the total CPU time that the task has used since it started.
- COMMAND is the command-line or program name used to start a task.
Exit top by pressing q or Ctrl+C.
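For scripted use, top can also run non-interactively with the -b (batch mode) and -n (number of iterations) options. As a minimal sketch, the helper below counts tasks by the state shown in the S column of a single batch snapshot. The name top_states is hypothetical, and the code assumes the default column layout described above.

```shell
# top_states: read "top -b -n 1" output on stdin and count tasks per state,
# using the S column of the process list.
# top_states is a hypothetical helper name, not a standard command.
top_states() {
  awk '
    # The process list header starts with PID; find the S column in it.
    $1 == "PID" { for (i = 1; i <= NF; i++) if ($i == "S") s = i; next }
    # Every row after the header is one task.
    s && NF >= s { count[$s]++ }
    END { for (state in count) print state, count[state] }
  ' | sort
}

# Example usage: top -b -n 1 | top_states
```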

Compare Command Output from `vmstat`, `mpstat`, and `top`

For this practice, you run various stress tests to emulate different loads on your lab system, and then use the vmstat, mpstat, and top commands to determine what they report for the different types of load.

Even though you run the tests as background processes, consider opening additional terminal windows for your comparisons.

Each additional window requires logging in to your lab instance using ssh to oracle@<IP_ADDRESS_OF_COMPUTE_INSTANCE>, where <IP_ADDRESS_OF_COMPUTE_INSTANCE> is the IP address copied from the Oracle Cloud Console.

After logging in, use the sudo -i command to switch to the root user.

Install the stress tool.
Use the dnf repolist command to verify the status of the ol8_developer_EPEL repository.
```
dnf repolist ol8_developer_EPEL
```
  - If the status is enabled, proceed to installing the stress tool packages; otherwise, enable the repository.
  - If the status is disabled, enable the repository by using the dnf config-manager command.
```
dnf config-manager --enable ol8_developer_EPEL
```
Run the dnf install command to install the stress tool packages.
```
dnf install stress -y
```
Run the command stress --dry-run to view an example of the command syntax. Take a moment and review the options in the example, along with the other command options listed.
```
stress --dry-run
```
Run the vmstat 4 4 and mpstat 4 4 commands to view a baseline of system activity. Specifically, notice the CPU percentages allocated to user and system (kernel) processes. These columns should be close to zero. Also note the amount of free memory.
```
vmstat 4 4
mpstat 4 4
```
Run the command stress --cpu 8 in the background to emulate a compute-bound program. This command spawns eight compute-bound processes.
```
stress --cpu 8 &
```
- Press Enter to return to the prompt.
- Run the ps -ef command to view the running stress processes. Pipe the output to grep stress.
```
ps -ef | grep stress
```
Run the vmstat 4 10 command and note the values reported in the columns associated with CPU utilization.
```
vmstat 4 10
```
In a few moments, you should see the percentages under the us column reflect the load generated by stress.
Run the command mpstat 4 10 to evaluate utilization of all the CPUs.
```
mpstat 4 10
```
Run the command mpstat -P ALL 4 10 to evaluate utilization of each CPU.
```
mpstat -P ALL 4 10
```
Run the top command to examine CPU utilization by the stress processes.
```
top
```
After a few moments, notice that the stress processes appear at the top of the list with the highest CPU usage. Exit top by pressing Ctrl+C.
Run the pkill command to terminate the stress processes.
```
pkill stress
```
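To confirm that the workers are gone before taking the next baseline, one option is to check with pgrep, which matches on the process name and does not list itself the way a ps and grep pipeline can. The sketch below is one possible check; the function name stress_status is hypothetical.

```shell
# stress_status: report whether any process named exactly "stress" is running.
# stress_status is a hypothetical helper name, not a standard command.
stress_status() {
  if pgrep -x stress >/dev/null 2>&1; then
    echo "stress is running: $(pgrep -x stress | wc -l) process(es)"
  else
    echo "no stress processes found"
  fi
}

stress_status
```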
Rerun the vmstat 4 4 and mpstat 4 4 commands to confirm that system activity has returned to the baseline.
Run a new stress command and add the --vm and --vm-bytes options to spawn memory activity on the system.
```
stress --cpu 8 --vm 8 --vm-bytes 512M &
```
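The --vm-bytes value applies to each --vm worker, so eight workers at 512M can request up to 8 x 512 MiB = 4096 MiB in total. If that exceeds the memory available on the lab system, expect nonzero si and so values in vmstat. The following sketch does the arithmetic and compares it with MemAvailable from /proc/meminfo. The function name vm_request_mib is hypothetical, and the comparison is only a rough estimate because the workers allocate and free memory repeatedly.

```shell
# vm_request_mib WORKERS MIB_PER_WORKER: upper bound, in MiB, on the memory
# that the --vm workers allocate in total.
# vm_request_mib is a hypothetical helper name, not a standard command.
vm_request_mib() {
  echo $(( $1 * $2 ))
}

requested=$(vm_request_mib 8 512)   # matches --vm 8 --vm-bytes 512M

# MemAvailable in /proc/meminfo is reported in kB; convert it to MiB.
available=$(awk '/^MemAvailable:/ { print int($2 / 1024) }' /proc/meminfo 2>/dev/null)

echo "requested by --vm workers: ${requested} MiB"
echo "currently available:       ${available:-unknown} MiB"
```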
Run vmstat 4 10, mpstat -P ALL 4 10, and top to evaluate the CPU and memory activity.
```
vmstat 4 10
mpstat -P ALL 4 10
top
```
- Exit top by pressing Ctrl+C.
Use the pkill command to terminate the stress processes.
```
pkill stress 
```
Rerun the vmstat 4 4 and mpstat 4 4 commands to confirm that system activity has returned to the baseline.
Run a new stress command and add the --io option to spawn input/output activity on the system.
```
stress --cpu 8 --vm 8 --vm-bytes 512M --io 8 &
```
Run vmstat 4 10, mpstat -P ALL 4 10, and top to evaluate the CPU, memory, and input/output activity.
```
vmstat 4 10
mpstat -P ALL 4 10
top
```
- Exit top by pressing Ctrl+C.
- Use pkill to terminate stress.

Video Demonstrations

Check out the following videos for further knowledge on these tools.

For More Information

See other related resources: