Build a Software RAID Array on Oracle Linux
Introduction
A Redundant Array of Independent Disks or RAID device is a virtual device created from two or more real block devices. This functionality allows multiple devices (typically disk drives or partitions of a disk) to be combined into a single device to hold a single filesystem. Some RAID levels include redundancy, allowing the filesystem to survive some degree of device failure.
The Oracle Linux kernel uses the Multiple Device (MD) driver to support Linux software RAID. This driver enables you to organize disk drives into RAID devices and implement different RAID levels.
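If the md driver is already loaded, you can see which RAID levels (personalities) the running kernel currently supports by reading its status file; this quick check is optional:
cat /proc/mdstat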
For more information on these different RAID levels, see the Oracle documentation.
This tutorial works with the MD utility (mdadm) to create a RAID1 device with a spare and then address a disk failure.
Objectives
In this tutorial, you will learn how to:
- Create a RAID1 device with a spare
- Recover a failed RAID1 device
Prerequisites
Minimum of a single Oracle Linux system
Each system should have Oracle Linux installed and configured with:
- A non-root user account with sudo access
- Access to the Internet
- Two or more block devices attached to the system
Deploy Oracle Linux
Note: If running in your own tenancy, read the linux-virt-labs GitHub project README.md and complete the prerequisites before deploying the lab environment.
Open a terminal on the Luna Desktop.
Clone the linux-virt-labs GitHub project.
git clone https://github.com/oracle-devrel/linux-virt-labs.git
Change into the working directory.
cd linux-virt-labs/ol
Install the required collections.
ansible-galaxy collection install -r requirements.yml
Deploy the lab environment.
ansible-playbook create_instance.yml -e localhost_python_interpreter="/usr/bin/python3.6" -e add_block_storage=true -e block_count=3
The free lab environment requires the extra variable localhost_python_interpreter, which sets ansible_python_interpreter for plays running on localhost. This variable is needed because the environment installs the RPM package for the Oracle Cloud Infrastructure SDK for Python, located under the python3.6 modules.
The default deployment shape uses the AMD CPU and Oracle Linux 8. To use an Intel CPU or Oracle Linux 9, add -e instance_shape="VM.Standard3.Flex" or -e os_version="9" to the deployment command.
Important: Wait for the playbook to run successfully and reach the pause task. At this stage of the playbook, the installation of Oracle Linux is complete, and the instances are ready. Take note of the previous play, which prints the public and private IP addresses of the nodes it deploys and any other deployment information needed while running the lab.
Connect to the System
Open a terminal and connect via SSH to the ol-node-01 instance.
ssh oracle@<ip_address_of_instance>
Verify the block volumes exist.
sudo lsblk
The output for the free lab environment should show /dev/sda for the existing file system and the available disks /dev/sdb, /dev/sdc, and /dev/sdd.
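If the listing includes many partitions, a narrower view using standard lsblk options shows only the whole disks and their sizes:
sudo lsblk -d -o NAME,SIZE,TYPE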
Install the MD Utility
Install the MD utility.
Check if mdadm is installed.
sudo dnf list --installed mdadm
If not installed, install mdadm.
sudo dnf -y install mdadm
Create a RAID Device
RAID1 provides data redundancy and resilience by writing identical data to each drive in the array. If one drive fails, a mirror can satisfy I/O requests. Mirroring is an expensive solution because the system writes the same information to all of the disks in the array.
Features of RAID1:
- Includes redundancy
- Uses two or more disks with zero or more spare disks
- Maintains an exact mirror of the data written on each disk
- Disk devices should be of equal size
- If one disk device is larger than another, the RAID device will be the size of the smallest disk (see the size check after this list)
- Allows up to n-1 disk devices to be removed or fail while all data remains intact
- Provided the system survives a crash and spare disks are available, recovery of the RAID1 mirror happens automatically and immediately upon detection of the fault
- Slower write performance occurs compared to a single disk due to writing the same data to multiple disks in the mirror set
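As noted above, the mirror is only as large as its smallest member. A quick way to confirm the candidate disks report the same capacity (an optional check, assuming the lab's three disks) is to print each disk's size in bytes:
sudo blockdev --getsize64 /dev/sdb /dev/sdc /dev/sdd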
List the options available to create a RAID device.
Using mdadm --help shows how to use the --create option to create a new array from unused devices.
sudo mdadm --create --help
Create a RAID1 device with one spare disk.
sudo mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc --spare-devices=1 /dev/sdd
- --create: Creates the new array
- --level: The RAID level
- --raid-devices: The number of active devices in the array
- --spare-devices: The number of spare (extra) devices in the initial array
In this command, we name the device (array) /dev/md0 and use /dev/sdb and /dev/sdc to create the RAID1 device. The device /dev/sdd is automatically used as a spare to recover from any active device's failure.
Accept the Continue creating array? prompt by typing y and hitting ENTER.
Example Output:
[oracle@ol-mdadm-2022-06-04-180415 ~]$ sudo mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc --spare-devices=1 /dev/sdd
mdadm: Note: this array has metadata at the start and may not be suitable as a boot device. If you plan to store '/boot' on this device please ensure that your boot-loader understands md/v1.x metadata, or use --metadata=0.90
mdadm: size set to 52395008K
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
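The same array can be created with mdadm's short options; this equivalent form is shown only for reference and is not needed if you already ran the command above:
sudo mdadm -Cv /dev/md0 -l 1 -n 2 /dev/sdb /dev/sdc -x 1 /dev/sdd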
Create a File System
Create an ext4 filesystem on the RAID device and mount it.
sudo mkfs.ext4 -F /dev/md0
sudo mkdir /u01
sudo mount /dev/md0 /u01
Report the file system disk usage.
df -h
Example Output:
[oracle@ol-mdadm-2022-06-04-180415 ~]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
...
/dev/md0         49G   53M   47G   1% /u01
Add an entry to /etc/fstab and make the mount point persistent across reboots.
echo "/dev/md0 /data01 ext4 defaults 0 0" | sudo tee -a /etc/fstab > /dev/null
Verify RAID Device
Get details about the array.
sudo mdadm --detail /dev/md0
Example Output:
[oracle@ol-mdadm-2022-06-04-180415 ~]$ sudo mdadm --detail /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Sat Jun 4 20:08:32 2022
        Raid Level : raid1
        Array Size : 52395008 (49.97 GiB 53.65 GB)
     Used Dev Size : 52395008 (49.97 GiB 53.65 GB)
      Raid Devices : 2
     Total Devices : 3
       Persistence : Superblock is persistent

       Update Time : Sat Jun 4 20:28:58 2022
             State : clean, resyncing
    Active Devices : 2
   Working Devices : 3
    Failed Devices : 0
     Spare Devices : 1

Consistency Policy : resync

     Resync Status : 59% complete

              Name : ol-mdadm-2022-06-04-180415:0  (local to host ol-mdadm-2022-06-04-180415)
              UUID : f6c35144:66a24ae9:5b96e616:f7252a9f
            Events : 9

    Number   Major   Minor   RaidDevice State
       0       8       16        0      active sync   /dev/sdb
       1       8       32        1      active sync   /dev/sdc

       2       8       48        -      spare   /dev/sdd
In the output, the State shows the array is clean and resyncing. The resync always occurs after the initial creation of the array or after recovery. The output shows the resync is 59% complete.
Check real-time information from the kernel.
sudo cat /proc/mdstat
Example Output:
[oracle@ol-mdadm-2022-06-04-180415 ~]$ cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdd[2](S) sdc[1] sdb[0]
      52395008 blocks super 1.2 [2/2] [UU]
      [==================>..]  resync = 92.2% (48341824/52395008) finish=2.7min speed=24677K/sec

unused devices: <none>
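To follow the resync to completion, you can re-run the check on an interval with watch; the five-second interval here is arbitrary. Press CTRL+C to exit.
watch -n 5 cat /proc/mdstat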
Create RAID Configuration File
Add the RAID configuration to the mdadm configuration file.
The configuration file identifies which devices are RAID devices and to which array a specific device belongs. Based on this configuration file, mdadm can assemble the arrays at boot time.
sudo mdadm --examine --scan | sudo tee -a /etc/mdadm.conf
Example Output:
[oracle@ol-mdadm-2022-06-04-180415 ~]$ sudo mdadm --examine --scan | sudo tee -a /etc/mdadm.conf
ARRAY /dev/md/0 metadata=1.2 UUID=34a52537:38660137:d8804219:dbfd7531 name=ol-node-01:0
   spares=1
Adjust the name value in the configuration file.
Due to a known issue in the latest mdadm package that causes a Not POSIX compatible warning in how mdadm --examine --scan assembles the configuration file, we must remove the trailing :0 in the name value.
sudo sed -i 's/ol-node-01:0/ol-node-01/g' /etc/mdadm.conf
Manage RAID Devices
The --manage option manages the component devices within an array, such as adding, removing, or faulting a device.
List the options available to manage a RAID device.
sudo mdadm --manage --help
- --add: Hotadd subsequent devices.
- --remove: Remove subsequent non-active devices.
- --fail: Mark subsequent devices as faulty.
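These operations can also be chained in a single command; the sketch below fails and removes a device in one invocation, although this tutorial performs the steps individually:
sudo mdadm /dev/md0 --fail /dev/sdb --remove /dev/sdb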
Synchronize cached writes to persistent storage.
Before running any disk management commands, you must run the sync command to write all disk caches to disk.
sudo sync
Mark a disk as failed.
sudo mdadm --manage /dev/md0 --fail /dev/sdb
Get array details.
sudo mdadm --detail /dev/md0
Example Output:
[oracle@ol-mdadm-2022-06-04-180415 ~]$ sudo mdadm --detail /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Sat Jun 4 20:08:32 2022
        Raid Level : raid1
        Array Size : 52395008 (49.97 GiB 53.65 GB)
     Used Dev Size : 52395008 (49.97 GiB 53.65 GB)
      Raid Devices : 2
     Total Devices : 3
       Persistence : Superblock is persistent

       Update Time : Sat Jun 4 21:34:19 2022
             State : clean, degraded, recovering
    Active Devices : 1
   Working Devices : 2
    Failed Devices : 1
     Spare Devices : 1

Consistency Policy : resync

    Rebuild Status : 1% complete

              Name : ol-mdadm-2022-06-04-180415:0  (local to host ol-mdadm-2022-06-04-180415)
              UUID : f6c35144:66a24ae9:5b96e616:f7252a9f
            Events : 19

    Number   Major   Minor   RaidDevice State
       2       8       48        0      spare rebuilding   /dev/sdd
       1       8       32        1      active sync   /dev/sdc

       0       8       16        -      faulty   /dev/sdb
The array is marked as degraded and recovering. The output also shows that the spare device /dev/sdd is automatically rebuilding the array, while /dev/sdb is faulty.
Remove the failed disk.
sudo mdadm --manage /dev/md0 --remove /dev/sdb
Replace the failed disk.
If this were a physical system, this is the point where you would replace the server's failed physical disk with a new one. In a virtual environment, you can repurpose the disk without any changes.
Remove the previous linux_raid_member signature.
A signature (RAID metadata) is written to a disk when it is used in a RAID array, and the disk cannot be moved to another system or repurposed until those signatures are removed.
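Before wiping, you can optionally list the signatures present on the disk; run with no options, wipefs only reports what it finds and makes no changes:
sudo wipefs /dev/sdb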
sudo wipefs -a -f /dev/sdb
Warning: The wipefs command is destructive and removes the entire partition table on the target disk (/dev/sdb) and any signatures.
Add a new spare to the array.
sudo mdadm --manage /dev/md0 --add /dev/sdb
Verify the spare disk exists.
sudo mdadm --detail /dev/md0
At the bottom of the output, the device /dev/sdb should appear in the list with the State set to spare.
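You can also confirm this from the kernel's view; in /proc/mdstat, spare devices show an (S) suffix next to the device name:
cat /proc/mdstat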
Next Steps
You should now be able to create a RAID1 device with a spare and know how to recover when a disk fails. Check out our other content on the Oracle Linux Training Station.