Configure RAID Logical Volumes on Oracle Linux

Introduction

LVM RAID is a way to create a Logical Volume (LV) that uses multiple physical devices to improve performance or tolerate device failures. In LVM, the physical devices are Physical Volumes (PVs) in a single Volume Group (VG).

This tutorial works with the Oracle Linux Volume Manager (LVM) utilities to create a RAID logical volume, resize it, and recover from a disk failure.

Objectives

  • Create a RAID logical volume
  • Resize a RAID logical volume
  • Recover a failed RAID device

Prerequisites

Any Oracle Linux 8 system with the following configurations:

  • a non-root user with sudo permissions
  • additional block volumes for use with LVM

Setup Lab Environment

Note: When using the free lab environment, see Oracle Linux Lab Basics for connection and other usage instructions.

  1. If not already connected, open a terminal and connect via ssh to the instance.

    ssh oracle@<ip_address_of_instance>
  2. Verify the block volumes exist.

    sudo lsblk

    The output for the free lab environment should show /dev/sda for the existing file system, along with the available disks /dev/sdb, /dev/sdc, /dev/sdd, and /dev/sde. There are also two additional disks (/dev/sdf and /dev/sdg), which this tutorial uses later.
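
    If the output seems cluttered, you can limit lsblk to a few columns; for example:

    sudo lsblk -o NAME,SIZE,TYPE,MOUNTPOINT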

Physical Volume (PV)

  1. Create the physical volumes (PV) using the available disks.

    sudo pvcreate -v /dev/sd[b-e]

    The -v option provides verbose output.

  2. Verify PV creation.

    sudo pvs

    Example Output:

    [oracle@ol-node01 ~]$ sudo pvs
      PV         VG        Fmt  Attr PSize  PFree 
      /dev/sda3  ocivolume lvm2 a--  45.47g     0 
      /dev/sdb             lvm2 ---  50.00g 50.00g
      /dev/sdc             lvm2 ---  50.00g 50.00g
      /dev/sdd             lvm2 ---  50.00g 50.00g
      /dev/sde             lvm2 ---  50.00g 50.00g

Volume Group (VG)

  1. Create the volume group (VG) using the newly created physical volumes.

    sudo vgcreate -v foo /dev/sd[b-e]
  2. Verify VG creation.

    sudo vgs

    Example Output:

    [oracle@ol-node01 ~]$ sudo vgs
      VG             #PV #LV #SN Attr   VSize   VFree  
      foo              4   0   0 wz--n- 199.98g 199.98g
      ocivolume        1   2   0 wz--n-  45.47g      0 

Logical Volume (LV)

  1. Create the RAID logical volume (LV).

    sudo lvcreate --type raid5 -i 3 -L 5G -n rr foo
    • --type: Set the RAID level. LVM supports RAID levels 0, 1, 4, 5, 6, and 10.
    • -i: Set the number (n) of stripes (devices) for a RAID 4/5/6 logical volume. A raid5 LV requires n+1 devices.
    • -L: Size of the logical volume (the usable capacity; parity consumes additional space in the VG).
    • -n: Name of the logical volume.

    Example Output:

    [oracle@ol-node01 ~]$ sudo lvcreate --type raid5 -i 3 -L 5G -n rr foo
      Using default stripesize 64.00 KiB.
      Rounding size 5.00 GiB (1280 extents) up to stripe boundary size 5.00 GiB (1281 extents).
      Logical volume "rr" created.

    For more information check the lvmraid(7) manual page.

  2. Verify LV creation.

    sudo lvdisplay foo

    The output shows all logical volumes contained within the foo VG.

    Example Output:

    [oracle@ol-node01 ~]$ sudo lvdisplay foo
      --- Logical volume ---
      LV Path                /dev/foo/rr
      LV Name                rr
      VG Name                foo
      LV UUID                vghyRi-nKGM-3b9t-tB1I-biJX-10h6-UJWvm2
      LV Write Access        read/write
      LV Creation host, time ol-node01, 2022-05-19 01:23:46 +0000
      LV Status              available
      # open                 0
      LV Size                5.00 GiB
      Current LE             1281
      Segments               1
      Allocation             inherit
      Read ahead sectors     auto
      - currently set to     1024
      Block device           252:10
  3. Display the LV type.

    sudo lvs -o name,segtype foo/rr
    • The lvs command accepts either the VG/LV name (foo/rr) or the full LV path (/dev/foo/rr) to narrow the results.

    Example Output:

    [oracle@ol-node01 ~]$ sudo lvs -o name,segtype /dev/foo/rr
      LV     Type 
      rr     raid5
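
    Optionally, listing all logical volumes in the VG, including hidden ones, shows the per-device image and metadata sub-LVs (rr_rimage_N and rr_rmeta_N) that make up the RAID LV; a minimal example:

    sudo lvs -a -o name,segtype,devices foo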

Create a File System

  1. Create an XFS file system on the RAID LV.

    sudo mkfs.xfs -f /dev/foo/rr
    • -f: Forces the overwrite of an existing file system.

    Example Output:

    [oracle@ol-node01 ~]$ sudo mkfs.xfs -f /dev/foo/rr
    meta-data=/dev/foo/rr            isize=512    agcount=8, agsize=163952 blks
             =                       sectsz=4096  attr=2, projid32bit=1
             =                       crc=1        finobt=1, sparse=1, rmapbt=0
             =                       reflink=1
    data     =                       bsize=4096   blocks=1311616, imaxpct=25
             =                       sunit=16     swidth=48 blks
    naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
    log      =internal log           bsize=4096   blocks=2560, version=2
             =                       sectsz=4096  sunit=1 blks, lazy-count=1
    realtime =none                   extsz=4096   blocks=0, rtextents=0

    Note: The XFS file system cannot reduce its size after creation. However, the xfs_growfs command can enlarge it.

Mount the RAID LV

  1. Mount the file system.

    sudo mkdir -p /u01
    sudo mount /dev/foo/rr /u01
  2. Report the file system disk usage.

    df -h

    Example Output:

    [oracle@ol-node01 ~]$ df -h
    Filesystem                         Size  Used Avail Use% Mounted on
    ...
    /dev/mapper/foo-rr                 5.0G   69M  5.0G   2% /u01
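
    Note: This mount does not persist across reboots. Although not required for this lab, a persistent mount could be configured with an /etc/fstab entry along these lines:

    /dev/foo/rr    /u01    xfs    defaults    0 0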

Resize a RAID LV

There are several ways to resize a RAID LV:

  • Use lvresize or lvextend to increase the LV.
  • Use lvresize or lvreduce to shrink the LV.
  • Use lvconvert with the --stripes N parameter to change the number of stripes.

Important: Shrinking an LV is risky and may result in data loss. When running an XFS file system on the LV, avoid shrinking the LV as XFS does not permit reducing the file system size.
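
For illustration, lvextend can grow the LV and, with the -r (--resizefs) option, grow the file system in the same step; a minimal sketch (the next section performs these steps separately with lvresize and xfs_growfs):

    sudo lvextend -r -L 10G foo/rr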

Increase the RAID LV Capacity

  1. Using the available free space in the VG, increase the RAID LV size to 10G.

    sudo lvresize -L 10G foo/rr

    To increase the size by 10G, use the option -L +10G instead.

  2. Verify the LV increased to 10G.

    sudo lvs foo/rr

    The LSize should show 10g.

  3. Grow the file system.

    sudo xfs_growfs /u01
  4. Report the updated file system disk usage.

    df -h
  5. Check the RAID synchronization status before proceeding.

    Warning: Proceeding too quickly to the next section may show an error due to foo/rr not being in-sync.

    This error occurs if the synchronization did not complete after resizing the RAID LV above.

    Check the RAID LV with watch sudo lvs foo/rr and wait for the Cpy%Sync field to reach 100%. Once Cpy%Sync reaches 100%, use ctrl-c to exit the watch command.

See lvresize(8), lvextend(8) and lvreduce(8) man pages for more information.

Increase Stripes on RAID LV

Changing the number of stripes on a RAID LV increases its overall capacity and is possible on RAID 4/5/6/10. Each additional stripe requires another physical volume (device) with unallocated space in the volume group.
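
For example, the 10 GiB raid5 LV created above stripes its data across three devices (plus one parity device). Converting it to four stripes adds one more data device, so the usable capacity grows by roughly one third: 10 GiB × 4/3 ≈ 13.33 GiB, which matches the LSize reported after the conversion below.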

  1. Check which physical volumes (PV) exist in VG foo.

    sudo pvs

    From the output, /dev/sdb, /dev/sdc, /dev/sdd, and /dev/sde are all associated with VG foo.

  2. Determine if there are any unused physical volumes.

    sudo pvdisplay -m /dev/sd[b-e]

    Example Output:

      --- Physical volume ---
      PV Name               /dev/sdb
      VG Name               foo
      PV Size               50.00 GiB / not usable 4.00 MiB
      Allocatable           yes 
      PE Size               4.00 MiB
      Total PE              12799
      Free PE               11944
      Allocated PE          855
      PV UUID               Q1uEMC-0zL1-dgrA-9rIT-1xrA-Vnfr-2E8tJT
       
      --- Physical Segments ---
      Physical extent 0 to 0:
        Logical volume	/dev/foo/rr_rmeta_0
        Logical extents	0 to 0
      Physical extent 1 to 854:
        Logical volume	/dev/foo/rr_rimage_0
        Logical extents	0 to 853
      Physical extent 855 to 12798:
        FREE
    ...

    The pvdisplay command with the -m option shows the mapping of physical extents to logical volumes and logical extents. The PV /dev/sdb in the example output shows physical extents associated with the RAID LV. The same should appear for /dev/sdc, /dev/sdd, and /dev/sde.

  3. Add another PV to the VG.

    Because the existing RAID LV uses all the existing physical volumes, add /dev/sdf to the VG foo.

    sudo vgextend foo /dev/sdf

    The output shows the vgextend command converts /dev/sdf to a PV before adding it to the VG foo.

  4. Add a stripe to the RAID LV.

    sudo lvconvert --stripes 4 foo/rr

    Respond with y to the prompt.

    Example Output:

    [oracle@ol-node01 ~]$ sudo lvconvert --stripes 4 foo/rr
      Using default stripesize 64.00 KiB.
      WARNING: Adding stripes to active and open logical volume foo/rr will grow it from 2562 to 3416 extents!
      Run "lvresize -l2562 foo/rr" to shrink it or use the additional capacity.
    Are you sure you want to add 1 images to raid5 LV foo/rr? [y/n]: y
      Logical volume foo/rr successfully converted.
  5. Verify the new LV size.

    sudo lvs foo/rr

    Example Output:

    [oracle@ol-node01 ~]$ sudo lvs foo/rr
      LV   VG  Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
      rr   foo rwi-aor--- 13.34g                                    2.24          

    The capacity (LSize) grew by 3.34g, and the synchronization (Cpy%Sync) begins. Synchronization is the process that makes all the devices in a RAID LV consistent with each other, and a full sync becomes necessary when devices in the RAID LV are modified or replaced.

  6. Check the status of the synchronization.

    Run the check until the progress reaches 100%.

    watch sudo lvs foo/rr

    Once Cpy%Sync reaches 100%, use ctrl-c to exit the watch command.

    Other ways to use the watch command include:

    • Run watch -n 5 sudo lvs foo/rr to refresh every 5s instead of the default 2s.
    • Run timeout 60 watch -n 5 sudo lvs foo/rr to automatically exit after 60s.
  7. Show the new segment range and PV, which now includes /dev/sdf.

    sudo lvs -a -o lv_name,attr,segtype,seg_pe_ranges,dataoffset foo

Recover a Failed RAID Device in a LV

RAID arrays can continue to run with failed devices. For RAID types other than RAID1, removing a device would mean converting to a lower-level RAID (for example, from RAID5 to RAID0).

Rather than removing the failed drive and later adding a replacement, LVM can replace a failed device in a RAID volume in a single step with the lvconvert --repair command.
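
The repair can also be pointed at specific replacement devices by listing one or more PVs after the LV name; a minimal sketch, assuming the replacement PV already belongs to the VG:

    sudo lvconvert --repair foo/rr /dev/sdg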

  1. Check the current RAID LV layout.

    sudo lvs --all --options name,copy_percent,devices foo
  2. Simulate a failure on /dev/sdd.

    echo 1 | sudo tee /sys/block/sdd/device/delete
  3. After failure, recheck the RAID LV layout.

    sudo lvs --all --options name,copy_percent,devices foo

    Notice the [unknown] devices.

    Example Output:

    [oracle@ol-node01 ~]$ sudo lvs --all --options name,copy_percent,devices foo
      WARNING: Couldn't find device with uuid o1JwCl-DTpi-anww-rYt3-1LCq-vmLV-FQCKyc.
      WARNING: VG foo is missing PV o1JwCl-DTpi-anww-rYt3-1LCq-vmLV-FQCKyc (last written to /dev/sdd).
      LV            Cpy%Sync Devices                                                                   
      rr            100.00   rr_rimage_0(0),rr_rimage_1(0),rr_rimage_2(0),rr_rimage_3(0),rr_rimage_4(0)
      [rr_rimage_0]          /dev/sdb(855)                                                             
      [rr_rimage_0]          /dev/sdb(1)                                                               
      [rr_rimage_1]          /dev/sdc(855)                                                             
      [rr_rimage_1]          /dev/sdc(1)                                                               
      [rr_rimage_2]          [unknown](855)                                                            
      [rr_rimage_2]          [unknown](1)                                                              
      [rr_rimage_3]          /dev/sde(855)                                                             
      [rr_rimage_3]          /dev/sde(1)                                                               
      [rr_rimage_4]          /dev/sdf(855)                                                             
      [rr_rimage_4]          /dev/sdf(1)                                                               
      [rr_rmeta_0]           /dev/sdb(0)                                                               
      [rr_rmeta_1]           /dev/sdc(0)                                                               
      [rr_rmeta_2]           [unknown](0)                                                              
      [rr_rmeta_3]           /dev/sde(0)                                                               
      [rr_rmeta_4]           /dev/sdf(0)        
  4. Replace the failed device.

    sudo lvconvert --repair foo/rr

    Respond with y to the prompt.

    The command fails because the VG has no spare device or free space to use for the replacement.

    Example Output:

    [oracle@ol-node01 ~]$ sudo lvconvert --repair foo/rr
      WARNING: Couldn't find device with uuid o1JwCl-DTpi-anww-rYt3-1LCq-vmLV-FQCKyc.
      WARNING: VG foo is missing PV o1JwCl-DTpi-anww-rYt3-1LCq-vmLV-FQCKyc (last written to /dev/sdd).
      WARNING: Couldn't find device with uuid o1JwCl-DTpi-anww-rYt3-1LCq-vmLV-FQCKyc.
    Attempt to replace failed RAID images (requires full device resync)? [y/n]: y
      Insufficient free space: 856 extents needed, but only 0 available
      Failed to replace faulty devices in foo/rr.

    Warning: If the error contains an "Unable to replace devices in foo/rr while it is not in-sync" message, verify that the RAID LV is in-sync by running watch sudo lvs foo/rr and confirming Cpy%Sync is 100%. Then try the lvconvert command again.

  5. Add the device /dev/sdg to the VG.

    sudo vgextend foo /dev/sdg

    The WARNING messages in the output are due to the still missing failed drive.

  6. Retry replacing the failed drive.

    sudo lvconvert --repair foo/rr

    Respond again with y to the prompt.

    The output again shows the WARNING messages about the missing drive, but the command successfully replaces the faulty device in the RAID LV.

  7. Examine the layout.

    sudo lvs --all --options name,copy_percent,devices foo

    Notice /dev/sdg replaced all the [unknown] device entries.

  8. Remove the failed device from the VG.

    LVM utilities continue to report that they cannot find the failed device until it is removed from the VG.

    sudo vgreduce --removemissing foo

    The WARNING messages in the output are due to the still missing failed drive.

  9. Check the RAID synchronization status before proceeding.

    Warning: Proceeding too quickly to the next section may show the following error message:

    Example Output:

    [oracle@ol-node01 ~]$ sudo lvchange --syncaction check foo/rr
      foo/rr state is currently "recover".  Unable to switch to "check".

    This error occurs if the synchronization (recovery) did not complete after replacing the failed device in the RAID LV.

    Check the RAID LV with watch sudo lvs foo/rr and wait for the Cpy%Sync field to reach 100%.

Check Data Coherency in RAID LV (Scrubbing)

LVM provides a scrubbing capability for RAID LVs, which reads all the data and parity blocks in the array and checks them for coherency.

  1. Initiate a scrub in checking mode.

    sudo lvchange --syncaction check foo/rr
  2. Show the status of the scrubbing action.

    watch sudo lvs -a -o name,raid_sync_action,sync_percent foo/rr

    Example Output:

    [oracle@ol-node01 ~]$ sudo lvs -a -o name,raid_sync_action,sync_percent foo/rr
      LV   SyncAction Cpy%Sync
      rr   check      30.08   
  3. After scrubbing (synchronization) is complete, display the number of inconsistent blocks found.

    sudo lvs -o +raid_sync_action,raid_mismatch_count foo/rr

    The raid_sync_action option displays the SyncAction field with one of the following values:

    • idle: All actions complete.
    • resync: Initializing the array or recovering after a system failure.
    • recover: Replacing a device in the array.
    • check: Looking for differences.
    • repair: Looking for and repairing differences.

    Example Output:

    [oracle@ol-node01 ~]$ lvs -o +raid_sync_action,raid_mismatch_count foo/rr
      LV   VG  Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert SyncAction Mismatches
      rr   foo rwi-aor--- 13.34g                                    44.42            check               0

    The output shows 0 inconsistencies (Mismatches).

  4. (Optional) Fix the differences in the array.

    This step is optional as no differences likely exist in this sample array.

    sudo lvchange --syncaction repair foo/rr
  5. (Optional) Check status of the repair.

    sudo lvs -o +raid_sync_action,raid_mismatch_count foo/rr

    Notice the SyncAction field changed to repair.

See the lvchange(8) and lvmraid(7) man pages for more information.
