Tuesday, January 31, 2017

Online Resizing of a Boot Partition on RHEL 6 or 7 with parted.

The Problem:

I've been getting away with using a 200MB boot partition for over a decade, but RHEL 7 has started pushing the limits of that scheme.  Between GRUB2 and the automatic creation of a rescue kernel and an initramfs that's twice the size of a normal initramfs, that 200MB gets consumed much more quickly.  I can only make it through about two update cycles before having to uninstall the old kernels to avoid upgrade failures.

This is a handy command for removing all of the old kernels by the way:

yum erase $(rpm -qa 'kernel*' | grep -Fv "$(uname -r)")
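
Before erasing anything, it can be worth previewing what that command would select. This is only a minimal sketch: old_kernels is a hypothetical helper name (not part of yum or rpm), and it assumes the package list is piped in from rpm -qa.

```shell
# old_kernels: read package names on stdin and print those that do NOT
# contain the running kernel release given as $1.
# (hypothetical helper; it only filters text and removes nothing)
old_kernels() {
    grep -F -v -- "$1"
}

# Intended use on a real system (preview only):
#   rpm -qa 'kernel*' | old_kernels "$(uname -r)"
```

If the list looks right, the same names can then be handed to yum erase.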


I've updated my provisioning tools to make the boot partition larger, but obviously I still have to do something about the pre-existing systems.


All of my VMs are built with provisioning templates that create a boot partition on the first disk, then format the remainder of the first disk for LVM.  In order to expand the boot partition, its boundary needs to be moved into space occupied by the OS itself.  Fortunately LVM is pretty flexible.  Not only can I "move" the partition boundary, I can do it while the system is up and running.

Here's what we're starting off with:

[root@azrst2ill001 ~]# parted -s /dev/sda print
Model: Msft Virtual Disk (scsi)
Disk /dev/sda: 53.7GB
Sector size (logical/physical): 512B/4096B
Partition Table: msdos
Disk Flags:

Number  Start   End     Size    Type     File system  Flags
 1      1049kB  211MB   210MB   primary  xfs          boot
 2      211MB   53.7GB  53.5GB  primary               lvm


There are two partitions on the first disk: the first is for /boot and the second is owned by LVM. As you can see there's no space between them: the second partition begins exactly where the first ends.


Here's what LVM looks like:

[root@azrst2ill001 ~]# vgs
  VG        #PV #LV #SN Attr   VSize   VFree
  rootvg      1   6   0 wz--n-  49.80g 21.94g

[root@azrst2ill001 ~]# pvs
  PV         VG        Fmt  Attr PSize   PFree
  /dev/sda2  rootvg    lvm2 a--   49.80g 21.94g


The primary volume group with all of the OS volumes is named "rootvg" and it is contained on a single PV - /dev/sda2.  Subtracting VFree from VSize shows it's consuming just under 28GB.
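
The used-space figure is simply VSize minus VFree, and the subtraction can be scripted instead of done by hand. A minimal sketch, assuming a hypothetical helper named vg_used that is fed plain numbers in the same unit:

```shell
# vg_used: read "VSize VFree" pairs (plain numbers, same unit) on stdin
# and print the difference to two decimal places.
# (hypothetical helper name, not an LVM command)
vg_used() {
    awk '{ printf "%.2f\n", $1 - $2 }'
}

# Intended use on a real system:
#   vgs --noheadings --nosuffix --units g -o vg_size,vg_free rootvg | vg_used
```

Fed the two values from the vgs output above (49.80 and 21.94), it prints 27.86.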

Note: These instructions assume the machine was built with the default filesystem for each release: ext4 for /boot on RHEL 6 and xfs for /boot on RHEL 7. Where the two differ, the RHEL 7 (xfs) command is shown first, followed by the RHEL 6 (ext4) equivalent.

Warning:
This process is somewhat dangerous: if it is done incorrectly, or the server is rebooted before it is complete, the system could be left unbootable.


The Goal:
Expand the boot partition from 200MB to 1GB without re-imaging the system.


The Solution:


1. Verify the volume group has at least 1GB free because that's basically where the extra space for /boot is going to come from.

vgs
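
If you are doing this across a lot of machines, the check can be turned into a pass/fail test. A minimal sketch, assuming a hypothetical helper named has_free_gib and free space reported in GiB:

```shell
# has_free_gib: succeed if the free space in $1 (GiB, plain number) is at
# least the amount in $2 (defaults to 1).
# (hypothetical helper name, not an LVM command)
has_free_gib() {
    awk -v free="$1" -v need="${2:-1}" 'BEGIN { exit !(free + 0 >= need + 0) }'
}

# Intended use on a real system:
#   free=$(vgs --noheadings --nosuffix --units g -o vg_free rootvg)
#   has_free_gib "$free" 1 || echo "not enough free space in rootvg"
```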


2. Add a temporary disk to the machine that's at least as large as the root volume group.  In my case I attached a new virtual data disk in Azure and 128GB was the smallest option.

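UDEV created the device as /dev/sdc (this listing appears to have been captured on a system whose /boot had already been resized, which is why sda1 shows 953M):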

[root@azrst2ill001 boot]# lsblk
NAME                          MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                             8:0    0   50G  0 disk
├─sda1                          8:1    0  953M  0 part /boot
└─sda2                          8:2    0 49.1G  0 part
  ├─rootvg-rootlv (dm-0)      253:0    0    8G  0 lvm  /
  ├─rootvg-localhomelv (dm-1) 253:1    0    2G  0 lvm  /usr/local/home
  ├─rootvg-optlv (dm-2)       253:2    0    5G  0 lvm  /opt
  ├─rootvg-swaplv (dm-3)      253:3    0    2G  0 lvm
  ├─rootvg-tmplv (dm-4)       253:4    0    5G  0 lvm  /tmp
  └─rootvg-varlv (dm-5)       253:5    0    4G  0 lvm  /var
sdc                             8:32   0  128G  0 disk
sdb                             8:16   0   56G  0 disk
└─sdb1                          8:17   0   56G  0 part /mnt/resource




3. Add the new disk to the primary VG

vgextend rootvg /dev/sdc


4. Move all of the physical extents (the data basically) off of /dev/sda2. This process will take some time.

pvmove /dev/sda2 


5. Once all of the data is cleared off of /dev/sda2, delete it from the primary VG.

vgreduce rootvg /dev/sda2
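
It is cheap to confirm that pvmove really emptied the PV before taking it out of the VG. A minimal sketch, assuming a hypothetical helper named pv_is_empty that reads the PV name and allocated-extent count from pvs:

```shell
# pv_is_empty: read "PV allocated-extents" pairs on stdin and succeed only
# if the PV named in $1 is listed with 0 allocated extents.
# (hypothetical helper name, not an LVM command)
pv_is_empty() {
    awk -v pv="$1" '$1 == pv { found = 1; alloc = $2 }
                    END { exit !(found && alloc == 0) }'
}

# Intended use on a real system:
#   pvs --noheadings -o pv_name,pv_pe_alloc_count | pv_is_empty /dev/sda2 \
#       && vgreduce rootvg /dev/sda2
```

The same check works in step 20 against /dev/sdc before the temporary disk is removed.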


6. Just for good measure remove /dev/sda2 from LVM altogether

pvremove /dev/sda2


7. (Optional) At this point I wanted to make sure I had some kind of fallback if I dorked something up, so I made a couple of quick backups.

Firstly, I backed up the whole boot partition:

dd if=/dev/sda1 of=/root/bootpart.img

Next I backed up the MBR:

dd if=/dev/sda of=/root/mbr.img bs=512 count=1

These backups are only really helpful if you realize you screwed something up before trying to reboot.
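
A backup is only useful if it matches what is on disk, so a quick comparison does no harm. A minimal sketch, assuming a hypothetical helper named same_bytes that compares the first N bytes of two files by checksum:

```shell
# same_bytes: succeed if the first $3 bytes of files $1 and $2 are identical.
# (hypothetical helper name; compares checksums of the leading bytes)
same_bytes() {
    [ "$(head -c "$3" "$1" | cksum)" = "$(head -c "$3" "$2" | cksum)" ]
}

# Intended use on a real system:
#   same_bytes /dev/sda /root/mbr.img 512 && echo "MBR backup matches"
```

I would only use this on the MBR image. /boot is still mounted at this point, so a byte-for-byte comparison of the partition image may not match even when the backup is fine.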


8. Delete the LVM partition

parted -s /dev/sda rm 2


9. Put the starting position of the boot partition into a variable

START1=$(parted -s /dev/sda print | grep "^ 1" | awk '{print $2}')
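
One caveat: in its default units parted prints rounded values (1049kB here), so the saved start is approximate. Asking parted for sectors with "unit s" gives an exact number. A minimal sketch, assuming a hypothetical helper named part_start that pulls the Start column out of the print output:

```shell
# part_start: read `parted -s DEVICE [unit s] print` output on stdin and
# print the Start column of the partition whose number is $1.
# (hypothetical helper name, not a parted subcommand)
part_start() {
    awk -v n="$1" '$1 == n { print $2 }'
}

# Intended use on a real system (exact, sector-based start):
#   START1=$(parted -s /dev/sda unit s print | part_start 1)
```

parted accepts a per-value unit suffix, so a sector-based start such as the "s" value this returns can still be combined with an end given in another unit in the mkpart command.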


10. (The scary part) remove the boot partition.  

You can get away with this because the only thing parted is manipulating is the partition table.  The actual filesystem data is unaffected by this.  As long as you put the start of the boot partition in exactly the same place (which is why we saved that place in a variable before deleting it) everything will line up and work normally.

umount /boot
parted -s /dev/sda rm 1


11. Re-create the boot partition

parted -s /dev/sda mkpart primary xfs $START1 1GB

If you're doing this on RHEL 6, you probably want to go with ext4 instead:

parted -s /dev/sda mkpart primary ext4 $START1 1GB


12. Set the partition to bootable

parted -s /dev/sda set 1 boot on


13. Check the filesystem.

xfs_repair /dev/sda1

Note: if you get the error message "/dev/sda1 contains a mounted filesystem", just run "umount /boot" again right before you run the command. For some reason RHEL 7 kept re-mounting /boot automatically on me (autofs isn't even installed).  I didn't take the time to figure out what was causing this behavior, but it's easy enough to work around.

If you're doing this on RHEL 6:

fsck -f /dev/sda1
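
Since /boot may get re-mounted behind your back, the unmount can be made conditional and run immediately before the check. A minimal sketch, assuming a hypothetical helper named is_mounted that looks the mount point up in /proc/mounts:

```shell
# is_mounted: succeed if mount point $1 appears in the mount table $2
# (defaults to /proc/mounts, whose second field is the mount point).
# (hypothetical helper name)
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Intended use on a real system, right before the filesystem check:
#   is_mounted /boot && umount /boot
#   xfs_repair /dev/sda1      # or: fsck -f /dev/sda1 on RHEL 6
```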


14. Mount the /boot filesystem (if it didn't already mount itself)

mount /boot


15. Resize the /boot filesystem to fill the new larger partition.

xfs_growfs /dev/sda1

If you're doing this on RHEL 6:

resize2fs /dev/sda1


16. I want to use the rest of the space on /dev/sda for the second partition.  To get the start and end positions of free space, I used parted to set some variables.

START2=$(parted -s /dev/sda print free | grep Free | tail -1 | awk '{print $1}')
END2=$(parted -s /dev/sda print free | grep Free | tail -1 | awk '{print $2}')
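
Both values come from the same row, so parted only needs to be run once. A minimal sketch, assuming a hypothetical helper named last_free that prints the start and end of the last "Free Space" row:

```shell
# last_free: read `parted -s DEVICE print free` output on stdin and print
# the start and end of the last "Free Space" row, separated by a space.
# (hypothetical helper name, not a parted subcommand)
last_free() {
    awk '/Free Space/ { s = $1; e = $2 } END { if (s != "") print s, e }'
}

# Intended use on a real system:
#   set -- $(parted -s /dev/sda print free | last_free)
#   START2=$1 END2=$2
```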


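17. Create the new LVM partition. The filesystem type argument to mkpart is optional, so it is left out here because the partition is going to be used for LVM rather than formatted with a filesystem. The -- just marks the end of parted's options; as far as I can tell it only matters when a position is negative, such as -1s.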

parted -s /dev/sda mkpart primary -- $START2 $END2


18. Set the partition flag to LVM

parted -s /dev/sda set 2 lvm on


19. Check to make sure everything looks right. Most importantly, the boot partition should show the desired new size and be set to bootable, and an ls on /boot should show the kernel and initramfs files.


[root@azrst2ill001 ~]# parted -s /dev/sda print
Model: Msft Virtual Disk (scsi)
Disk /dev/sda: 53.7GB
Sector size (logical/physical): 512B/4096B
Partition Table: msdos
Disk Flags:

Number  Start   End     Size    Type     File system  Flags
 1      1049kB  1075MB  1074MB  primary  xfs          boot
 2      1075MB  53.7GB  52.6GB  primary               lvm



[root@azrst2ill001 ~]# ls /boot
config-3.10.0-327.10.1.el7.x86_64                        initrd-plymouth.img
config-3.10.0-327.28.3.el7.x86_64                        symvers-3.10.0-327.10.1.el7.x86_64.gz
grub2                                                    symvers-3.10.0-327.28.3.el7.x86_64.gz
initramfs-0-rescue-edcc5bfcc87f4b90a0ae36219b0138e3.img  System.map-3.10.0-327.10.1.el7.x86_64
initramfs-3.10.0-327.10.1.el7.x86_64.img                 System.map-3.10.0-327.28.3.el7.x86_64
initramfs-3.10.0-327.10.1.el7.x86_64kdump.img            vmlinuz-0-rescue-edcc5bfcc87f4b90a0ae36219b0138e3
initramfs-3.10.0-327.28.3.el7.x86_64.img                 vmlinuz-3.10.0-327.10.1.el7.x86_64
initramfs-3.10.0-327.28.3.el7.x86_64kdump.img            vmlinuz-3.10.0-327.28.3.el7.x86_64
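
The flag part of that check can be scripted rather than eyeballed. A minimal sketch, assuming a hypothetical helper named has_flag that reads the parted print output shown above:

```shell
# has_flag: read `parted -s DEVICE print` output on stdin and succeed if the
# row for partition number $1 lists the word $2 in its trailing columns
# (field 6 onward, i.e. the "File system" and "Flags" columns).
# (hypothetical helper name, not a parted subcommand)
has_flag() {
    awk -v n="$1" -v flag="$2" '
        $1 == n {
            for (i = 6; i <= NF; i++) {
                f = $i
                sub(/,$/, "", f)      # flags are comma-separated when several are set
                if (f == flag) ok = 1
            }
        }
        END { exit !ok }'
}

# Intended use on a real system:
#   parted -s /dev/sda print | has_flag 1 boot || echo "partition 1 is not bootable"
#   parted -s /dev/sda print | has_flag 2 lvm  || echo "partition 2 is missing the lvm flag"
#   ls /boot/vmlinuz-"$(uname -r)" /boot/initramfs-"$(uname -r)".img
```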



20. Move the rootvg volume group back to the first disk, and remove the temporary disk from the VG.

vgextend rootvg /dev/sda2
pvmove /dev/sdc
vgreduce rootvg /dev/sdc
pvremove /dev/sdc


At this point it should be safe to remove the temporary disk you added in step 2, and to delete the backup .img files if you chose to create them in step 7.


Even though you can perform this whole operation with no downtime to the system, I strongly recommend rebooting at this point to verify everything is working as it should.  Better to find out now than months from now when it's not fresh in your mind.
