System Recovery for an HPE HyperConverged 250 running VMware

This is a technical post for anyone trying to reset a single node or an entire HC250 appliance running VMware.  It is specific to the latest release of the recovery software for the HC250 running ESXi 6.0 Update 2.  If you have followed the node restore directions in the HPE Hyper Converged 250 System for VMware vSphere User Guide, then you will have downloaded the necessary files and created a USB drive to perform the node reset.  And that’s where things start to fall apart.

If you are following the guide, the first issue you’ll run into is that it asks you to use UNetbootin to create the USB stick.  Unfortunately, UNetbootin doesn’t work on Windows 10, so don’t even bother.  Instead use Rufus, because it is awesome and doesn’t require you to actually install anything in Windows.  The second thing is that, assuming you are using iLO, you don’t even need to create a USB stick.  iLO allows you to mount a local folder from your laptop as a USB stick.  First extract the ISO (using 7-zip of course), currently named HPE_HC250_VMware_ESXi_6.0_U2_K2Q48-10601.iso, to a folder.  Then from within an iLO Integrated Remote Console session click the Virtual Drives drop-down, select Folder, and navigate to the folder you just extracted the ISO to.  But don’t do that yet, because there is a third issue.

The user guide directs you to unzip the HPE_HC250_USB_Recovery_Tools_6.0_K2Q48-10610.zip file and copy the contents to the bootable USB drive, choosing to overwrite the boot.cfg and syslinux.cfg files.  If you do that and boot your server, it will just boot into the vanilla ESXi installer.  That’s not what you’re looking for, and obviously not very helpful.  The recovery process works by changing boot.cfg to specify a kickstart file.  Here’s the default boot.cfg file:

bootstate=0
title=Loading ESXi installer
timeout=5
kernel=/tboot.b00
kernelopt=runweasel
modules=(…)

I have excluded the modules section for brevity.  The recovery boot.cfg looks like this:

bootstate=0
title=Loading ESXi installer (USB Reset CS250 TD3.5)
timeout=5
kernel=/tboot.b00
#kernelopt=runweasel
kernelopt=ks=usb:/HPE-CS250-SV-v6-0_T3-5usb.cfg
modules=(…)

As you can see, the kernelopt has a kickstart file specified and the runweasel kernel option has been commented out.  The HPE-CS250-SV-v6-0_T3-5usb.cfg file contains the script to properly reset the node back to factory defaults.  When I first tried the process, the server booted straight into the standard ESXi installer.  Having not performed a factory reset before, I thought this was normal, so I walked through the installer using defaults.  Then I expected it would run some special post-install process to do the rest.  It did not.  After a few hours of learning more than I ever wanted about kickstart scripting and automating the ESXi install process, I noticed that the title of the install screen was not “Loading ESXi installer (USB Reset CS250 TD3.5)”, which led me to realize that the wrong boot.cfg was being loaded.
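If you want to check a boot.cfg before booting from it, a small script can tell you whether it actually points at a kickstart file.  This is only a rough sketch: the function names parse_boot_cfg and uses_kickstart are my own, and it assumes the simple key=value layout shown above, with lines starting with # treated as comments.

```python
def parse_boot_cfg(text: str) -> dict[str, str]:
    """Parse boot.cfg-style key=value lines, skipping blanks and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so "kernelopt=ks=usb:/..." keeps its value whole.
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def uses_kickstart(settings: dict[str, str]) -> bool:
    """Return True if any kernel option is a ks=... kickstart reference."""
    options = settings.get("kernelopt", "").split()
    return any(option.startswith("ks=") for option in options)
```

Run against the default file this should report no kickstart, and against the recovery file it should report one.  Checking the title value the same way tells you which of the two files you are looking at.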

It turns out that in the installer ISO there are two boot.cfg files.  The first is in the root of the ISO, and the second is in the efi\boot subfolder.  Since the Gen9 server is booting in UEFI mode, it uses the boot.cfg file found in efi\boot rather than the one in the root folder.  The simple solution is to copy the recovery boot.cfg file to both folders and then create the USB drive or mount the folder in iLO IRC.
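If you would rather script the copy than do it by hand, something along these lines should work.  Treat it as a sketch under a few assumptions: sync_boot_cfg is a name I made up, it expects the ISO to already be extracted to a folder, and it simply overwrites every file named boot.cfg it finds under that folder (matching the name case-insensitively, since the folder may show up as EFI\BOOT) with the recovery version.

```python
import shutil
from pathlib import Path


def sync_boot_cfg(recovery_cfg: Path, media_root: Path) -> list[Path]:
    """Overwrite every boot.cfg under media_root with the recovery boot.cfg.

    Returns the list of files that were replaced.
    """
    source = recovery_cfg.resolve()
    replaced = []
    for candidate in sorted(media_root.rglob("*")):
        # Only touch regular files named boot.cfg, in any letter case.
        if not candidate.is_file() or candidate.name.lower() != "boot.cfg":
            continue
        # Skip the source itself if it already lives inside media_root.
        if candidate.resolve() == source:
            continue
        shutil.copyfile(source, candidate)
        replaced.append(candidate)
    return replaced
```

You would call it with the boot.cfg from the recovery tools zip and the folder you extracted the ISO to, then confirm that the returned list includes both the root copy and the one under efi\boot before building the USB stick or mounting the folder in iLO.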

I’ve let HPE know about this, so hopefully they’ll update the documentation soon.  If they do, I’ll update the post to reflect it.  On the bright side, I know a lot more about scripting an ESXi install than I did 24 hours ago!
