Exadata Storage/Compute Node remote re-imaging
Posted by Vishal Gupta on Oct 23, 2011
When you have an Exadata machine in the lab and are testing a lot of different things, or giving hands-on training to production DBAs to familiarize them with Exadata patching, you frequently have to start from scratch, i.e. from a particular storage/compute node image. Wouldn’t it be nice if you could write a script to re-image the servers? But alas, that is not possible with Exadata. All the re-imaging has to be done manually. Even with the manual process, one has to insert the external USB stick into each storage/compute node and then remove it before the reboot to complete the process.
With ILOM (Integrated Lights Out Manager) it is possible to mount a remote CD-ROM image (*.iso) file as a virtual CD-ROM. One can create a storage/compute node image ISO from the computeImageMaker_<imageversion>.x86_64.tar or cellImageMaker_<imageversion>.x86_64.tar file.
For example, for a compute node (note that we change into the dl380 directory):
# tar -pxvf computeImageMaker_22.214.171.124.3_LINUX.X64_100305-1.x86_64.tar
# cd dl380
# makeImageMedia.sh computeNodeImage.iso
Or, for a cell node (note that we change into the dl180 directory):
# tar -pxvf cellImageMaker_126.96.36.199.3_LINUX.X64_100305-1.x86_64.tar
# cd dl180
# makeImageMedia.sh cellImage.iso
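The two ISO builds differ only in the tarball and the directory, so they can be wrapped in one small helper. This is only a sketch: the function names image_dir and build_node_iso are made up for illustration, and it assumes makeImageMedia.sh sits directly inside dl380 or dl180 and can be run as ./makeImageMedia.sh from there.

```shell
#!/bin/sh
# Sketch only: image_dir and build_node_iso are hypothetical helper names,
# not part of the Exadata ImageMaker kit.

# Map a node type to the directory that has to be entered before
# running makeImageMedia.sh (dl380 for compute, dl180 for cell).
image_dir() {
    case "$1" in
        compute) echo "dl380" ;;
        cell)    echo "dl180" ;;
        *)       echo "unknown node type: $1" >&2; return 1 ;;
    esac
}

# Extract the ImageMaker tarball and build the ISO.
#   $1 = node type (compute|cell)
#   $2 = path to the ImageMaker tarball
#   $3 = name of the ISO file to create
build_node_iso() {
    dir=$(image_dir "$1") || return 1
    tar -pxvf "$2" || return 1
    # Assumption: the script is executable and lives in the top of $dir.
    ( cd "$dir" && ./makeImageMedia.sh "$3" )
}

# Example call (tarball name is a placeholder):
#   build_node_iso cell "cellImageMaker_<imageversion>.x86_64.tar" cellImage.iso
```

The subshell around cd keeps the caller's working directory unchanged, so the helper can be called for a compute node and a cell node one after the other.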
Would it not have been nice if one could just mount this ISO file via ILOM as a virtual CD-ROM, change the boot order by booting into the BIOS (which can also be forced, either via the ILOM GUI or via the /usr/bin/biosconfig -set_boot_override <xmlfile> command), and choose the virtual CD-ROM as the first boot device? The problem with this approach is that the cell/compute node imaging process resets the ILOM, which means the virtual CD-ROM ISO image is detached partway through, and as a result the imaging process does not complete properly.
One could try leaving an external USB stick permanently attached to the lab Exadata and then re-image via ILOM, so that the boot image device is not detached during the ILOM reset and one can simply reconnect to the ILOM to continue answering the on-screen prompts. The problem with this approach is that the external USB stick has to be removed as part of imaging, otherwise the automated configuration scripts don't function properly. I tried it on the lab Exadata and could not get the cell to re-image properly with the external USB still attached. Even after manually changing the boot order to internal USB and then hard drive, which is what the cell checks for during each reboot validation, the cell just sat at a blinking prompt without going forward. I left the cell overnight as well, thinking that some processing was probably going on, but with no luck. Once I removed the external USB stick, the automated configuration scripts were able to complete the imaging process and reached the automated run of the ipconf script, which sets the various settings on the cell at first boot after re-imaging.
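Reconnecting to the ILOM after it resets is the one part of this attempt that is easy to script. The sketch below is a generic retry loop and not anything Exadata-specific: wait_for is a hypothetical helper name, the host name in the example is invented, and answering a ping is only a rough sign that the ILOM is ready to accept logins again.

```shell
#!/bin/sh
# Sketch only: wait_for is a hypothetical helper, not an ILOM or Exadata tool.

# Retry a command until it succeeds or the attempts run out.
#   $1 = maximum number of attempts
#   $2 = seconds to sleep between attempts
#   remaining arguments = command to run
wait_for() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Do not sleep after the final failed attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Example (invented host name): poll every 10 seconds for up to 10 minutes.
#   wait_for 60 10 ping -c 1 cell01-ilom.example.com && echo "ILOM is reachable again"
```

This only removes the need to keep retrying the ILOM connection by hand. It does nothing about the external USB stick, which still has to be pulled physically.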
Would it not have been nice if all this could be done remotely via ILOM once the ILOM has been connected to the network, given that the ILOM network configuration is not reset during the re-imaging process? But that would be wishful thinking!
I was trying this with the 188.8.131.52.3 image, as that was my starting image on the compute/cell nodes in prod/DR/test and I wanted to replicate the same history. I have not yet tried later image versions to see whether the process has been improved in this regard; that trial is for some time later.