Wednesday, September 06, 2006

Installing RHEL on SATA using an Adaptec 1210SA Controller

I have a Supermicro server in my lab with an Adaptec 1210SA controller connected to a couple of SATA drives I use for testing. Given that Adaptec does not provide an RHEL driver for this card, I had a hard time installing the OS until I had an epiphany a week ago: Adaptec may not provide an RHEL driver for the 1210SA card, but they do provide one for the 2020SA card. Here's how I got around this little problem:

1) Go to the Adaptec site and download the RHEL driver for the 2020SA card.
2) Download and install the RAWWRITE binary for Windows.

3) After downloading the RHEL package, unzip it, select the driver image based on the server's architecture, and use RAWWRITE to copy it onto a floppy.

4) Power on the server, insert RHEL CD #1 into the CDROM drive, and at the boot prompt type: linux dd

5) During the install you will be asked if you want to install additional drivers. Insert the floppy and select "Yes".

At this point the driver will be loaded and then you can proceed with the OS installation.
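
As an aside, if you have a Linux machine handy instead of a Windows one, you can skip RAWWRITE and write the same driver image with dd. A minimal sketch, assuming the image extracted from the Adaptec package is named aacraid-2020sa.img (a hypothetical name; use whatever the package actually contains):

dd if=aacraid-2020sa.img of=/dev/fd0 bs=1440k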

I need to stress that this is not the recommended way of doing things, but rather a workaround I use for lab purposes only. I don't even use this system for demos. If you are considering placing such a server in production, I highly recommend purchasing a controller with support for the OS version you need to install.

Tuesday, September 05, 2006

VMware ESX 3.0.0 SAN Booting

One of the ways enterprises with large numbers of servers are reducing costs and enabling greater storage consolidation today is by deploying diskless servers that boot from the SAN (FC or IP). While this technique is not new, the introduction of the blade server, which provides greater manageability, reduced HW costs, and simpler cable management, along with power, cooling, and real-estate savings, has further accelerated the adoption of SAN booting.

Booting from the SAN provides several advantages:


  • Disaster Recovery - Boot images stored on disk arrays can be easily replicated to remote sites, where standby servers of the same HW type can boot quickly, minimizing the negative effect a disaster can have on the business.
  • Snapshots - Boot images captured in snapshots can be quickly reverted to a point in time, saving the time and money of rebuilding a server from scratch.
  • Quick deployment of Servers - Master boot images stored on disk arrays can be easily cloned using Netapp's FlexClone capability, providing rapid deployment of additional physical servers.
  • Centralized Management - Because the master image is located in the SAN, upgrades and patches are managed centrally and installed only on the master boot image, which can then be cloned and mapped to the various servers. No more multiple upgrades or patch installs.
  • Greater Storage Consolidation - Because the boot image resides in the SAN, there is no need to purchase internal drives.
  • Greater Protection - Disk arrays provide greater data protection, availability, and resiliency features than servers. For example, Netapp's RAID-DP functionality provides additional protection in the event of a double drive failure. RAID-DP with SyncMirror also protects against disk drive enclosure failure, loop failure, cable failure, back-end HBA failure, or any 4 concurrent drive failures.

Having mentioned the advantages, it's only fair that we also mention the disadvantages, which, even though outnumbered, still exist:

  • Complexity - SAN booting is a more complex process than booting from an internal drive. In certain cases the troubleshooting process may be a bit more difficult, especially if a coredump file cannot be obtained.
  • Variable Requirements - Requirements and support vary from array vendor to array vendor, and specific configurations may not be supported at all. The requirements also vary based on the type of OS being loaded. Always consult with the disk array vendor before you decide to boot from the fabric.

One of the most popular platforms that lends itself to booting from the SAN is VMware ESX server 3.0.0. One reason is that VMware does not support booting from internal IDE or SATA drives. The second is that more and more enterprises have started deploying ESX 3.0.0 on diskless blade servers, consolidating hundreds of physical servers into a few blades in a single blade chassis using VMware's server virtualization capabilities.

The new ESX 3.0.0 release has made significant advancements in support for booting from the SAN, as the multiple and annoying requirements of the previous release have been addressed.

Here are some differences between the 2.5.x and 3.0.0 versions with regard to the SAN booting requirements:

If you are going to boot ESX server from the SAN, I highly recommend that, prior to making any HBA purchasing decisions, you contact your storage vendor and carefully review VMware's SAN Compatibility Guide for ESX server 3.0. What you will find is that certain Emulex and Qlogic HBA models are not supported for SAN booting, as well as certain OEM'd/rebranded versions of Qlogic HBAs.

The setup process is rather trivial; however, there are some things you will need to be aware of in order to achieve higher performance and non-disruptive failover should HW failures occur:

1) Enable the BIOS on only one HBA. You only need to enable the BIOS on the second HBA if you have to reboot the server while the original boot HBA, its cable, or the FC switch has failed. In that scenario, you would use Qlogic's Fast!UTIL to select the active HBA, enable its BIOS, scan the bus to discover the boot LUN, and assign the WWPN and LUN ID to the active HBA. When both HBA connections are functional, only one needs to have its BIOS enabled.

2) One important option that needs to be modified is the Execution Throttle/Queue Depth, which signifies the maximum number of outstanding commands that can execute on any one HBA port. The default for ESX 3.0.0 is 32. The value you use depends on a couple of factors:

  • Total Number of LUNs exposed thru the Array Target Port(s)
  • Array Target Port Queue Depth

The formula to determine the value is: Queue Depth = Target Port Queue Depth / Total number of LUNs mapped thru that port. This formula guarantees that a heavy load on every LUN will not flood the Target Port and cause QFULL conditions. For example, if a Target Port has a queue depth of 1024 and 64 LUNs are exposed thru that port, then the Queue Depth on each host should be set to 16. This is the safest approach and guarantees no QFULL conditions, because 64 LUNs x 16 = 1024, the Target Port Queue Depth.

If, using the same formula, you only consider the LUNs mapped to one host at a time, then the potential for QFULL conditions exists. Using the above example, let's assume that we have a total of 64 LUNs and 4 ESX hosts, each of which has 16 LUNs mapped.

The calculation then becomes: Queue Depth = 1024 / 16 = 64. But a heavy load on all 64 LUNs produces 64 x 64 = 4096 outstanding commands, which is much greater than the queue depth of the physical Array Target Port. This will almost certainly generate a QFULL condition.

As a rule of thumb, after the queue depth calculation, always allow some room for future expansion in case more LUNs need to be created and mapped. Thus, consider setting the queue depth value a bit lower than the calculated one; how much lower depends strictly on future growth and requirements. As an alternative, you could use Netapp's Dynamic Queue Depth Management solution, which allows queue depth to be managed from the array side rather than the host.
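
To make the arithmetic concrete, here is a minimal shell sketch of the calculation described above, using the hypothetical numbers from the example (a 1024-deep target port and 64 LUNs); substitute your own values:

TARGET_QDEPTH=1024   # queue depth of the array target port
TOTAL_LUNS=64        # total LUNs exposed thru that port, across ALL hosts
echo $(( TARGET_QDEPTH / TOTAL_LUNS ))   # per-host queue depth: 16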

To change the Queue Depth on a Qlogic HBA:

2a) Create a copy of /etc/vmware/esx.conf

2b) Locate the following entry for each HBA:

/device/002:02.0/name = "QLogic Corp QLA231x/2340 (rev 02)"
/device/002:02.0/options = ""

2c) Modify as follows:

/device/002:02.0/name = "QLogic Corp QLA231x/2340 (rev 02)"
/device/002:02.0/options = "ql2xmaxqdepth=xxx"

2d) Reboot

Where xxx is the queue depth value.
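
If you prefer to script the edit rather than make it by hand, something like the following will do it. This is only a sketch: it assumes the boot HBA sits at the /device/002:02.0 path shown above, and since the PCI address will differ from system to system, check esx.conf first. The example sets a queue depth of 16:

cp /etc/vmware/esx.conf /etc/vmware/esx.conf.bak
sed 's|^/device/002:02.0/options = ""|/device/002:02.0/options = "ql2xmaxqdepth=16"|' /etc/vmware/esx.conf.bak > /etc/vmware/esx.conf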

3) Another important option that will need modification using Fast!UTIL is the PortDownRetryCount parameter. This value will need to be set to the value recommended by your storage vendor. The setting specifies the number of times the adapter's driver retries a command to a port returning a port down status. For ESX server the resulting timeout works out to 2*n + 5 seconds, where n is the PortDownRetryCount value from the HBA BIOS; for example, with n = 14 the driver would retry commands to a downed port for 2*14 + 5 = 33 seconds. You can change this value directly in the HBA, or you can do it after you've installed ESX by editing the /etc/vmware/esx.conf file. Upon editing the file, locate the "options=" entry under the HBA model you are using and make the following change:

3a) Create a copy of /etc/vmware/esx.conf

3b) Locate the following entry for each HBA:

/device/002:02.0/name = "QLogic Corp QLA231x/2340 (rev 02)"
/device/002:02.0/options = ""

3c) Modify as follows:

/device/002:02.0/name = "QLogic Corp QLA231x/2340 (rev 02)"
/device/002:02.0/options = "qlport_down_retry=xxx"

3d) Reboot

Where xxx is the value recommended by your storage vendor. The equivalent setting for Emulex HBAs is "lpfc_nodedev_tmo". The default is 30 seconds.
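
Following the same pattern as the Qlogic entries above, the corresponding Emulex lines in esx.conf would look something like this (the device path and adapter name here are illustrative, not taken from a real system; yours will differ):

/device/002:01.0/name = "Emulex Corporation LP9802 Fibre Channel Adapter"
/device/002:01.0/options = "lpfc_nodedev_tmo=30"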


In closing, before you decide what your setup will be, you will need to decide whether or not booting from the SAN makes sense for you, and whether your storage vendor supports the configuration(s) you have in mind. In general, if you do not want to independently manage large server farms with internal drives, if you are deploying diskless blades, or if you would like to take advantage of disk array based snapshots and cloning techniques for rapid recovery and deployment, then you are a candidate for SAN booting.

IBM Bladecenter iSCSI Boot Support

There has been a lot of demand lately to boot blade servers using the integrated NICs without the use of iSCSI HBAs.

IBM has partnered with Microsoft to enable this capability for the IBM HS20 (Type 8843) Blades and Netapp has recently announced support for it.

Here are the requirements:

Blade type: HS20 MT8843
BIOS: 1.08
HS Blade Baseboard/Management Controller: 1.16
Windows 2003 SP1 w/ KB902113 Hot Fix
Microsoft iSCSI initiator with Integrated boot support: 2.02
Netapp DataONTAP: >= 7.1.1
Netapp iSCSI Windows Initiator Support Kit 2.2 (available for download from the Netapp NOW site)

One thing to be aware of is that the Microsoft iSCSI initiator version 2.02 with Integrated Boot support is a different binary from the standard Microsoft iSCSI initiator 2.02.

To obtain the MS iSCSI initiator 2.02 with Boot support binary follow the link and provide the following invitation code: ms-8RR8-6k43

The IBM BIOS and BMC updates can be downloaded from here:
www-307.ibm.com/pc/support/site.wss/document.do?lndocid=MIGR-64042 or here

http://www-03.ibm.com/servers/eserver/support/bladecenter/hs20/downloadinghwwindows.html

You can find instructions for the process here:

ftp://ftp.software.ibm.com/pc/pccbbs/pc_servers_pdf/iscsi_boot_san_configuration_guide.pdf