Thursday, August 31, 2006

Linux Native Multipathing (Device Mapper-Multipath)

Over the past couple of years a flurry of OS Native multipathing solutions have become available. As a result we are seeing a trend towards these solutions and away from vendor specific multipathing software.

The latest OS Native multipathing solution is Device Mappper-Multipath (DM-Multipath) available with Red Hat Enterprise Linux 4.0 U2 and SuSE SLES 9.0 PS2.

I had the opportunity to configure it in my lab a couple of days ago and I was pleasantly surprised as to how easy was to configure it. Before I show how it's done, let me talk a little about how it works.

The multipathing layer sits above the protocols (FCP or iSCSI), and determines whether or not the devices discovered on the target, represent separate devices or whether they are just separate paths to the same device. In this case, Device Mapper (DM) is the multipathing layer for Linux.

To determine which SCSI devices/paths correspond to the same LUN, the DM initiates a SCSI Inquiry. The inquiry response, among other things, carries the LUN serial number. Regardless of the number paths a LUN is associated with, the serial number for the LUN will always be the same. This is how multipathing SW determines which and how many paths are associated with each LUN.

Before you get started you want to make a sure the following things are loaded:

  • device-mapper-1.01-1.6 RPM is loaded
  • multipath-tools-0.4.5-0.11
  • Netapp FCP Linux Host Utilities 3.0

Make a copy of the /etc/multipath.conf file. Edit the original file and make sure you only have the following entries uncommented out. If you don't have Netapp the section then add it.

defaults {
user_friendly_names yes
devnode_blacklist {
devnode "sd[a-b]$"
devnode "^(ramrawloopfdmddm-srscdst)[0-9]*"
devnode "^hd[a-z]"
devnode "^cciss!c[0-9]d[0-9]*"

devices {
device {
vendor "NETAPP "
product "LUN"
path_grouping_policy group_by_prio
getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
prio_callout"/opt/netapp/santools/mpath_prio_ontap /dev/n"
features "1 queue_if_no_path"
path_checker readsector0
failback immediate

The devnode_blacklist includes devices for which you do not want multipathing enabled. So if you have a couple of local SCSI drives (i.e sda and sdb) the first entry in the blacklist will exclude them. Same for IDE drives (hd).

Add the multipath service to the boot sequence by entering the following:

chkconfig --add multipathd
chkconfig multipathd on

Multipathing on Linux is Active/Active with a Round-Robin algorithm.

The path_grouping_policy is group_by_prio. It assigns paths into Path Groups based on path priority values. Each path is given a priority (high value = high priority) based on a callout program written by Netapp Engineering (part of the FCP Linux Host Utilities 3.0).

The priority values for each path in a Path Group are summed and you obtain a group priority value. The paths belonging to the Path Group with the higher priority value are used for I/O.

If a path fails, the value of the failed path is subtracted from the Path Group priority value. If the Path Group priority value is still higher than the values of the other Path Groups, I/O will continue within that Path Group. If not, I/O will switch to the Path Group with highest priority.

Create and map some LUNs to the host. If you are using the latest Qlogic or Emulex drivers, then run the respective utilities they provide to discover the LUN:

  • qla2xxx_lun_rescan all (QLogic)
  • lun_scan_all (Emulex)

To view a list of multipathed devices:

# multipath -d -l

[root@rhel-a ~]# multipath -l

[size=5 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [active] \
\_ 2:0:0:0 sdc 8:32 [active]
\_ 3:0:0:0 sde 8:64 [active]
\_ round-robin 0 [enabled]
\_ 2:0:1:0 sdd 8:48 [active]
\_ 3:0:1:0 sdf 8:80 [active]

The above shows 1 LUN with 4 paths. Done. It's that easy to set up.