Monday, October 22, 2007

Solaris 10 iSCSI configured with Dynamic Discovery

Recently we went thru re-IPing all of our servers and storage arrays in our office. For the most part everything went fine with the exception of a Solaris 10 U3 server I was running iSCSI on.

After I got thru the steps of changing the server's IP address, gateway and DNS entries I rebooted the server. Upon reboot, I noticed a flurry of non-stop error messages at the server's console:

Sep 30 18:37:37 longhorn iscsi: [ID 286457 kern.notice] NOTICE: iscsi connection(8) unable to connect to target SENDTARGETS_DISCOVERY (errno:128)
Sep 30 18:37:37 longhorn iscsi: [ID 114404 kern.notice] NOTICE: iscsi discovery failure - SendTargets (0xx.0xx.0xx.0xx)

As a result, I never got a login prompt, either at the console or via telnet, even though I could successfully ping the server's new IP address. What the messages above indicate is that the initiator issues a SendTargets request and waits for the target to respond with its list of targets. Because the discovery address stored on the server still pointed at the array's old IP address, that response never came. To my surprise there's NO timeout, and the initiator will retry indefinitely. In fact, just for kicks, I left it trying for an hour and 45 minutes.

That also means that you will be locked out of the server, as attempting to boot into single user mode results in the exact same behavior.

To get around this problem you have 2 options even though option #2, for some, may not be an option.

Option 1
--------
a) Boot from a Solaris cdrom
b) mount /dev/dsk/c#t#d#s0 /a
c) cd /a/etc/iscsi
d) Remove or rename the *.dbc and *.dbp files (this leaves iSCSI unconfigured)
e) Reboot the server
f) Use iscsiadm and configure the Solaris server with Static discovery (static-config) so you don't get into this situation again
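
Steps b) through d) above can be sketched as a small shell helper. This is only a sketch under assumptions: the function name disable_iscsi_db, the .bak suffix and the DRY_RUN switch are made up for illustration, and the root slice still has to be filled in for your own system. Renaming rather than deleting keeps the old configuration around in case you need to look at it later.

```shell
#!/bin/sh
# Sketch only: the helper names, the .bak suffix and DRY_RUN are
# illustrative assumptions, not part of Solaris.
# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Rename every *.dbc and *.dbp file in the given directory to <name>.bak
# so the initiator finds no stored configuration on the next boot.
disable_iscsi_db() {
    dir=$1
    for f in "$dir"/*.dbc "$dir"/*.dbp; do
        [ -f "$f" ] || continue    # skip patterns that matched nothing
        run mv "$f" "$f.bak"
    done
}

# Example, from the CD-booted shell (the slice name is a placeholder):
#   mount /dev/dsk/c#t#d#s0 /a
#   DRY_RUN=0; disable_iscsi_db /a/etc/iscsi
```

Running it once with the default DRY_RUN=1 prints the mv commands so you can check them before anything is touched.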

Option 2
---------
a) Change back to the old Target IP address
b) That will enable you to reboot the server
c) Reconfigure the server to use static-config by specifying the target-name, new Target-ip-address and port-number
d) Change the Target IP address to the new one

I followed Option #1 because #2 was not really an option for us. So the moral of the story is that you may want to consider static discovery on Solaris with iSCSI.






4 comments:

Anonymous said...

Did you try boot -m milestone=none from the ok prompt?

This boots the box at a very early stage, so you can repair things (earlier than boot -s).

Alex

Nick Triantos said...

Great tip Alex. Thanks a lot. No, I didn't try to boot using this option.

Anonymous said...

Hi hopefully you'll be able to help me with an iscsi issue.

I'm more or less a beginner with Solaris (I usually do AIX work), however I've been brought in to this company to perform a TSM implementation.

I've figured out how Solaris interprets device addresses now, and for one of our sites where SAN is configured I've installed and introduced the tape drives/libraries to TSM.

For some of the other sites SAN is not an option, hence we're forced to use iSCSI to introduce the VTL (Virtual Tape Library).

Two libraries have been presented to the TSM server running Solaris 10, one physical and one VTL. The PTL (physical tape library) was dead easy to configure where the target ID came up with 1 and 2 and they used LUN 0, however for the iscsi one it appears to be different.

I cannot figure out the combination of the target ID in relation to the LUN.

A list under /dev/rmt/ gives me this information (btw only one VTL drive has been created)

lrwxrwxrwx 1 root root 56 Dec 10 13:45 /dev/rmt/0mn -> ../../devices/pci@780/pci@0/pci@8/pci@0/scsi@8/st@1,0:mn
lrwxrwxrwx 1 root root 56 Dec 10 13:45 /dev/rmt/3mn -> ../../devices/pci@7c0/pci@0/pci@8/pci@0/scsi@8/st@2,0:mn
lrwxrwxrwx 1 root root 82 Dec 10 14:06 /dev/rmt/5mn -> ../../devices/iscsi/tape@0000iqn.1995-03.com.quantum%3Acx0738akr01067-ep0FFFF,1:mn

0mn and 3mn represent the PTL drives and 5mn the VTL drive.


root@GIBPSSFSWBK001 PROD # iscsiadm list target -S iqn.1995-03.com.quantum:cx0738akr01067-ep0
Target: iqn.1995-03.com.quantum:cx0738akr01067-ep0
Alias: -
TPGT: 0
ISID: 4000002a0000
Connections: 1
LUN: 1
Vendor: IBM
Product: ULT3580-TD3
OS Device Name: /dev/rmt/5

iscsiadm gives me this information, but I still cannot work out the relationship.

Nick Triantos said...

Percy,

I know what you're asking and I know where you're going with this. Basically, you expect to see a simple TARGET ID which you then can use in the mt.conf along with the LUN ID.

Solaris 10 with the native initiator will not give you this. I remember that about a year ago someone on opensolaris.org hit the exact same issue trying to configure TSM on Solaris 10 U2 with an Overland REO device. I don't believe he ever got an answer.

I do know though that Qlogic's iSCSI driver implementation used with the QLA40xx iSCSI HBA is identical to the FC implementation and the output will list the Target ID as you'd expect it.

Scanning thru Quantum's compatibility guide, which leaves a lot to be desired, I didn't see any Solaris 10 native iSCSI support, with or without TSM. I did see several references to iSCSI on Solaris with the QLA4010 cards (older cards). There may be something to it...