So after upgrading to CentOS 6.2, I am seemingly no longer able to log in to my iSCSI targets. I have multiple interfaces on different subnets on the system, and I first thought the problem was that I might not be binding the correct interfaces. That seems to be the case judging by netstat, because the source address is on the 10.10.100 network rather than on the iSCSI subnets, which is clearly wrong:
[root]? netstat -na|grep .90
tcp 0 1 10.10.100.60:42354 10.10.8.90:3260 SYN_SENT
tcp 0 1 10.10.100.60:40777 10.10.9.90:3260 SYN_SENT
I then went ahead and disabled all but one interface. As a result netstat appears to be correct, but the login issue remains. I am positive that the target never sees a packet, because I see nothing but SYN_SENT. I know the problem is on my client, because the target is servicing multiple systems, none of which are CentOS 6.2. At this point I am pretty confident that something changed between CentOS 6.0/6.1 and 6.2. So, if anyone has any thoughts, or has run into this, I would very much like to hear them.
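To make the mismatch above easier to spot, here is a rough sketch that filters netstat output for connections to the iSCSI port whose local address is on a different network than the portal. The function name flag_wrong_source is made up for illustration, and it assumes the iSCSI subnets are /24 and the portal listens on 3260, as in the output above.

```shell
# Hypothetical helper: read `netstat -na` output on stdin and report TCP
# connections to port 3260 whose local address is not in the same /24
# as the remote portal. The /24 comparison is an assumption.
flag_wrong_source() {
    awk '$1 == "tcp" && $5 ~ /:3260$/ {
        split($4, l, /[.:]/)    # local address octets, then port
        split($5, r, /[.:]/)    # remote address octets, then port
        lp = l[1] "." l[2] "." l[3]
        rp = r[1] "." r[2] "." r[3]
        if (lp != rp)
            print "mismatch: " $4 " -> " $5 " (" $6 ")"
    }'
}

# Usage:
#   netstat -na | flag_wrong_source
```

Anything it prints is a session attempt leaving from an address outside the portal's subnet.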
[root]? iscsiadm --mode node --targetname iqn.2011-12.dom.homer:01:lab-centos-servers-00001 --portal 10.10.8.90:3260,2 --interface=sw-iscsi-0 --login
Logging in to [iface: sw-iscsi-0, target: iqn.2011-12.dom.homer:01:lab-centos-servers-00001, portal: 10.10.8.90,3260] (multiple)
iscsiadm: Could not login to [iface: sw-iscsi-0, target: iqn.2011-12.dom.homer:01:lab-centos-servers-00001, portal: 10.10.8.90,3260].
iscsiadm: initiator reported error (8 - connection timed out)
iscsiadm: Could not log into all portals
[root]? netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
10.10.8.0 0.0.0.0 255.255.255.0 U 0 0 0 eth2.7
10.10.9.0 0.0.0.0 255.255.255.0 U 0 0 0 eth3.7
10.10.100.0 0.0.0.0 255.255.252.0 U 0 0 0 eth0
169.254.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth0
169.254.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth1
169.254.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth2
169.254.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth3
169.254.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth2.7
169.254.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth3.7
0.0.0.0 10.10.100.1 0.0.0.0 UG 0 0 0 eth0
Output of ip addr show for the two interfaces involved:
[root]? for i in 2.7 3.7; do ip addr show eth$i; done
6: eth2.7@eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
link/ether 00:0c:29:94:5b:8d brd ff:ff:ff:ff:ff:ff
inet 10.10.8.60/24 brd 10.10.8.255 scope global eth2.7
inet6 fe80::20c:29ff:fe94:5b8d/64 scope link
valid_lft forever preferred_lft forever
7: eth3.7@eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
link/ether 00:0c:29:94:5b:97 brd ff:ff:ff:ff:ff:ff
inet 10.10.9.60/24 brd 10.10.9.255 scope global eth3.7
inet6 fe80::20c:29ff:fe94:5b97/64 scope link
valid_lft forever preferred_lft forever
Update 01/06/2012:
This issue seems to be getting more interesting by the day. I went back a few weeks and grabbed a snapshot of this system from before the upgrade to 6.2. I spun up a new system from the snapshot and reconfigured the interface info and host keys, as well as the iSCSI initiator and iSCSI interface info to match the new MACs. I changed nothing else.
Then I attempted to connect to my targets and had no issues at all. I cannot say that this was unexpected. I then compared sysctl settings from both systems. There were differences after the upgrade, but nothing seemingly relevant to iSCSI or IP that could contribute to this. I also noticed that after the upgrade two sessions per connection were enabled by default, so I changed it back to 1 session in /etc/iscsi/iscsid.conf.
On the problematic system we can see that the source interface is seemingly wrong, but even when I disable the 10.10.100 interface, the problem persists. So, while this may be relevant, I could not validate it for certain. Needless to say, further research is necessary. Something is clearly different between releases. The working system is on 6.1, and the non-working one is on 6.2.
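One way to check what source address the routing table alone would pick for a portal is ip route get. The sketch below pulls the src field out of that output; route_src is a made-up name. Note that this only reflects plain routing, and a session that iscsid binds to a specific interface may not follow it.

```shell
# Hypothetical helper: read `ip route get <address>` output on stdin and
# print the source address the kernel would select for that destination.
route_src() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "src") { print $(i + 1); exit }
    }'
}

# Usage:
#   ip route get 10.10.8.90 | route_src
```

If this prints an address on the portal's subnet while netstat still shows a different source, the routing table is probably not what is selecting the source address.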
::Working System::
tcp 0 0 10.10.8.210:39566 10.10.8.90:3260 ESTABLISHED
tcp 0 0 10.10.9.210:46518 10.10.9.90:3260 ESTABLISHED
[root]? ip route show
10.10.8.0/24 dev eth2.6 proto kernel scope link src 10.10.8.210
10.10.9.0/24 dev eth3.7 proto kernel scope link src 10.10.9.210
10.10.100.0/22 dev eth0 proto kernel scope link src 10.10.100.210
169.254.0.0/16 dev eth0 scope link metric 1002
169.254.0.0/16 dev eth2.6 scope link metric 1006
169.254.0.0/16 dev eth3.7 scope link metric 1007
default via 10.10.100.1 dev eth0
::Non-working System::
tcp 0 1 10.10.100.60:44737 10.10.9.90:3260 SYN_SENT
tcp 0 1 10.10.100.60:55479 10.10.8.90:3260 SYN_SENT
[root]? ip route show
10.10.8.0/24 dev eth2.6 proto kernel scope link src 10.10.8.60
10.10.9.0/24 dev eth3.7 proto kernel scope link src 10.10.9.60
10.10.100.0/22 dev eth0 proto kernel scope link src 10.10.100.60
169.254.0.0/16 dev eth0 scope link metric 1002
169.254.0.0/16 dev eth2.6 scope link metric 1006
169.254.0.0/16 dev eth3.7 scope link metric 1007
default via 10.10.100.1 dev eth0
And the result is still the same:
[root]? iscsiadm: Could not login to [iface: sw-iscsi-0, target: iqn.2011-12.dom.homer:01:lab-centos-servers-00001, portal: 10.10.8.90,3260].
iscsiadm: initiator reported error (8 - connection timed out)
iscsiadm: Could not login to [iface: sw-iscsi-1, target: iqn.2011-12.dom.homer:02:lab-centos-servers-00001, portal: 10.10.9.90,3260].
iscsiadm: initiator reported error (8 - connection timed out)
iscsiadm: Could not log into all portals
Update 01/08/2012:
I believe I have figured out the answer to my issue. It is quite obscure and I doubt this will happen to anyone else any time soon. It turns out that setting iface.iscsi_ifacename and iface.hwaddress in the interfaces configuration file is not legal. When one manually adds an iSCSI target, as shown below, all settings from the interface config file are copied into the node config file, which is created by that command. The result is that the parameters iface.iscsi_ifacename and iface.hwaddress end up together in the same config file. These parameters are seemingly mutually exclusive, which does not exactly make sense, or perhaps there is an oversight in the codepath. Perhaps I will investigate further.
# iscsiadm -m node --op new -T iqn.2011-12.dom.homer:01:lab-centos-servers-00001 -p 10.10.8.90,3260,2 -I sw-iscsi-0
# iscsiadm -m node --op new -T iqn.2011-12.dom.homer:02:lab-centos-servers-00001 -p 10.10.9.90,3260,2 -I sw-iscsi-1
Notice that below I commented out iface.hwaddress and iface.ipaddress, after which I re-added the targets with the same commands as above. Everything works just fine.
[root]? cat *
# BEGIN RECORD 2.0-872.33.el6
iface.iscsi_ifacename = sw-iscsi-0
iface.net_ifacename = eth2.6
#iface.hwaddress = XX:XX:XX:XX:XX:XX
#iface.ipaddress = 10.10.8.60
iface.transport_name = tcp
iface.vlan_id = 6
iface.vlan_priority = 0
iface.iface_num = 0
iface.mtu = 0
iface.port = 0
# END RECORD
# BEGIN RECORD 2.0-872.33.el6
iface.iscsi_ifacename = sw-iscsi-1
iface.net_ifacename = eth3.7
#iface.hwaddress = XX:XX:XX:XX:XX:XX
#iface.ipaddress = 10.10.9.60
iface.transport_name = tcp
iface.vlan_id = 7
iface.vlan_priority = 0
iface.iface_num = 0
iface.mtu = 0
iface.port = 0
# END RECORD
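For anyone who wants to check their own iface records for the same situation, here is a rough sketch that lists which binding-related keys are still active (not commented out) in a record file. The function name active_bindings is made up, and the set of keys it looks for (iface.hwaddress, iface.ipaddress, iface.net_ifacename) is an assumption based on the records above, not a list taken from the open-iscsi documentation.

```shell
# Hypothetical helper: print the uncommented binding-related keys found in
# an iface record file. Which keys count as "binding-related" is an
# assumption drawn from the records shown above.
active_bindings() {
    grep -E '^[[:space:]]*iface\.(hwaddress|ipaddress|net_ifacename)[[:space:]]*=' "$1" |
        sed -E 's/^[[:space:]]*//; s/[[:space:]]*=.*$//'
}

# Usage (run from the directory that holds the iface records):
#   for f in *; do echo "$f: $(active_bindings "$f" | tr '\n' ' ')"; done
```

With the records shown above it should report only iface.net_ifacename for each file. More than one key on a line would be the combination to look at first.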
Again, the chances of this happening to someone else are slim to none, so typing this up is likely a waste of time. But if someone does encounter this issue, I hope this post will help.