We had a power outage at our data center last week and when our dual PIX 515E running IOS 7.0(8) (configured with a failover cable) came back, they were in a failed over state where the Secondary unit is active and the Primary unit is standby I have tried 'failover reset', 'failover active', and 'failover reload-standby' as well as executing reloads on both units in a variety of orders, and they don't come back Primary/Active Secondary/Standby. The only thing in my arsenal that I haven't tried is driving to the data center and performing a hard reboot, which I hate to do.
I have read How Failover Works on the Cisco Secure Firewall and it seems like this should be wicked straight forward.
output of show failover on Primary:
Failover On
Cable status: Normal
Failover unit Primary
Failover LAN Interface: N/A - Serial-based failover enabled
Unit Poll frequency 15 seconds, holdtime 45 seconds
Interface Poll frequency 15 seconds
Interface Policy 1
Monitored Interfaces 2 of 250 maximum
Version: Ours 7.0(8), Mate 7.0(8)
Last Failover at: 02:52:05 UTC Mar 10 2010
This host: Primary - Standby Ready
Active time: 0 (sec)
Interface outside (x.x.x.165): Normal
Interface inside (y.y.y.3): Normal
Other host: Secondary - Active
Active time: 897045 (sec)
Interface outside (x.x.x.164): Normal
Interface inside (y.y.y.4): Normal
Stateful Failover Logical Update Statistics
Link : Unconfigured.
output of show failover on Secondary:
Failover On
Cable status: Normal
Failover unit Secondary
Failover LAN Interface: N/A - Serial-based failover enabled
Unit Poll frequency 15 seconds, holdtime 45 seconds
Interface Poll frequency 15 seconds
Interface Policy 1
Monitored Interfaces 2 of 250 maximum
Version: Ours 7.0(8), Mate 7.0(8)
Last Failover at: 02:03:04 UTC Feb 28 2010
This host: Secondary - Active
Active time: 896925 (sec)
Interface outside (x.x.x.164): Normal
Interface inside (y.y.y.4): Normal
Other host: Primary - Standby Ready
Active time: 0 (sec)
Interface outside (x.x.x.165): Normal
Interface inside (y.y.y.3): Normal
Stateful Failover Logical Update Statistics
Link : Unconfigured.
I'm seeing the following in my syslog:
Mar 10 03:05:00 fw1 %PIX-5-111008: User 'enable_15' executed the 'failover reset' command.
Mar 10 03:05:09 fw1 %PIX-5-111008: User 'enable_15' executed the 'failover reload-standby' command.
Mar 10 03:05:12 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=406,op=20,my=Active,peer=Failed.
Mar 10 03:05:12 fw1 %PIX-6-720028: (VPN-Secondary) HA status callback: Peer state Failed.
Mar 10 03:06:09 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=401,op=0,my=Active,peer=Failed.
Mar 10 03:06:09 fw1 %PIX-6-720024: (VPN-Secondary) HA status callback: Control channel is down.
Mar 10 03:06:09 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=401,op=1,my=Active,peer=Failed.
Mar 10 03:06:10 fw1 %PIX-6-720024: (VPN-Secondary) HA status callback: Control channel is up.
Mar 10 03:06:10 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=411,op=2,my=Active,peer=Failed.
Mar 10 03:06:23 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=406,op=80,my=Active,peer=Standby Ready.
Mar 10 03:06:23 fw1 %PIX-6-720028: (VPN-Secondary) HA status callback: Peer state Standby Ready.
Mar 10 03:06:24 fw2 %PIX-6-720027: (VPN-Primary) HA status callback: My state Standby Ready.
Mar 10 03:07:05 fw1 %PIX-5-111008: User 'enable_15' executed the 'failover reset' command.
Mar 10 03:07:31 fw1 %PIX-5-111008: User 'enable_15' executed the 'failover active' command.
Mar 10 03:08:04 fw1 %PIX-5-611103: User logged out: Uname: enable_1
Mar 10 03:08:04 fw1 %PIX-6-315011: SSH session from admin1_int on interface inside for user "pix" terminated normally
Mar 10 03:08:39 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=406,op=20,my=Active,peer=Failed.
Mar 10 03:08:39 fw1 %PIX-6-720028: (VPN-Secondary) HA status callback: Peer state Failed.
Mar 10 03:09:10 fw1 %PIX-6-605005: Login permitted from admin1_int/36891 to inside:192.168.4.4/ssh for user "pix"
Mar 10 03:09:23 fw1 %PIX-5-111008: User 'enable_15' executed the 'failover reset' command.
Mar 10 03:09:38 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=401,op=0,my=Active,peer=Failed.
Mar 10 03:09:39 fw1 %PIX-6-720024: (VPN-Secondary) HA status callback: Control channel is down.
Mar 10 03:09:39 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=401,op=1,my=Active,peer=Failed.
Mar 10 03:09:39 fw1 %PIX-6-720024: (VPN-Secondary) HA status callback: Control channel is up.
Mar 10 03:09:39 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=411,op=2,my=Active,peer=Failed.
Mar 10 03:09:52 fw1 %PIX-6-720032: (VPN-Secondary) HA status callback: id=3,seq=200,grp=0,event=406,op=80,my=Active,peer=Standby Ready.
Mar 10 03:09:52 fw1 %PIX-6-720028: (VPN-Secondary) HA status callback: Peer state Standby Ready.
Mar 10 03:09:53 fw2 %PIX-6-720027: (VPN-Primary) HA status callback: My state Standby Ready.
I'm not exactly sure how to interpret that syslog data. Primary doesn't seem to even try to become Active. When I reload the individual units separately, my connections are retained, so it doesn't seem like I have a real hardware failure. Is there something I can query (IOS or SNMP) to check for hardware issues?
Any thoughts? My IOS-fu is weak.
Thanks for any help you might provide,
Aaron