Heartbeat/DRBD failover didn't work as expected. How do I make the failover more robust?

Posted by Quinn Murphy on Server Fault
Published on 2012-06-19T20:09:05Z Indexed on 2012/06/19 21:18 UTC

Filed under: linux, redhat, drbd, heartbeat

I had a scenario where a DRBD/heartbeat setup had a failed node but did not fail over. The primary node had locked up but did not go down completely: it was inaccessible via ssh and through the NFS mount, but it could still be pinged. The desired behavior would have been to detect this and fail over to the secondary node. It appears that because the primary didn't go fully down (there is a dedicated network connection from server to server), heartbeat's detection mechanism didn't pick up on the problem and therefore didn't fail over.

Has anyone seen this? Is there something I need to configure to get more robust cluster failover? DRBD otherwise seems to work fine (I had to resync when I rebooted the old primary), but without good failover, its use is limited.
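To make the failure mode concrete: a host can keep answering pings, and even complete TCP handshakes in the kernel, while the services behind those ports are hung. The following is only a minimal Python sketch of a service-level probe, not something heartbeat provides. The helper name `service_answers` is hypothetical, and it assumes the probed service sends bytes first after connecting, as an SSH server does with its version banner.

```python
import socket


def service_answers(host, port, timeout=3.0):
    """Return True if the service sends at least one byte within `timeout` seconds.

    A successful TCP connect alone is not treated as healthy: the kernel can
    complete the handshake while the daemon behind the port is hung.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            # Wait for the service to speak first (e.g. the SSH version banner).
            return len(conn.recv(64)) > 0
    except OSError:
        # Covers refused connections, DNS failures, and timeouts.
        return False


# Hypothetical usage against the primary from this setup:
# healthy = service_answers("nfs03.openair.com", 22)
```

A probe like this would report the locked-up primary as unhealthy even though it still answered pings. How to feed that result back into the cluster is a separate question.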

heartbeat 3.0.4
drbd84
RHEL 6.1
We are not using Pacemaker

nfs03 is the primary server in this setup, and nfs01 is the secondary.

ha.cf

# Heartbeat logging
logfacility daemon
udpport 694


ucast eth0 192.168.10.47
ucast eth0 192.168.10.42

# Cluster members
node nfs01.openair.com
node nfs03.openair.com

# Heartbeat communication timing.
# Sets the triggers and pulse time for swapping over.
keepalive 1
warntime 10
deadtime 30
initdead 120


#fail back automatically
auto_failback on
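The timing directives above can be read as thresholds on how long the peer has been silent. Below is a rough Python sketch of that reading, using the values from this ha.cf. The function `peer_state` is hypothetical, and the exact boundary handling inside heartbeat may differ.

```python
def peer_state(silence_s, warntime=10, deadtime=30, initdead=120, just_started=False):
    """Classify a peer by seconds elapsed since its last heartbeat packet.

    Defaults mirror the ha.cf above. `just_started` models the longer
    grace period (initdead) that applies while the cluster is starting up.
    """
    dead_after = initdead if just_started else deadtime
    if silence_s >= dead_after:
        return "dead"   # failover would be triggered
    if silence_s >= warntime:
        return "late"   # a late-heartbeat warning would be logged
    return "alive"


print(peer_state(1))    # alive
print(peer_state(12))   # late
print(peer_state(30))   # dead
```

If the locked-up primary kept sending heartbeat packets every `keepalive` second, the silence would have stayed near one second and this check would never have reached `deadtime`. That would be consistent with the behavior described above.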

and here is the haresources file:

nfs03.openair.com   IPaddr::192.168.10.50/255.255.255.0/eth0      drbddisk::data  Filesystem::/dev/drbd0::/data::ext4 nfs nfslock
