RHEL cluster fails after changing the system time

Posted by Eugene S on Server Fault
Published on 2012-03-22T15:18:46Z Indexed on 2012/03/22 17:33 UTC

I've run into a strange issue. I had to change the time on my RHEL cluster system, which I did by running the following command as root:

date +%T -s "10:13:13"

After doing this, a message containing <emerg> #1: Quorum Dissolved appeared, but I didn't capture it completely.

To investigate, I looked at /var/log/messages and found the following:

Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] entering GATHER state from 0.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] Creating commit token because I am the rep.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] Storing new sequence id for ring 354
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] entering COMMIT state.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] entering RECOVERY state.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] position [0] member 192.168.1.49:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] previous ring seq 848 rep 192.168.1.49
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] aru 61 high delivered 61 received flag 1
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] Did not need to originate any messages in recovery.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] Sending initial ORF token
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] CLM CONFIGURATION CHANGE
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] New Configuration:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] #011r(0) ip(192.168.1.49) 
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] Members Left:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] #011r(0) ip(192.168.1.51) 
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] Members Joined:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CMAN ] quorum lost, blocking activity
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] CLM CONFIGURATION CHANGE
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] New Configuration:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] #011r(0) ip(192.168.1.49) 
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] Members Left:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] Members Joined:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [SYNC ] This node is within the primary component and will provide service.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] entering OPERATIONAL state.
Mar 22 16:40:42 hsmsc50sfe1a kernel: dlm: closing connection to node 2
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] got nodejoin message 192.168.1.49
Mar 22 16:40:42 hsmsc50sfe1a clurgmgrd[25809]: <emerg> #1: Quorum Dissolved
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CPG  ] got joinlist message from node 1
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Cluster is not quorate.  Refusing connection.
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Error while processing connect: Connection refused
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Invalid descriptor specified (-21).
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Someone may be attempting something evil.
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Error while processing disconnect: Invalid request descriptor
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] entering GATHER state from 9.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] Creating commit token because I am the rep.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] Storing new sequence id for ring 358
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] entering COMMIT state.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] entering RECOVERY state.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] position [0] member 192.168.1.49:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] previous ring seq 852 rep 192.168.1.49
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] aru f high delivered f received flag 1
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] position [1] member 192.168.1.51:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] previous ring seq 852 rep 192.168.1.51
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] aru f high delivered f received flag 1
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] Did not need to originate any messages in recovery.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] Sending initial ORF token
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] CLM CONFIGURATION CHANGE
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] New Configuration:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] #011r(0) ip(192.168.1.49) 
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] Members Left:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] Members Joined:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] CLM CONFIGURATION CHANGE
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] New Configuration:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] #011r(0) ip(192.168.1.49) 
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] #011r(0) ip(192.168.1.51) 
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] Members Left:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] Members Joined:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] #011r(0) ip(192.168.1.51) 
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [SYNC ] This node is within the primary component and will provide service.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] entering OPERATIONAL state.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [MAIN ] Node chb_sfe2a not joined to cman because it has existing state
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] got nodejoin message 192.168.1.49
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM  ] got nodejoin message 192.168.1.51
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CPG  ] got joinlist message from node 1
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CPG  ] got joinlist message from node 2
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Cluster is not quorate.  Refusing connection.
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Error while processing connect: Connection refused
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Invalid descriptor specified (-111).
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Someone may be attempting something evil.
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Error while processing get: Invalid request descriptor
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Invalid descriptor specified (-21).
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Someone may be attempting something evil.
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Error while processing disconnect: Invalid request descriptor
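
The excerpt above is long, so a filter along these lines may help reduce it to the membership and quorum events. This is only a rough sketch: the function name quorum_events is made up for illustration, and the pattern list is an assumption based on the messages visible in this excerpt, not a complete list of what openais, cman or ccsd can log.

```shell
#!/bin/sh
# Print only the membership and quorum lines from a syslog file.
# quorum_events is a hypothetical helper name; the patterns are guesses
# taken from the messages in the excerpt above.
quorum_events() {
    # Default to the standard syslog file if no argument is given.
    logfile="${1:-/var/log/messages}"
    grep -E 'quorum lost|Quorum Dissolved|not quorate|Members (Left|Joined):|r\(0\) ip\(' "$logfile"
}
```

Calling quorum_events /var/log/messages prints the matching lines in their original order, which makes it easier to see when 192.168.1.51 left and rejoined relative to the "quorum lost" message.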

How could this be related to the time change procedure I performed?

© Server Fault or respective owner
