RHEL Cluster FAIL after changing time on system
- by Eugene S
I've encountered a strange issue.
I had to change the time on my Linux RHEL cluster system. I've done it using the following command from the root user:
date +%T -s "10:13:13"
After doing this, some message appeared relating to <emerg> #1: Quorum Dissolved however I didn't capture it completely.
In order to investigate the issue I looked at /var/log/messages and I've discovered the following:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] entering GATHER state from 0.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] Creating commit token because I am the rep.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] Storing new sequence id for ring 354
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] entering COMMIT state.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] entering RECOVERY state.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] position [0] member 192.168.1.49:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] previous ring seq 848 rep 192.168.1.49
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] aru 61 high delivered 61 received flag 1
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] Did not need to originate any messages in recovery.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] Sending initial ORF token
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] CLM CONFIGURATION CHANGE
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] New Configuration:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] #011r(0) ip(192.168.1.49)
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] Members Left:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] #011r(0) ip(192.168.1.51)
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] Members Joined:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CMAN ] quorum lost, blocking activity
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] CLM CONFIGURATION CHANGE
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] New Configuration:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] #011r(0) ip(192.168.1.49)
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] Members Left:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] Members Joined:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [SYNC ] This node is within the primary component and will provide service.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] entering OPERATIONAL state.
Mar 22 16:40:42 hsmsc50sfe1a kernel: dlm: closing connection to node 2
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] got nodejoin message 192.168.1.49
Mar 22 16:40:42 hsmsc50sfe1a clurgmgrd[25809]: <emerg> #1: Quorum Dissolved
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CPG ] got joinlist message from node 1
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Cluster is not quorate. Refusing connection.
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Error while processing connect: Connection refused
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Invalid descriptor specified (-21).
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Someone may be attempting something evil.
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Error while processing disconnect: Invalid request descriptor
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] entering GATHER state from 9.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] Creating commit token because I am the rep.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] Storing new sequence id for ring 358
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] entering COMMIT state.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] entering RECOVERY state.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] position [0] member 192.168.1.49:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] previous ring seq 852 rep 192.168.1.49
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] aru f high delivered f received flag 1
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] position [1] member 192.168.1.51:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] previous ring seq 852 rep 192.168.1.51
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] aru f high delivered f received flag 1
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] Did not need to originate any messages in recovery.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] Sending initial ORF token
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] CLM CONFIGURATION CHANGE
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] New Configuration:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] #011r(0) ip(192.168.1.49)
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] Members Left:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] Members Joined:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] CLM CONFIGURATION CHANGE
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] New Configuration:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] #011r(0) ip(192.168.1.49)
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] #011r(0) ip(192.168.1.51)
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] Members Left:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] Members Joined:
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] #011r(0) ip(192.168.1.51)
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [SYNC ] This node is within the primary component and will provide service.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [TOTEM] entering OPERATIONAL state.
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [MAIN ] Node chb_sfe2a not joined to cman because it has existing state
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] got nodejoin message 192.168.1.49
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CLM ] got nodejoin message 192.168.1.51
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CPG ] got joinlist message from node 1
Mar 22 16:40:42 hsmsc50sfe1a openais[25715]: [CPG ] got joinlist message from node 2
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Cluster is not quorate. Refusing connection.
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Error while processing connect: Connection refused
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Invalid descriptor specified (-111).
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Someone may be attempting something evil.
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Error while processing get: Invalid request descriptor
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Invalid descriptor specified (-21).
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Someone may be attempting something evil.
Mar 22 16:40:42 hsmsc50sfe1a ccsd[25705]: Error while processing disconnect: Invalid request descriptor
How could this be related to the time change procedure I performed?