DRBD Replication failure

Posted by user62513 on Server Fault
Published on 2011-02-08T14:16:39Z Indexed on 2011/02/08 15:27 UTC

A couple of weeks ago I set up a two-node CRM system in which one of the managed resources is MySQL over DRBD. Today I restarted both nodes for maintenance, and now they can't connect to each other anymore.

DRBD fell out of sync, and I followed this guide to get it reconnected, but it only runs successfully on one node.

The strange part is what happens when I put both nodes in standby with crm node standby and then try the following:

  • crm node online node0 before crm node online node1: all the CRM resources start successfully, but the DRBD partitions are still running in StandAlone state.
  • crm node online node1 before crm node online node0: the DRBD resource fails to start, which in turn prevents MySQL from starting.
  • If I put both resources in standby and then run crm node online node0, it times out and prints this error:
    Error setting standby=off (section=nodes, set=<null>): Remote node did not respond
    Error performing operation: Remote node did not respond
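
For anyone trying to reproduce this, the sequences above can be sketched roughly as the script below. This is only a sketch under stated assumptions: the helper names run and online_in_order and the DRY_RUN variable are made up for illustration, and the final drbdadm cstate all call is an assumed way to check the connection state (it is not taken from the steps above). By default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 on a real cluster node to actually run them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: put both nodes in standby, then bring them
# online in the given order.
# $1 = node to bring online first, $2 = node to bring online second.
online_in_order() {
    first="$1"
    second="$2"
    run crm node standby "$first"
    run crm node standby "$second"
    run crm node online "$first"
    run crm node online "$second"
    # Assumption: ask DRBD for the connection state of all resources
    # afterwards (for example StandAlone or Connected).
    run drbdadm cstate all
}

# First case from the list: node0 before node1.
online_in_order node0 node1
```

Swapping the arguments (online_in_order node1 node0) gives the second case. Keeping the dry-run default makes it possible to check the order of commands before touching a live cluster.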

Is there anything I'm doing wrong here? An alternative would be to just use MySQL replication, but I'm not sure how to promote a slave to master when the master database is not available.

© Server Fault or respective owner
