DRBD Replication failure
- by user62513
A couple of weeks ago I set up a two-node CRM system, with MySQL over DRBD as one of the managed resources. Today I restarted both nodes for maintenance, and now they can't connect to each other anymore.
DRBD fell out of sync. I followed this guide to get it reconnected, but it only runs successfully on one node.
But something strange happens. If I run crm node standby on both nodes and then try:
crm node online node0 before crm node online node1, all the CRM resources start successfully but the DRBD partitions are still running in StandAlone state.
crm node online node1 before crm node online node0, the DRBD resource fails to start, so MySQL does not start either.
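To compare the two orderings above, it can help to record the DRBD connection state on each node after every crm node online call. The following is a minimal sketch, assuming DRBD 8.x, where /proc/drbd exists and each device line carries a cs: field. The function name drbd_cstates is hypothetical and is not part of DRBD or crmsh.

```shell
# Print "<minor>: <connection state>" for every DRBD device.
# Assumption: DRBD 8.x layout, where each device line in /proc/drbd
# looks like " 0: cs:StandAlone ro:Primary/Unknown ds:...".
# drbd_cstates is a hypothetical helper name, not a DRBD command.
drbd_cstates() {
    # $1: file to read; defaults to /proc/drbd
    awk '/ cs:/ {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^cs:/) {
                sub(/^cs:/, "", $i)   # strip the "cs:" prefix
                print $1, $i          # e.g. "0: StandAlone"
            }
        }
    }' "${1:-/proc/drbd}"
}
```

Running drbd_cstates on both nodes after each step shows whether a device ever leaves StandAlone (for example for WFConnection or Connected), which narrows down where the reconnect stalls.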
If I put both in standby and then run crm node online node0, it times out and prints this error:
Error setting standby=off (section=nodes, set=<null>): Remote node did not respond
Error performing operation: Remote node did not respond
Is there anything I'm doing wrong here? An alternative would be to just use MySQL replication, but I'm not sure how to promote a slave to master when the master database is unavailable.