Sporadic crash of master-slave MySQL replication process

Posted by obarshay on Server Fault See other posts from Server Fault or by obarshay
Published on 2010-05-14T17:39:10Z Indexed on 2010/05/14 17:44 UTC
Read the original article Hit count: 448

Filed under:
|

Hello,

I was wondering if someone has experienced this and can perhaps provide some insight into this issue.

We have a plan-vanilla MySQL master-slave replication set up. The tables are MyISAM and the master can get quite read/write active. We use the slave instance to perform full daily backups in order to avoid bringing down the master server. The backup process does the following:

STOP SLAVE SQL_THREAD 
mysqlhotcopy all tables 
START SLAVE SQL_THREAD

Every once in a while (once a month or so) the replication breaks with varying error messages indicating a corrupt query or log file. Here's one that happened last night:

mysql> show slave status \G
*************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
                Master_Host: server8.propreports.com
                Master_User: nexus8
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: bin.000045
        Read_Master_Log_Pos: 581644327
             Relay_Log_File: relay.000086
              Relay_Log_Pos: 94131
      Relay_Master_Log_File: bin.000045
           Slave_IO_Running: Yes
          Slave_SQL_Running: No
            Replicate_Do_DB:
        Replicate_Ignore_DB:
         Replicate_Do_Table:
     Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
                 Last_Errno: 1064
                 Last_Error: Error 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '138070603'£' at line 1' on query. Default database: 'wtsdb'. Query: 'UPDATE fill SET clearing_fee='0.0E id='138070603'£'
               Skip_Counter: 0
        Exec_Master_Log_Pos: 4164743
            Relay_Log_Space: 577574251
            Until_Condition: None
             Until_Log_File:
              Until_Log_Pos: 0
         Master_SSL_Allowed: No
         Master_SSL_CA_File:
         Master_SSL_CA_Path:
            Master_SSL_Cert:
          Master_SSL_Cipher:
             Master_SSL_Key:
      Seconds_Behind_Master: NULL

I follow the following procedure to recover from above error and resume replication:

stop slave;
change master to MASTER_LOG_POS = 4164743, MASTER_LOG_FILE = 'bin.000045';
start slave;

We have multiple servers set up this way and they all sporadically stop replicating with a similar error. Any advice on how to resolve this would be greatly appreciated.

© Server Fault or respective owner

Related posts about mysql

Related posts about mysql-replication