charsets in MySQL replication
- by niklassaers
Hi guys,
What can I do to ensure that replication will use latin1 instead of utf-8?
I'm migrating from a MySQL 5.1.22 server (master) on a Linux system to a MySQL 5.1.42 server (slave) on a FreeBSD system. Replication itself works well, but non-ASCII characters in my varchar columns come out garbled on the slave. The Linux/MySQL 5.1.22 server shows the following character set variables:
character_set_client=latin1
character_set_connection=latin1
character_set_database=latin1
character_set_filesystem=binary
character_set_results=latin1
character_set_server=latin1
character_set_system=utf8
character_sets_dir=/usr/share/mysql/charsets/
collation_connection=latin1_swedish_ci
collation_database=latin1_swedish_ci
collation_server=latin1_swedish_ci
while the FreeBSD/MySQL 5.1.42 server shows:
character_set_client=utf8
character_set_connection=utf8
character_set_database=utf8
character_set_filesystem=binary
character_set_results=utf8
character_set_server=utf8
character_set_system=utf8
character_sets_dir=/usr/local/share/mysql/charsets/
collation_connection=utf8_general_ci
collation_database=utf8_general_ci
collation_server=utf8_general_ci
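To make the difference between the two listings easier to see, here is a small Python sketch that compares them. The values are copied from the listings above; the dictionary names `master` and `slave` are just labels I chose for illustration.

```python
# Values copied from the two SHOW VARIABLES listings above.
master = {
    "character_set_client": "latin1",
    "character_set_connection": "latin1",
    "character_set_database": "latin1",
    "character_set_filesystem": "binary",
    "character_set_results": "latin1",
    "character_set_server": "latin1",
    "character_set_system": "utf8",
    "character_sets_dir": "/usr/share/mysql/charsets/",
    "collation_connection": "latin1_swedish_ci",
    "collation_database": "latin1_swedish_ci",
    "collation_server": "latin1_swedish_ci",
}
slave = {
    "character_set_client": "utf8",
    "character_set_connection": "utf8",
    "character_set_database": "utf8",
    "character_set_filesystem": "binary",
    "character_set_results": "utf8",
    "character_set_server": "utf8",
    "character_set_system": "utf8",
    "character_sets_dir": "/usr/local/share/mysql/charsets/",
    "collation_connection": "utf8_general_ci",
    "collation_database": "utf8_general_ci",
    "collation_server": "utf8_general_ci",
}

# Names of the variables whose values differ between the two servers.
differing = sorted(name for name in master if master[name] != slave[name])

for name in differing:
    print(f"{name}: master={master[name]} slave={slave[name]}")
```

Only character_set_filesystem and character_set_system are the same on both servers; everything else, including the client, connection, results and server character sets, differs.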
Setting any of these variables from the MySQL CLI has no effect, and setting them in my.cnf or on the command line prevents the server from starting.
Of course, both servers have the tables in question created the same way, in this case with DEFAULT CHARSET=latin1. Let me give you an example:
CREATE TABLE `test` (
`test` varchar(5) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
When I run "INSERT INTO test VALUES ('æøå')" on the master from a latin1 terminal and then select the row on the slave, also from a latin1 terminal, I get:
+--------+
| test   |
+--------+
| Ã¦Ã¸Ã¥ |
+--------+
From a UTF-8 terminal on the replication slave, the same select returns:
+--------+
| test   |
+--------+
| æøå    |
+--------+
So my conclusion is that it is converted to utf8, even though the table definition is latin1. Is this a correct conclusion?
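To illustrate what I think the terminals are receiving, here is a minimal Python sketch. It is not part of my setup and does not talk to MySQL; it only shows the bytes for the same three characters in each encoding, and what UTF-8 bytes look like when a latin1 terminal renders them one byte per character.

```python
text = "æøå"

# The three characters as a latin1 client would send or store them: 3 bytes.
latin1_bytes = text.encode("latin-1")

# The same characters encoded as UTF-8: 2 bytes each, 6 bytes in total.
utf8_bytes = text.encode("utf-8")

# A latin1 terminal shows every byte as one character, so UTF-8 output
# from the server appears as six "weird" characters.
seen_in_latin1_terminal = utf8_bytes.decode("latin-1")

print(latin1_bytes)
print(utf8_bytes)
print(seen_in_latin1_terminal)
```

So text that reads correctly in a UTF-8 terminal but not in a latin1 terminal is what I would expect if the slave sends the row to the client as UTF-8.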
Of course, on the master, in a latin1 terminal, it still says:
+------+
| test |
+------+
| æøå  |
+------+
Since the system character set is utf8 on both servers, I also set both terminals to UTF-8 and ran "INSERT INTO test VALUES ('æøå')" again on the master. On the slave, from a UTF-8 terminal, I then get:
+------------+
| test       |
+------------+
| Ã¦Ã¸Ã      |
+------------+
If my conclusion is correct, all my replicated data is converted to utf8 (and data that is already utf8 is treated as latin1 and converted to utf8 a second time), while all the old data in the table is latin1, as the CREATE TABLE suggests. I'd love to convert everything to utf8, but legacy applications rely on the data being latin1, so I need to keep it in latin1 for as long as they exist.
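The "treated as latin1 and converted to utf8" case can be sketched in Python as well. This is only my assumption about what happens to the bytes, not something I have verified inside the server; the variable names are made up for the example, and the slice to 5 characters stands in for the varchar(5) column.

```python
text = "æøå"

# A UTF-8 terminal sends 6 bytes for these 3 characters.
sent = text.encode("utf-8")

# Assumption: the connection is declared as latin1, so each of the
# 6 bytes is taken to be one latin1 character.
misread = sent.decode("latin-1")

# varchar(5) holds at most 5 characters, so the sixth one is dropped.
stored = misread[:5]

# A utf8 client then receives those 5 characters re-encoded as UTF-8.
returned = stored.encode("utf-8")

print(len(misread), stored, len(returned))
```

Under that assumption the column ends up with five garbled characters, and the last original character is cut off part way, which would fit the truncated value I see on the slave.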
What can I do to ensure that replication reads the data as latin1, treats it as latin1, and writes it on the slave as latin1?
Cheers
Nik