How do you store accented characters coming from a web service into a database?
- by Thierry Lam
I have the following word that I fetch via a web service: André
In Python, the raw value looks like: "Andr\u00c3\u00a9". I then decode the input using json.loads:
>>> import json
>>> json.loads('{"name":"Andr\\u00c3\\u00a9"}')
{u'name': u'Andr\xc3\xa9'}
I then store the decoded value in a utf8 MySQL database using Django:
SomeObject.objects.create(name=u'Andr\xc3\xa9')
Querying the name column from the mysql shell, or displaying it on a web page, gives:
André
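A minimal sketch of what seems to be happening, assuming the payload is exactly the one shown above: the escapes \u00c3 and \u00a9 match the two UTF-8 bytes of é (0xC3 0xA9) read as separate Latin-1 characters, so the text would already be mis-encoded before it reaches Django or MySQL. The latin-1 round trip below is one possible repair under that assumption, not a general fix, and the variable names are only illustrative.

```python
import json

# Same payload as in the question (illustrative variable names).
raw = '{"name":"Andr\\u00c3\\u00a9"}'
name = json.loads(raw)["name"]

# The decoded string holds six code points: 'A', 'n', 'd', 'r', U+00C3, U+00A9.
code_points = [hex(ord(ch)) for ch in name]

# Storing it as UTF-8 writes four bytes for the last two characters,
# which is why the database and the page show two characters instead of one.
stored_bytes = name.encode("utf-8")

# Assumption: the service encoded the text as UTF-8 and then treated each
# byte as a Latin-1 character. Reversing that step recovers the original text.
repaired = name.encode("latin-1").decode("utf-8")

print(code_points)
print(repaired == u"Andr\u00e9")
```

If the check prints True, the repaired value could be passed to SomeObject.objects.create(name=repaired) instead of the raw decoded one. The round trip raises an exception on text that is already correct and contains characters outside Latin-1, so it would need a guard before being applied to every input.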
The web page is served as utf8:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
My database is configured in utf8:
mysql> SHOW VARIABLES LIKE 'collation%';
+----------------------+-----------------+
| Variable_name        | Value           |
+----------------------+-----------------+
| collation_connection | utf8_general_ci | 
| collation_database   | utf8_unicode_ci | 
| collation_server     | utf8_unicode_ci | 
+----------------------+-----------------+
3 rows in set (0.00 sec)
mysql> SHOW VARIABLES LIKE 'character_set%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       | 
| character_set_connection | utf8                       | 
| character_set_database   | utf8                       | 
| character_set_filesystem | binary                     | 
| character_set_results    | utf8                       | 
| character_set_server     | utf8                       | 
| character_set_system     | utf8                       | 
| character_sets_dir       | /usr/share/mysql/charsets/ | 
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
How can I retrieve the word André from a web service, store it in the database without data loss, and display it on a web page in its original form?