How to query MySQL for exact length and exact UTF-8 characters

Posted by oskarae on Stack Overflow See other posts from Stack Overflow or by oskarae
Published on 2010-04-23T22:48:29Z Indexed on 2010/04/23 22:53 UTC
Read the original article Hit count: 272

Filed under:
|

I have table with words dictionary in my language (latvian).

CREATE TABLE words (
value varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

And let's say it has 3 words inside:
INSERT INTO words (value) VALUES ('teja');
INSERT INTO words (value) VALUES ('vejš');
INSERT INTO words (value) VALUES ('feja');

What I want to do is I want to find all words that is exactly 4 characters long and where second character is 'e' and third character is 'j'

For me it feels that correct query would be:
SELECT * FROM words WHERE value LIKE '_ej_';
But problem with this query is that it returs not 2 entries ('teja','vejš') but all three. As I understand it is because internally MySQL converts strings to some ASCII representation?

Then there is BINARY addition possible for LIKE
SELECT * FROM words WHERE value LIKE BINARY '_ej_';
But this also does not return 2 entries ('teja','vejš') but only one ('teja'). I believe this has something to do with UTF-8 2 bytes for non ASCII chars?

So question:
What MySQL query would return my exact two words ('teja','vejš')?

Thank you in advance

© Stack Overflow or respective owner

Related posts about mysql

Related posts about utf-8