Code to strip diacritical marks using ICU

Posted by Paul J. Lucas on Stack Overflow
Published on 2010-06-07T18:24:22Z Indexed on 2010/06/07 21:52 UTC

Can somebody please provide some sample code to strip diacritical marks from a UnicodeString using the ICU library in C++? By "strip" I mean replacing characters that have accents, umlauts, etc., with their unmarked base characters, so that, e.g., every accented é would become a plain ASCII e. Something like:

UnicodeString strip_diacritics( UnicodeString const &s ) {
    UnicodeString result;
    // ...
    return result;
}

Assume that s has already been normalized. Thanks.

© Stack Overflow or respective owner