Code to strip diacritical marks using ICU
Posted by Paul J. Lucas on Stack Overflow
Published on 2010-06-07T18:24:22Z
Can somebody please provide some sample code to strip diacritical marks (i.e., replace characters having accents, umlauts, etc., with their unaccented, unumlauted, etc., character equivalents, e.g., every accented é would become a plain ASCII e) from a UnicodeString using the ICU library in C++? E.g.:
    UnicodeString strip_diacritics( UnicodeString const &s ) {
        UnicodeString result;
        // ...
        return result;
    }
Assume that s has already been normalized. Thanks.
© Stack Overflow or respective owner