What encoding does c32rtomb convert to?
Posted by R. Martinho Fernandes on Stack Overflow
Published on 2012-10-24T08:44:09Z
The functions c32rtomb and mbrtoc32 from <cuchar>/<uchar.h> are described in the C Unicode TR (draft) as performing conversions between UTF-32 [1] and "multibyte characters".
(...) If s is not a null pointer, the c32rtomb function determines the number of bytes needed to represent the multibyte character that corresponds to the wide character given by c32 (including any shift sequences), and stores the multibyte character representation in the array whose first element is pointed to by s. (...)
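To make the quoted paragraph concrete, here is a minimal sketch of a single call. The helper name encode_one is hypothetical, and the sketch assumes the program is still in the default "C" locale. It only illustrates the return-value convention (a byte count, or (size_t)-1 on an encoding error); it does not say which encoding the bytes are in.

```cpp
#include <climits>   // MB_LEN_MAX
#include <cstddef>   // std::size_t
#include <cuchar>    // std::c32rtomb
#include <cwchar>    // std::mbstate_t
#include <string>

// Hypothetical helper (not part of any library): converts one char32_t
// to its multibyte representation and appends the bytes to `out`.
// Returns false if c32rtomb reports an encoding error.
bool encode_one(char32_t c32, std::mbstate_t& state, std::string& out) {
    char buf[MB_LEN_MAX];  // large enough for any single multibyte character
    std::size_t n = std::c32rtomb(buf, c32, &state);
    if (n == static_cast<std::size_t>(-1)) {
        return false;      // c32 has no valid multibyte representation here
    }
    out.append(buf, n);    // n bytes were stored, including any shift sequence
    return true;
}
```

Calling encode_one once per element of a std::u32string, with a single zero-initialized std::mbstate_t, amounts to the loop in the program below plus an error check.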
What is this "multibyte character representation"? I'm actually interested in the behaviour of the following program:
    #include <cassert>
    #include <cuchar>
    #include <string>

    int main() {
        std::u32string u32 = U"this is a wide string";
        std::string narrow = "this is a wide string";
        std::string converted(1000, '\0');
        char* ptr = &converted[0];
        std::mbstate_t state {};
        for(auto u : u32) {
            ptr += std::c32rtomb(ptr, u, &state);
        }
        converted.resize(ptr - &converted[0]);
        assert(converted == narrow);
    }
Is the assertion in it guaranteed to hold [1]?
[1] Working under the assumption that __STDC_UTF_32__ is defined.