What is the purpose of the s==NULL case for mbrtowc?
Posted by R.. on Stack Overflow
Published on 2011-01-17T02:18:34Z
mbrtowc is specified to handle a NULL pointer for the s (multibyte character pointer) argument as follows:
If s is a null pointer, the mbrtowc() function shall be equivalent to the call:
mbrtowc(NULL, "", 1, ps)
In this case, the values of the arguments pwc and n are ignored.
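To make the quoted equivalence concrete, here is a minimal sketch of the s == NULL call on a freshly zeroed state. The helper name null_call_on_initial_state is made up for illustration; only mbrtowc, mbsinit and memset are standard.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: returns 1 if the s == NULL call on an initial
   conversion state returns 0 and leaves the state initial, else 0. */
int null_call_on_initial_state(void)
{
    mbstate_t ps;
    memset(&ps, 0, sizeof ps);          /* zeroed object = initial state */

    /* pwc and n are ignored; this behaves like mbrtowc(NULL, "", 1, &ps) */
    size_t r = mbrtowc(NULL, NULL, 0, &ps);

    return r == 0 && mbsinit(&ps) != 0;
}
```

Calling null_call_on_initial_state() from main should yield 1, since converting the empty string from the initial state just consumes the terminating null byte.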
As far as I can tell, this usage is largely useless. If ps is not storing any partially-converted character, the call will simply return 0 with no side effects. If ps is storing a partially-converted character, then since '\0' is not valid as the next byte in a multibyte sequence ('\0' can only be a string terminator), the call will return (size_t)-1 with errno==EILSEQ and leave ps in an undefined state.
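The second case can be probed with a rough sketch like the one below. It assumes a UTF-8 locale named "C.UTF-8" or "en_US.UTF-8" is installed (locale names are system-dependent), and the helper name null_call_after_partial_char is hypothetical. It feeds a lone UTF-8 lead byte so that ps holds a partial character, then makes the s == NULL call.

```c
#include <errno.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper. Returns -1 if no UTF-8 locale could be selected,
   1 if the s == NULL call fails with EILSEQ after a partial character,
   0 if the implementation behaves differently. */
int null_call_after_partial_char(void)
{
    /* Assumed locale names; adjust for the target system. */
    if (!setlocale(LC_CTYPE, "C.UTF-8") && !setlocale(LC_CTYPE, "en_US.UTF-8"))
        return -1;

    int result = 0;
    wchar_t wc;
    mbstate_t ps;
    memset(&ps, 0, sizeof ps);

    /* 0xC3 is the lead byte of a two-byte UTF-8 sequence, so ps should
       now hold a partially-converted character. */
    size_t r = mbrtowc(&wc, "\xC3", 1, &ps);
    if (r == (size_t)-2) {
        errno = 0;
        r = mbrtowc(NULL, NULL, 0, &ps);   /* same as mbrtowc(NULL, "", 1, &ps) */
        if (r == (size_t)-1 && errno == EILSEQ)
            result = 1;
    }

    setlocale(LC_CTYPE, "C");              /* back to the startup default */
    return result;
}
```

A result of 1 matches the behavior described above. After that failure ps should not be reused without reinitializing it.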
The intended usage seems to have been to reset the state variable, particularly when NULL is passed for ps and the internal state has been used, analogous to mbtowc's behavior with stateful encodings. But this is not specified anywhere as far as I can tell, and it conflicts with the semantics for mbrtowc's storage of partially-converted characters: if mbrtowc were to reset state when encountering a 0 byte after a potentially-valid initial subsequence, it would be unable to detect this dangerous invalid sequence.
If mbrtowc were specified to reset the state variable only when s is NULL, but not when it points to a 0 byte, a desirable state-reset behavior would be possible, but such behavior would violate the standard as written. Is this a defect in the standard? As far as I can tell, there is absolutely no way to reset the internal state (used when ps is NULL) once an illegal sequence has been encountered, and thus no correct program can use mbrtowc with ps==NULL.
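For contrast, a caller-owned mbstate_t does not have this problem, because the program can reinitialize it directly: a zero-valued mbstate_t object describes an initial conversion state. The sketch below shows that reset; the name reset_state is made up for illustration, and nothing comparable is available for the hidden internal state used when ps is NULL.

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: put a caller-owned conversion state back into
   the initial state, e.g. after mbrtowc has reported EILSEQ. */
void reset_state(mbstate_t *ps)
{
    memset(ps, 0, sizeof *ps);   /* zero-valued object = initial state */
}
```

After reset_state(&st), mbsinit(&st) returns nonzero, so conversion can restart from a known state.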
© Stack Overflow or respective owner