Dustin Boyd
In draft N1570, the most recent draft of C11 AFAIK, I found something that might be a bug in the description of the c16rtomb function:
size_t c16rtomb(char * restrict s, char16_t c16, mbstate_t * restrict ps);
If s is a null pointer, the c16rtomb function is equivalent to the call
c16rtomb(buf, L'\0', ps)
where buf is an internal buffer.
(Similar wording exists for the c32rtomb function.)
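For concreteness, a minimal use of that null-pointer form would look something like this (just a sketch; it assumes a hosted C11 implementation that provides <uchar.h>, and the byte counts are of course implementation- and locale-dependent):

#include <limits.h>   /* MB_LEN_MAX */
#include <stdio.h>
#include <string.h>
#include <uchar.h>

int main(void)
{
    mbstate_t ps;
    memset(&ps, 0, sizeof ps);          /* zero-filled object describes the initial state */

    /* Passing a null pointer for s: per the wording quoted above, this is
       equivalent to c16rtomb(buf, L'\0', ps) with an internal buffer, i.e.
       it stores whatever is needed to return to the initial shift state
       (the c16 argument is effectively ignored, given the quoted equivalence). */
    size_t n = c16rtomb(NULL, u'\0', &ps);
    printf("reset: %zu byte(s)\n", n);

    /* Ordinary use for comparison: convert one char16_t to its multibyte form. */
    char buf[MB_LEN_MAX];
    n = c16rtomb(buf, u'A', &ps);
    if (n != (size_t)-1)
        printf("u'A' -> %zu byte(s)\n", n);
    return 0;
}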
Why is a wide character constant of type wchar_t (L'\0') passed to a function whose parameter has type char16_t, especially when wchar_t may be an 8-bit or even a 64-bit type? After all, 7.20.3p4 effectively requires, through its definitions of WCHAR_MIN and WCHAR_MAX, only that wchar_t be an integral type of at least 8 bits, so it may be just an 8-bit type; L'\0' would then be an 8-bit 0.
Of course, wchar_t could just as well be a type definition for unsigned long long, so L'\0' might be equivalent in value and width to 0ULL. Since char16_t must be the same type as uint_least16_t, which may be a type definition for an integral type of exactly 16 bits, is there a reason for using a wide character constant of type wchar_t, which may be a much wider type?
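To make the width question concrete, a small probe along these lines reports how a given implementation lays things out (a sketch; the sizes it prints are implementation-defined, and the point about L'\0' holds regardless):

#include <stdint.h>
#include <stdio.h>
#include <uchar.h>
#include <wchar.h>

int main(void)
{
    /* The relative widths are implementation-defined; this merely reports them. */
    printf("wchar_t : %zu byte(s), WCHAR_MAX        = %ju\n",
           sizeof(wchar_t), (uintmax_t)WCHAR_MAX);
    printf("char16_t: %zu byte(s), UINT_LEAST16_MAX = %ju\n",
           sizeof(char16_t), (uintmax_t)UINT_LEAST16_MAX);

    /* L'\0' has type wchar_t; passing it where a char16_t is expected is an
       ordinary implicit integer conversion.  For the value zero that
       conversion is value-preserving regardless of the relative widths. */
    char16_t c16 = L'\0';
    printf("char16_t initialized from L'\\0' has value %u\n", (unsigned)c16);
    return 0;
}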
The original TR 19769:2004 specification, on which this part of C11 is based, used the same or nearly the same wording, including the example. Is this an oversight on the part of the working group in not using u'\0', or is it assumed that L'\0' is simply converted to the same value as u'\0' by an implicit conversion, one that may involve narrowing even though the value is unchanged, and might therefore prompt a compiler diagnostic? This looks like an edge case that was missed.
IBM uses the same example code at http://pic.dhe.ibm.com/infocenter/zos/v2r1/topic/com.ibm.zos.v2r1.bpxbd00/c16rtomb.htm
Were I to propose a rewording, I'd suggest the following:
c16rtomb(buf, u'\0', ps)
And similarly for c32rtomb:
c32rtomb(buf, U'\0', ps)
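Either spelling should produce identical behavior, since both constants denote the value zero and zero is representable in char16_t, so the converted argument is the same. A quick sketch to check that on a given implementation (again assuming <uchar.h> is available) might be:

#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <uchar.h>

int main(void)
{
    mbstate_t s1, s2;
    memset(&s1, 0, sizeof s1);
    memset(&s2, 0, sizeof s2);

    char b1[MB_LEN_MAX], b2[MB_LEN_MAX];
    size_t n1 = c16rtomb(b1, L'\0', &s1);   /* spelling used by N1570   */
    size_t n2 = c16rtomb(b2, u'\0', &s2);   /* spelling proposed above  */

    if (n1 != (size_t)-1 && n1 == n2 && memcmp(b1, b2, n1) == 0)
        printf("identical: %zu byte(s) stored by both calls\n", n1);
    else
        printf("calls differed (n1 = %zu, n2 = %zu)\n", n1, n2);
    return 0;
}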
I guess my question is whether this is anything to worry about or simply pedantry. In practice it might warrant a compiler diagnostic even though zero is zero, but in theory it doesn't matter precisely *because* zero is zero.