Jerry Coffin
If char is a UTF-8 Unicode type, what happens to implementations
that support EBCDIC, where char holds EBCDIC characters?
char is not a UTF-8 Unicode type. Rather, it is a type that is
guaranteed to be at least 8 bits, so it can _hold_ UTF-8 data -- but it
can also hold whatever other data you prefer that will fit.
On an EBCDIC machine, you'd create an EBCDIC string just like you always
have:
char x[] = "abcdef";
but if you want a UTF-8 string, you can do:
char x[] = u8"abcdef";
The former uses whatever character set has been chosen by the
implementation. The latter is guaranteed to use UTF-8 encoding (or,
under the as-if rule, is at least guaranteed to act like it did...)
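
To see the difference for yourself, you can dump the bytes of each literal. What follows is only a minimal sketch: hex_bytes is a hypothetical helper name of my own, not anything from the standard library. It is templated on the character type because, since C++20, a u8 literal has type const char8_t[] rather than const char[], so the char x[] = u8"abcdef"; form above only compiles as written in C++11 through C++17.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a standard function): returns the bytes of a
// string literal, excluding the terminating null, as space-separated hex.
// Templated so it accepts char literals and, in C++20 and later, char8_t
// literals as well.
template <typename CharT, std::size_t N>
std::string hex_bytes(const CharT (&literal)[N]) {
    static_assert(sizeof(CharT) == 1, "expects a one-byte character type");
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::size_t i = 0; i + 1 < N; ++i) {  // i + 1 < N skips the '\0'
        unsigned char byte = static_cast<unsigned char>(literal[i]);
        if (!out.empty()) out += ' ';
        out += digits[(byte >> 4) & 0x0f];  // high nibble
        out += digits[byte & 0x0f];         // low nibble
    }
    return out;
}
```

hex_bytes(u8"abcdef") should give 61 62 63 64 65 66 on any conforming implementation. hex_bytes("abcdef") gives the same thing where the execution character set is ASCII-based, but 81 82 83 84 85 86 on an EBCDIC one.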