I mean long long was introduced merely because the C committee decided
to add it to C99, for no other real reason. What will happen if they
decide in the future to add another such built-in type?
The following version of C++ will almost certainly add it as
well. I'm pretty sure that there is a strong consensus to keep
C++ compatible with C with regards to the integer types.
Note that the C committee wasn't particularly happy with long
long itself. After all, what happens if the next generation of
machines also supports 128 bit types: do we add "long long long"?
They accepted it as "wide-spread existing practice", but at the
same time, developed a more general framework for an unlimited
number of integral types. C++ has also adopted this framework:
there is no guarantee that long long is the longest integral
type in a given implementation. (That would be intmax_t, which
is a typedef.)
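
A minimal sketch of that last point, assuming a compiler that provides
<cstdint> (as C++11 later required). The helper names
widest_signed_bits, long_long_is_widest and check_widths are mine,
purely for illustration. The only thing you can count on is that
intmax_t is at least as wide as long long, not that the two are the
same type:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// intmax_t can represent every value of every signed integer type the
// implementation provides, so it can never be narrower than long long.
static_assert(sizeof(std::intmax_t) >= sizeof(long long),
              "intmax_t must be able to hold every long long value");

// Hypothetical helper: value bits plus the sign bit of the widest type.
inline int widest_signed_bits() {
    return std::numeric_limits<std::intmax_t>::digits + 1;
}

// Hypothetical helper: true on implementations where long long happens
// to be as wide as the widest type. Portable code should not assume it.
inline bool long_long_is_widest() {
    return sizeof(long long) == sizeof(std::intmax_t);
}

// long long has at least 64 bits, so the widest type does too.
inline void check_widths() {
    assert(widest_signed_bits() >= 64);
}
```

Code that really needs "the biggest integer available" should be
written in terms of intmax_t, not long long.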
Those implementations you are mentioning are compiling
programs for OSes that do provide Unicode?
What does "provide Unicode" mean? I use Unicode under Solaris.
Sun CC generates some other encoding for wide string literals,
and G++ only allows basic ASCII in them to begin with (otherwise:
"converting to execution character set: Illegal byte sequence").
For that matter, I get the same error with g++ under Linux.
Under Windows I suppose current VC++ implements wchar_t as
Unicode, and in my OS (Linux) I suppose wchar_t is Unicode
(haven't verified the last though).
Under Windows, wchar_t nominally is UTF-16. But of course, it's
really whatever the code interpreting it interprets it to be.
Under Linux, as far as I can tell, there is no nominal
encoding---it's whatever the program wants it to be. (The
difference, of course, is that Linux doesn't support any wchar_t
IO, so any wchar_t is purely internal to the program.)
A quick test on my machines showed that g++ doesn't support
UTF-32 (which would be the normal Unicode format for the 4 byte
wchar_t), at least in wide string literals, so I don't see how
you can say that it supports Unicode. I haven't tried things
like "toupper( L'\u00E9', std::locale() )", so I don't know
about those, but they're locale dependent anyway.
So with these new character types will we get Unicodes under
OSes that do not support Unicode?
Presumably. The problem isn't really OS support---most OSes are
encoding neutral for most functions. (With a few
exceptions---Posix/Linux pretty much requires that the native
narrow character encoding be a superset of ASCII. But in fact,
about the only place I think that this will be an issue is for
'/', and maybe a few other special separators.)
With the introduction of these new types, what will be the use
of wchar_t?
Support for legacy code. Support for the native 32 bit
encoding, which isn't Unicode under Solaris (nor, I think, most
other Posix systems). Support for whatever the implementation
wants---that's what it currently is.
Essentially I am talking about restricting the introduction of
new features in the new standard, only to the most essential
ones. I have the feeling that all these Unicodes will be
messy.
Well, I won't argue against you there. Having to deal with so
many different encodings and encoding formats is messy. The
problem is that the mess is there, outside of C++, and we have
to deal with it in one way or another.
Why are all these Unicode types needed?
To support all of the formats Unicode standardizes.
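
As a rough sketch of what that buys you, assuming the types and the
u"" and U"" literal prefixes end up the way they are proposed (which is
what C++11 later specified): the same two characters take a different
number of code units in each format. The function names are
hypothetical, for illustration only.

```cpp
#include <cassert>
#include <string>

// Two characters: U+00E9 (inside the BMP) and U+1D11E (outside it).
// Universal character names keep the source file pure ASCII.
inline std::u16string utf16_sample() { return u"\u00E9\U0001D11E"; }
inline std::u32string utf32_sample() { return U"\u00E9\U0001D11E"; }

inline void check_code_units() {
    // UTF-16: one unit for U+00E9, a surrogate pair for U+1D11E.
    assert(utf16_sample().size() == 3);
    assert(utf16_sample()[1] == 0xD834);  // high surrogate
    assert(utf16_sample()[2] == 0xDD1E);  // low surrogate

    // UTF-32: always one unit per code point.
    assert(utf32_sample().size() == 2);
    assert(utf32_sample()[1] == 0x1D11E);
}
```

Unlike wchar_t, neither the size nor the encoding of these literals
depends on the platform.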
After a new version of Unicode, will we have it introduced as
a new built-in type in the C++ standard?
If they introduce still more encoding formats, I suppose yes.
Somehow, I don't see that happening.
What will be the use of the old ones? What I am saying is that
we will be having a continuous accumulation of older built-in
character types.
We are repeating C's mistakes here, adding built in types
instead of providing them as libraries.
There's only so much you can do in a library. You can't make a
new integral type, which behaves like an integral type.
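
To make that concrete, here is a sketch, assuming char32_t as a
built-in type (as in C++11) and a hypothetical library-only
alternative that I have called LibChar32. The class can be given
whatever operators you like, but the language still does not treat it
as an integral type: the type traits say no, and it cannot be switched
on directly.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical library-only character type: just a wrapped integer.
struct LibChar32 {
    std::uint_least32_t value;
};

static_assert(std::is_integral<char32_t>::value,
              "the built-in type is an integral type");
static_assert(!std::is_integral<LibChar32>::value,
              "a class type never is, whatever operators it defines");

// The built-in type works directly in a switch, with literals as
// case labels.
inline int classify(char32_t c) {
    switch (c) {
        case U'\u00E9': return 1;
        case U'A':      return 2;
        default:        return 0;
    }
}

// It also takes part in the usual arithmetic conversions.
inline char32_t next_code(char32_t c) {
    return static_cast<char32_t>(c + 1);
}
```

With LibChar32 you would have to switch on c.value, which just moves
the problem back to a built-in integer type.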
Of course, I'm not sure that that's really what is needed for
the Unicode types. Do you really want to be able to increment a
character (as opposed to a small integral value)? But again,
character types are integral types in C, C++ wants to be
compatible with C with regards to the integral types, and C
won't use a library for the basic type here. (I'm not sure, but
I believe that char32_t and char16_t also originate in a TR for
C.)
For the other integral types: the language wants to support what
the hardware supports.