Tim Bray
Daniel said: I second that. I see a lot of people asking for "transparent" Unicode
support but I don't see how that is possible. To me it's like asking for
a language that has transparent bug recovery. I know that ruby has
weaknesses when it comes to multibyte encodings, but the main problem is
human in nature; too many people assume that char==byte, which results
in bugs when someone unexpectedly uses "weird" characters. IMHO no
amount of "transparent support" will change that. But I would love to be
shown otherwise with examples of languages that "do it right".

It can be done. Java gets it almost right, and in such a way that most
people will never stub their toes on the flaws. Python, it seems, is
going to get it right next time around. It's clearly possible to do
Unicode correctly. What Matz wants is much harder: a String type that
can contain strings of characters from arbitrary character sets in
arbitrary encodings, Unicode being just one special case, and also serve
as a byte buffer.
-Tim
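
To make the char==byte point concrete, here is a minimal sketch. It assumes a Ruby version whose strings carry an encoding (1.9 or later), which is not necessarily the Ruby being discussed in this thread. The variable names are only illustrative. It shows the same text counted as characters and as bytes, re-encoded into a non-Unicode character set, and treated as a plain byte buffer.

```ruby
# Assumes Ruby 1.9+; "\u" escapes always produce a UTF-8 string.
s = "na\u00EFve"                 # "naive" with a diaeresis on the i

char_count = s.length            # counts characters
byte_count = s.bytesize          # counts bytes; the accented i takes 2 in UTF-8

# The same characters in a different character set and encoding.
latin1 = s.encode("ISO-8859-1")
latin1_bytes = latin1.bytesize   # one byte per character in Latin-1

# The same bytes viewed as a raw byte buffer, with no character semantics.
raw = s.dup.force_encoding(Encoding::BINARY)
raw_length = raw.length          # now "length" counts bytes

puts "chars=#{char_count} utf8_bytes=#{byte_count} " \
     "latin1_bytes=#{latin1_bytes} raw_length=#{raw_length}"
```

Code that indexes or truncates by byte count works on the Latin-1 copy by accident and breaks on the UTF-8 one, which is roughly the class of bug Daniel describes.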