state of unicode support

Tim Bray

Daniel said:
I second that. I see a lot of people asking for "transparent" Unicode
support, but I don't see how that is possible. To me it's like asking for
a language that has transparent bug recovery. I know that Ruby has
weaknesses when it comes to multibyte encodings, but the main problem is
human in nature: too many people assume that char==byte, which results
in bugs when someone unexpectedly uses "weird" characters. IMHO no
amount of "transparent support" will change that. But I would love to be
shown otherwise with examples of languages that "do it right".

It can be done. Java gets it almost right, and in such a way that most
people will never stub their toes on the flaws. Python, it seems, is
going to get it right next time around. It's clearly possible to do
Unicode correctly. What Matz wants is much harder: a String type that
can contain strings of characters from arbitrary character sets in
arbitrary encodings, with Unicode being just one special case, and that
can also serve as a byte buffer.

-Tim
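
To make the char==byte trap concrete, here is a minimal sketch. It assumes Ruby 1.9.3 or later, where every String carries its own encoding; the sample text and variable names are made up for illustration and do not come from the thread.

```ruby
# Hypothetical sample text: "naive cafe" with two accented letters,
# written with \u escapes so the literal is UTF-8 regardless of the
# encoding of this source file.
s = "na\u00EFve caf\u00E9"

char_count = s.length    # number of characters
byte_count = s.bytesize  # number of bytes in the UTF-8 representation
puts "#{char_count} characters, #{byte_count} bytes"

# Slicing by bytes can cut a multibyte character in half, which is the
# kind of bug that appears when code assumes one byte per character.
broken = s.byteslice(0, 3)
puts "byte slice is valid text? #{broken.valid_encoding?}"

# The same bytes viewed as a plain byte buffer: with a binary encoding
# tag, length counts bytes rather than characters.
raw = s.dup.force_encoding(Encoding::BINARY)
puts "as a byte buffer: #{raw.length} units"
```

The two counts differ because each accented letter takes two bytes in UTF-8, and the last part shows one String class acting both as text and as a byte buffer, depending on its encoding tag.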
 
