state of unicode support

Tim Bray · Aug 1, 2006

Daniel said:
I second that. I see a lot of people asking for "transparent" Unicode
support, but I don't see how that is possible. To me it's like asking for
a language that has transparent bug recovery. I know that Ruby has
weaknesses when it comes to multibyte encodings, but the main problem is
human in nature: too many people assume that char==byte, which results
in bugs when someone unexpectedly uses "weird" characters. IMHO no
amount of "transparent support" will change that. But I would love to be
shown otherwise with examples of languages that "do it right".

It can be done. Java gets it almost right, and in such a way that most
people will never stub their toes on the flaws. Python, it seems, is
going to get it right next time around. It's clearly possible to do
Unicode correctly. What Matz wants is much harder: a String type that
can contain strings of characters from arbitrary character sets in
arbitrary encodings, Unicode being just one special case, and also serve
as a byte buffer.
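
To make the char==byte trap concrete, here is a minimal sketch in Python 3, where text strings and byte buffers are separate types. The sample string and the variable names are my own illustration, not anything from Ruby or from Matz's design. It only shows that the character count and the byte count diverge as soon as non-ASCII characters appear, and that cutting a byte buffer at an arbitrary offset can split a character.

```python
# Characters (code points) versus bytes: a small illustration.
# The sample text is an assumption chosen for the demo; the two
# non-ASCII letters are written as escapes to avoid any ambiguity
# about the encoding of this source file.
text = "na\u00efve caf\u00e9"

utf8 = text.encode("utf-8")        # variable-width: 1 to 4 bytes per character
utf16 = text.encode("utf-16-le")   # 2 bytes per character for this text

char_count = len(text)    # counts characters (code points)
utf8_len = len(utf8)      # counts bytes
utf16_len = len(utf16)    # counts bytes

# Treating the byte buffer as if it were a string of characters:
# cutting after 3 bytes lands in the middle of the two-byte
# sequence that encodes the third character.
truncated = utf8[:3]
try:
    truncated.decode("utf-8")
    split_ok = True
except UnicodeDecodeError:
    split_ok = False

print(char_count, utf8_len, utf16_len, split_ok)
```

The same text has one length in characters and a different length in each encoding, which is why a single type that must be both "string of characters in any encoding" and "byte buffer" is so much harder to get right.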

-Tim

