sln
I strongly disagree. Unicode has its weak points, but it is still
incomparably better than any scheme a Joe Sixpack would invent
herself... Witness the disaster with Emacs internationalization.
Just:

  - the existence of the notion of a "Unicode character",
  - the possibility of specifying a character unambiguously (with some
    minor hair-splitting needed sometimes, as in o-trema vs o-umlaut, or
    in CJK), and
  - having a list of "property" *names* (which is, basically, the
    information about how other people look at individual characters)

should be, IMO, an enormous help in the design of what you call
"manipulations". And I did not even touch "tables", i.e., the *values*
of these properties: compiling them is a major work in itself...
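
To make the three points above concrete, here is a minimal sketch in Python (my choice of language, not something from the thread), using only the standard `unicodedata` module. It looks up a few property values by character and shows the o-with-two-dots hair-splitting: the precomposed and the decomposed spellings are different strings until they are normalized. The helper name `describe` is made up for illustration.

```python
import unicodedata

# Two spellings of "o with two dots": one precomposed character,
# or a base letter followed by a combining mark.
precomposed = "\u00f6"    # LATIN SMALL LETTER O WITH DIAERESIS
decomposed = "o\u0308"    # "o" + COMBINING DIAERESIS

def describe(ch):
    """Return (name, general category, combining class) for one character."""
    return (unicodedata.name(ch),
            unicodedata.category(ch),
            unicodedata.combining(ch))

# Property values for each character in both spellings.
info = [describe(ch) for ch in precomposed + decomposed]

# The spellings differ as raw strings but agree after NFC normalization.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == precomposed
```

The property *names* (name, general category, combining class) are the shared vocabulary; the values come from the tables shipped with the Python build, so they follow whatever Unicode version that build carries.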
Yours,
Ilya
Unicode is a nightmare. Encoding 1-6 bytes (or more) to represent the
whole range of possible multiple code rendering(s) of character(s) of all
the languages in the world is just out of control.
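
For reference, a small Python sketch of how many bytes UTF-8 actually spends per character. The "1-6 bytes" figure matches the older UTF-8 definition; as far as I know, the current definition (RFC 3629) stops at 4 bytes because code points end at U+10FFFF. The sample characters are arbitrary picks, one per length class.

```python
# One sample character per UTF-8 length class.
samples = ["A", "\u00f6", "\u4e2d", "\U0001F600"]

# Number of bytes each character occupies when encoded as UTF-8.
utf8_lengths = [len(ch.encode("utf-8")) for ch in samples]

# The highest valid code point still fits in 4 bytes.
max_length = len(chr(0x10FFFF).encode("utf-8"))
```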
Internal data manipulation is a nightmare, a hog, and slow as hell.
Is it a byte, a word, int or more? 0 .. (2**32-1) or more! Optimizations?
Encoding/decoding, back and forth. Just a nightmare. And what is it, what
is the encoding of that? Dunno, take a guess! "L,that sucks man!";
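
A hedged Python sketch of the two complaints above: what the internal unit is, and what happens when the encoding has to be guessed. It assumes the usual discipline of decoding once at the input boundary and working on code points afterwards; the sample word is arbitrary.

```python
text = "na\u00efve"                 # 5 characters, one of them non-ASCII

# Internally each character is one code point, an integer in 0..0x10FFFF.
code_points = [ord(ch) for ch in text]

# At the boundary the text becomes bytes in some declared encoding.
raw = text.encode("utf-8")          # 6 bytes: the i-diaeresis takes two

# Decoding with the declared encoding round-trips cleanly.
right = raw.decode("utf-8")

# Guessing the wrong encoding does not fail here; it silently
# produces garbage (two characters where there was one).
wrong = raw.decode("latin-1")
```

The guessing problem is real, but it is a property of unlabeled byte streams rather than of Unicode itself: the same bytes would be just as ambiguous between any two legacy 8-bit code pages.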
Unicode, the expression of everything that does nothing (good).
-sln