Hi,
As you suggested I read the article:
http://www.joelonsoftware.com/articles/Unicode.html
I didn't find anything new. It's just explaining character sets in a
rather non-specific way. ASCII uses 7 bits, so it can store 128
characters, so it can't store all the characters in the world, so
other character sets are needed (really? I would never have guessed
that). UTF-16 stores most characters in 2 bytes (so more of the
world's characters fit), while UTF-8 is variable-width: it uses 1 byte
for characters up to 127, and 2 to 4 bytes beyond that (the original
spec allowed up to 6).
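Just to be concrete about what I mean, here's a quick Ruby sketch of my own (the string is just an example, nothing from the article) showing how the byte layouts differ:

```ruby
# The same ASCII-only string encoded two ways.
s = "fooobar"

utf8  = s.encode("UTF-8")
utf16 = s.encode("UTF-16LE")

puts utf8.bytes.inspect   # one byte per ASCII character
puts utf16.bytes.inspect  # two bytes per character; every other byte is 0x00
puts utf8.bytesize        # 7
puts utf16.bytesize       # 14
```

Same seven characters, twice the bytes in UTF-16, with a 0x00 riding along next to every ASCII letter.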
So what exactly am I looking for here?
What is there to know about Unicode? There are a couple of character
sets, use UTF-8, and remember that one character != one byte. Is there
anything else for practical purposes?
I'm sorry if I'm being rude, but I really don't like it when people
tell me to read stuff I already know.
My question is still there:
Let's say I want to rename a file "fooobar" and remove the third "o",
but the name is encoded in UTF-16 while Ruby only supports UTF-8. In
UTF-16 each ASCII letter is 2 bytes, one of them 0x00, so if I remove
just the "o" byte there will of course still be a stray 0x00 left in
there. That's assuming the string is recognized at all.
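Here's a rough Ruby sketch of the failure I'm describing (the file name and the indices are just my example):

```ruby
# UTF-16LE name treated as raw bytes, the way a UTF-8-only tool might see it.
raw = "fooobar".encode("UTF-16LE").dup.force_encoding("BINARY")

# Characters are 2 bytes each in UTF-16LE, so the third "o"
# (character index 3) starts at byte index 6.
raw.slice!(6, 1)          # removes only the "o" byte, not its 0x00 pair
puts raw.bytes.inspect    # a stray 0x00 now sits mid-string

# Encoding-aware deletion removes the whole 2-byte character instead:
fixed = "fooobar".encode("UTF-16LE")
fixed.slice!(3, 1)        # character index, not byte index
puts fixed.encode("UTF-8")  # => "foobar"
```

The byte-wise version leaves 13 bytes with a lone 0x00 in the middle, which is exactly the corruption I'm asking about.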
Why is there no issue with UTF-16 if only UTF-8 is supported?
I don't mind reading some more if I can actually find the answer.
Best regards.