SpringFlowers AutumnMoon said:
ah it is not really about using UTF-8 in my program file... it is about
getting UTF-8 file listing on Vista and XP.
Oh. When you said:
to make some file have
international characters, it is really simple: can go to Google News
and look at news from China or Taiwan or Hong Kong, and then copy and
paste the text into a filename on Windows XP or Vista.
...I misunderstood and thought you meant pasting the characters into
your ruby source file. (I see now you were talking about a filename.)
Well OK - so I built the latest from the ruby 1.9.1 branch in subversion,
and attempted to have ruby read a directory containing a filename
with Chinese characters, and then open and read the contents of the
file...
My script was:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (win32_unicode.rb) ~~~
# encoding: UTF-8
files = Dir["T:/zz/*.txt"]
x = files.first
p x, x.encoding
dat = open(x, "r:UTF-8") {|f| f.read}
p dat, dat.encoding
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The result was:
ruby19 win32_unicode.rb
"T:/zz/???????.txt"
#<Encoding:UTF-8>
win32_unicode.rb:8:in `initialize': Invalid argument - T:/zz/???????.txt (Errno::EINVAL)
from win32_unicode.rb:8:in `open'
from win32_unicode.rb:8:in `<main>'
I also tried with the -E UTF-8 flag and with the -U flag:
ruby19 -E UTF-8 win32_unicode.rb
"T:/zz/???????.txt"
#<Encoding:UTF-8>
win32_unicode.rb:8:in `initialize': Invalid argument - T:/zz/???????.txt (Errno::EINVAL)
from win32_unicode.rb:8:in `open'
from win32_unicode.rb:8:in `<main>'
ruby19 -U win32_unicode.rb
"T:/zz/???????.txt"
#<Encoding:UTF-8>
win32_unicode.rb:8:in `initialize': Invalid argument - T:/zz/???????.txt (Errno::EINVAL)
from win32_unicode.rb:8:in `open'
from win32_unicode.rb:8:in `<main>'
ruby19 -v
ruby 1.9.1p0 (2009-03-04) [i386-mswin32_71]
Note that it doesn't bother me that the filename displays as ???????.txt in the
command window. The issue is rather that ruby seems unable to open
a filename it just obtained via Dir[].
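One way to tell a display problem from a damaged string would be to look at the raw bytes of what Dir[] returned. Below is a minimal sketch, assuming (I have not confirmed this) that the question marks might be literal 0x3F bytes substituted somewhere before the name reached the script. The helpers lossy_name? and hex_bytes are hypothetical names made up for this check, not part of win32_unicode.rb:

```ruby
# Hypothetical diagnostic helpers, not part of win32_unicode.rb.
# "?" is not a legal character in a Windows filename, so a literal 0x3F byte
# in a name returned by Dir[] suggests the real characters were replaced
# before the string reached the script.
def lossy_name?(path)
  path.bytes.include?(0x3F)
end

# Render the raw bytes as hex so they can be inspected in any console,
# regardless of which characters the console is able to display.
def hex_bytes(path)
  path.each_byte.map { |b| "%02X" % b }.join(" ")
end

name = "T:/zz/???????.txt"  # the string as it was printed in the output above
puts hex_bytes(name)
puts lossy_name?(name)
```

Running the same two calls on x in the script would show whether the name is already damaged before open is called, or whether it holds valid UTF-8 bytes and the failure happens later, inside open.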
So, unless I have bungled my test somehow, it seems likely there is a
problem.
If so, as Ryan pointed out, we should move this to the ruby-core list.
Regards,
Bill