Unicode in Ruby and a Ruby Reference

Mike McGavin

Hi everyone.

I'm making my second big attempt at getting into Ruby, and I have a
couple of questions. I hope they don't sound too trivial.

1. I was wondering what the state is of Ruby and support for Unicode?
For instance, I'm coming mostly from Python which has a special
Unicode type that can be translated to various encodings on request.
I can't seem to find anything similar in Ruby. Does it exist
anywhere, or is it standard to deal with Unicode in a completely
different way, or is it something that hasn't been developed at this
point?

2. What are the most definitive references for Ruby and the standard
libraries that are available? I've found the reference at RubyCentral
to be very helpful (http://www.rubycentral.com/ref/), but it also
seems to be missing things here and there. On the other hand, it's
possible that I'm completely mis-reading it.

For instance, I found out about the Singleton module completely by
chance while reading this group. It certainly appears to work in my
Ruby 1.8.1 interpreter, but I can't seem to find it formally described
anywhere. I know what I've seen of it, but I don't know what else it
might have to it. I also wonder about all of the other things I could
be missing out on.


Thanks in advance for the help. I really like Ruby as a language and
I hope I'll be able to use it for some things later on. I'm just
interested to find out if these things are still in early stages of
development, or if I'm simply missing things.

Thanks.
Mike.
 
Yukihiro Matsumoto

Hi,

In message "Re: Unicode in Ruby and a Ruby Reference"

|1. I was wondering what the state is of Ruby and support for Unicode?
| For instance, I'm coming mostly from Python which has a special
|Unicode type that can be translated to various encodings on request.
|I can't seem to find anything similar in Ruby. Does it exist
|anywhere, or is it standard to deal with Unicode in a completely
|different way, or is it something that hasn't been developed at this
|point?

Handling Unicode (UTF-8) is OK. Ruby's strings can contain any
sequence of bytes. The regex engine is aware of UTF-8, so you can
use pattern matching against Unicode characters. For encoding
conversion, the iconv library is your friend.

This is weaker than Python, but it does most of the job. We are working
on M17N Ruby (M17N stands for multilingualization), in which you can
handle many encodings (e.g. UTF-8, UTF-16, Big5, GBK, and many more)
without conversion.

matz.
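
A minimal sketch of the points in this reply, assuming Ruby 1.9 or later. There the M17N work shipped as per-string encodings, and String#encode covers the conversion role that iconv played in 1.8 (iconv left the standard library in Ruby 2.0). The constant SAMPLE and the helper convert are hypothetical names used only for illustration.

```ruby
# Sketch only: assumes Ruby 1.9+; String#encode stands in for iconv.

# "gr" + u-umlaut + sharp s, written with escapes so this file stays ASCII.
SAMPLE = "gr\u00FC\u00DF"

# Hypothetical helper: convert a string to the named target encoding.
def convert(text, target)
  text.encode(target)
end

latin1 = convert(SAMPLE, "ISO-8859-1")

puts SAMPLE.encoding            # the encoding tag carried by the string
puts SAMPLE.length              # number of characters
puts SAMPLE.bytesize            # number of bytes in UTF-8
puts latin1.bytesize            # number of bytes in ISO-8859-1
puts SAMPLE[/\u00FC./].inspect  # the regex matches whole characters, not bytes
```

The character count and the byte count differ for the UTF-8 string because each of the two non-ASCII characters occupies two bytes, while the ISO-8859-1 copy uses one byte per character.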
 
Gavin Sinclair

> 2. What are the most definitive references for Ruby and the standard
> libraries that are available? I've found the reference at RubyCentral
> to be very helpful (http://www.rubycentral.com/ref/), but it also
> seems to be missing things here and there. On the other hand, it's
> possible that I'm completely mis-reading it.

What you are reading online is "Programming Ruby, 1ed", a book by Dave
Thomas and Andy Hunt. The second edition hit the shelves recently but
there's no online version. It's a purchase you won't regret, and it
describes all the standard libraries by example, and all the builtin
classes in detail (up to date with the latest Ruby).

Information about the standard library is also housed at

http://ruby-doc.org/stdlib

Cheers,
Gavin
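
For the Singleton module mentioned in the original question, here is a minimal sketch of how it is typically used. It ships in the standard library and is loaded with require "singleton". The class name AppConfig and its name attribute are hypothetical, chosen only for illustration.

```ruby
require "singleton"

# Hypothetical class: including Singleton makes .new private and
# adds a class-level .instance method that always returns the same object.
class AppConfig
  include Singleton

  attr_accessor :name
end

AppConfig.instance.name = "demo"
puts AppConfig.instance.name                          # same object, so the value persists
puts AppConfig.instance.equal?(AppConfig.instance)    # identity check

begin
  AppConfig.new
rescue NoMethodError => e
  puts "new is private: #{e.class}"
end
```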
 
Florian Gross

Yukihiro said:
> In message "Re: Unicode in Ruby and a Ruby Reference"
>
> |1. I was wondering what the state is of Ruby and support for Unicode?
> | For instance, I'm coming mostly from Python which has a special
> |Unicode type that can be translated to various encodings on request.
> |I can't seem to find anything similar in Ruby. Does it exist
> |anywhere, or is it standard to deal with Unicode in a completely
> |different way, or is it something that hasn't been developed at this
> |point?
>
> Handling Unicode (UTF-8) is OK. Ruby's strings can contain any
> sequence of bytes. The regex engine is aware of UTF-8, so you can
> use pattern matching against Unicode characters. For encoding
> conversion, the iconv library is your friend.

However, I think this awareness is limited to where a code point begins
and ends. This might have changed with Oniguruma, but "Ä"[/ä/i] used to
return nil.
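
A quick way to check this case on a given interpreter, as a sketch assuming Ruby 1.9 or later with UTF-8 strings. The constant and method names are hypothetical, and the characters are written as \u escapes so the file itself stays ASCII.

```ruby
# Sketch only: assumes Ruby 1.9+ and UTF-8 strings.

A_UMLAUT_UPPER = "\u00C4"  # capital A with diaeresis

# Return the part of +text+ that matches a lowercase a-umlaut,
# ignoring case, or nil when nothing matches.
def match_a_umlaut(text)
  text[/\u00E4/i]
end

p match_a_umlaut(A_UMLAUT_UPPER)  # non-nil when the engine folds case for non-ASCII letters
p match_a_umlaut("A")             # a plain ASCII "A" is a different letter
```

A nil result for the first call would indicate the behaviour described above, where the engine knows character boundaries but not case mappings.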
 
Giulio Piancastelli

How can a literal Unicode character be inserted into a Ruby String? I
recall Java having the \uNNNN escape, for example, but I wasn't able
to find a similar mechanism for Ruby. (On the other hand, I'm aware of
the escapes for octal and hex character codes, e.g. \NNN and \xNN.)
 
Yukihiro Matsumoto

Hi,

In message "Re: Unicode in Ruby and a Ruby Reference"

|However I think that this awareness is just where a code point begins
|and ends. This might have changed with Oniguruma, but "Ä"[/ä/i] used to
|return nil.

Oniguruma should be aware of it, although I found a bug there.
I will fix it soon. Thank you.

matz.
 
Austin Ziegler

> How can a literal Unicode character be inserted into a Ruby String? I
> recall Java having the \uNNNN escape, for example, but I wasn't able
> to find a similar mechanism for Ruby. (On the other hand, I'm aware of
> the escapes for octal and hex character codes, e.g. \NNN and \xNN.)

\u4321 is a UTF-16 code unit (the code point U+4321), so you would need
to know the equivalent UTF-8 encoding, e.g., \xe4\x8c\xa1.

-austin
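
The UTF-8 bytes for a code point do not have to be worked out by hand. As a sketch, Array#pack with the "U" directive encodes a code point as UTF-8, and the bytes can then be printed as \xNN escapes. The helper name utf8_escapes is hypothetical.

```ruby
# Sketch only: Array#pack with "U" encodes a code point as UTF-8.

# Hypothetical helper: spell out the UTF-8 bytes of a code point
# as \xNN escapes, ready to paste into a string literal.
def utf8_escapes(code_point)
  [code_point].pack("U").unpack("C*").map { |byte| format("\\x%02x", byte) }.join
end

puts utf8_escapes(0x4321)  # prints \xe4\x8c\xa1

# On Ruby 1.9 and later the \uNNNN escape is also available directly
# and produces the same three bytes.
puts "\u4321".unpack("C*") == [0xe4, 0x8c, 0xa1]  # prints true
```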
 
Mohammad Khan

> What you are reading online is "Programming Ruby, 1ed", a book by Dave
> Thomas and Andy Hunt. The second edition hit the shelves recently but
> there's no online version. It's a purchase you won't regret, and it
> describes all the standard libraries by example, and all the builtin
> classes in detail (up to date with the latest Ruby).
>
> Information about the standard library is also housed at
>
> http://ruby-doc.org/stdlib
>
> Cheers,
> Gavin

You can also buy the PDF version of this book from:
http://pragmaticprogrammer.com/shopsite_sc/store/html/index.html
I think it costs $25.00.

Thanks,
Mohammad
 
Mike McGavin

Hi again.

> I'm making my second big attempt at getting into Ruby, and I have a
> couple of questions.
> [--snip--]

I just wanted to say thanks for all of the feedback from everyone
following my questions about the Ruby reference documentation and the
unicode questions. It's been very helpful, and I'll continue to
monitor the thread.

Thanks.
Mike.
 
Alexander Kellett

just fyi:
http://www.rubycentral.com/book/lib_patterns.html
the patterns and standard lib sections contain some pretty nifty stuff.
also. "ri" totally rocks :)
Alex

 
