Gavin Sinclair
There has been some discussion on this topic in the thread "deciding
between ruby and python". It was centered around how far #join should
go in getting what it wants (a String) from an object. I detected
some misunderstanding of the intent of #to_s and #to_str, so I hope
the following is helpful (and more importantly, correct).

#to_s is for providing a String *representation* when the caller
explicitly wants a string, e.g. for rendering purposes. For example,
Integer#to_s gives you a nice string representation of an integer,
which is pretty straightforward.
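To make the "explicit" part concrete, here is a small sketch using only core Ruby. The caller asks for a string either by calling #to_s directly or by interpolating, which calls #to_s behind the scenes. The variable names are just for illustration.

```ruby
n = 42

explicit     = n.to_s          # direct, explicit request for a String
interpolated = "answer: #{n}"  # interpolation calls #to_s on n for us
binary       = n.to_s(2)       # Integer#to_s also accepts a radix

puts explicit
puts interpolated
puts binary
```

The radix argument is a first hint that even an integer has more than one reasonable string representation.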
#to_str is for providing a String *version* when the caller
*implicitly* wants a string, e.g. when the object is passed as a
parameter. The motivation here is typically for the class to fit in
with the standard Ruby classes. For example, the Pathname class (in
stdlib) represents a path to a file and has all sorts of funky path
manipulation capabilities. But since a path **is essentially a
String** it makes sense to provide Pathname#to_str so the following
lines work:

  File.open(pathname)
  message = "Can't find directory: " + pathname

Naturally, Pathname will also provide #to_s, and it will probably be
the same as #to_str.
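Here is a rough sketch of the same idea with a home-made class. DirPath and LabelOnly are hypothetical names I made up for illustration (they are not Pathname), and the path is invented too. The point is only that String#+ will accept an object that offers #to_str, and will refuse one that offers only #to_s.

```ruby
# DirPath is a hypothetical path-like class that "is essentially a String".
class DirPath
  def initialize(path)
    @path = path
  end

  # Explicit conversion: a String representation for display.
  def to_s
    @path
  end

  # Implicit conversion: lets core methods treat this object as a String.
  alias_method :to_str, :to_s
end

# LabelOnly is a hypothetical class with a representation but no String version.
class LabelOnly
  def to_s
    "label"
  end
end

dir = DirPath.new("/tmp/missing")

# String#+ wants a String, so it calls #to_str on dir behind the scenes.
message = "Can't find directory: " + dir
puts message

# Without #to_str the same expression is rejected.
begin
  "Can't find: " + LabelOnly.new
  rejected = false
rescue TypeError
  rejected = true
end
puts rejected
```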
However, Integer does not provide #to_str, because despite there being
a String *representation* of an integer, there is no String *version*
of an integer. If Integer#to_str were provided, then the following
would "work":

  "1" + 7 # "17" (yuck!)

And if you *really* want Perl, implement String#coerce so you can do

  1 + "7" # 8 (YUCK!!!)
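For what it's worth, here is a quick sketch of how stock Ruby behaves, with no monkey-patching. The error_class helper is something I made up for the example; it just reports which exception a block raises.

```ruby
# Hypothetical helper: returns the class of the exception the block raises,
# or nil if the block finishes normally.
def error_class
  yield
  nil
rescue StandardError => e
  e.class
end

# String#+ looks for an implicit String version of 7 and finds none.
p error_class { "1" + 7 }

# Integer#+ has no way to treat "7" as a number either.
p error_class { 1 + "7" }

# Saying what you mean with the explicit conversions works fine.
joined = "1" + 7.to_s   # string concatenation
summed = 1 + "7".to_i   # integer addition
puts joined
puts summed
```

Both mixed expressions fail loudly, and the caller has to choose between concatenation and addition explicitly.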
A relevant point should be made here about #to_s. It provides *a*
string representation of an object. There is no such thing as *the*
string representation of an object. Even the humble integer, whose
string rep is pretty damn obvious, is up for discussion. Do we want

  -14.to_s # "-14"

or

  -14.to_s # "(14)"

or even

  -14.to_s # '<span class="negative-number">(14)</span>'

?
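One way to serve all three contexts, sketched below, is to leave Integer#to_s alone and give the other renderings their own names. The NumberFormats module and its method names are hypothetical, invented for this example; nothing like them ships with Ruby.

```ruby
# Hypothetical formatting helpers; none of these names come from Ruby itself.
module NumberFormats
  module_function

  # Accounting style: negatives are shown in parentheses.
  def accounting(n)
    n.negative? ? "(#{n.abs})" : n.to_s
  end

  # HTML style: negatives are wrapped in a span so a stylesheet can mark them.
  def html(n)
    return n.to_s unless n.negative?

    %(<span class="negative-number">#{accounting(n)}</span>)
  end
end

plain = -14.to_s
puts plain
puts NumberFormats.accounting(-14)
puts NumberFormats.html(-14)
```

Each caller then picks the representation that suits its context, and #to_s stays the boring, obvious one.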
Because of these different demands in different contexts, class
authors should be wary of providing a #to_s implementation that is
not "obvious" and clearly desirable in a majority of cases.

I remember redefining #to_s at an object level, not a class level,
because its output was too long. It was conceptually correct in what
it was trying to provide, but for my purposes (unit testing) it was
simply too long.
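Redefining #to_s on a single object looks roughly like this. Report is a hypothetical class standing in for whatever had the long output; it is not the class I was actually testing.

```ruby
# Report is a hypothetical class whose #to_s is correct but verbose.
class Report
  def initialize(rows)
    @rows = rows
  end

  # One line of output per row.
  def to_s
    @rows.map { |row| row.join(", ") }.join("\n")
  end
end

verbose = Report.new([[1, 2, 3]] * 500)
quiet   = Report.new([[1, 2, 3]] * 500)

# Singleton method: redefines #to_s on this one object only.
def quiet.to_s
  "#<Report #{@rows.size} rows>"
end

puts quiet.to_s
puts verbose.to_s.lines.count
```

The class itself is untouched, so every other Report still prints in full.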
REXML could do with some work on its #to_s methods. Take a large
XML document, select a (small) node within it and call #to_s, and go
make yourself a coffee while it prints reams of stuff to the screen.

Cheers,
Gavin