Why doesn't join() call str() on its arguments?

Michael Hoffman

Nick said:
Why not have another method to do this? I propose joinany which will
join any type of object together, not just strings

I think that's what Fredrik was proposing. Except that it would be
called join and be a built-in (not a str method).
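
A minimal sketch of what such a helper might look like, written for Python 3. The name join_any and its signature are hypothetical, chosen only for illustration; nothing in the thread specifies them and no such function exists in the standard library.

```python
def join_any(sep, items):
    """Join arbitrary objects by converting each one with str() first.

    Hypothetical helper for illustration only; not part of the
    standard library.
    """
    return sep.join(str(item) for item in items)


print(join_any(",", [1, 2, 3, "four"]))  # prints 1,2,3,four
```

Because the conversion lives in a separately named function, str.join() itself keeps raising TypeError on non-strings.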
 
Michael Hoffman

Max said:
'1,2,3,four'

Is this really that hard to do, that you want it in the library?

I think it's a sufficiently common use case that having to do that
is a wart.
 
Nick Coghlan

Leo said:
All I've been able to find is a 1999 python-dev post by Tim
Peters which would seem to indicate he doesn't understand it
either:

"string.join(seq) doesn't currently convert seq elements to
string type, and in my vision it would. At least three of us
admit to mapping str across seq anyway before calling
string.join, and I think it would be a nice convenience
[...]"

But now it's 2005, and both string.join() and str.join() still
explicitly expect a sequence of strings rather than a sequence of
stringifiable objects.

There's a more recent discussion than that, because I tried to change it shortly
after I offered a patch to fix a corner case for string subclasses.

This seems to be the last relevant message in the thread:
http://mail.python.org/pipermail/python-dev/2004-August/048516.html

So it was tried, but we found too many weird corner cases we weren't quite sure
what to do with. At that point, "explicit is better than implicit" kicked in :)

A shame, since it was both faster and more convenient than using a list comp.
But the convenience wasn't worth the ambiguity of the semantics.

Cheers,
Nick.

P.S. For anyone else that uses Firefox:

Linking the pydev keyword to
"http://www.google.com/search?q=site:mail.python.org+inurl:python-dev+%s"

and the pylist keyword to
"http://www.google.com/search?q=site:mail.python.org+inurl:python-list+%s"

makes searching the archives on python.org really easy. Of course, knowing what
you're looking for because you were a participant in the discussion helps, too ;)
 
Leif K-Brooks

Leo said:
What I can't find an explanation for is why str.join() doesn't
automatically call str() on its arguments

I don't really like that idea for the reasons others have stated. But a
related and (IMHO) more Pythonic idea would be to allow arbitrary
objects to be str.join()ed if they use __radd__ to allow concatenation
with strings. This would be consistent with how the + operator behaves:

Python 2.4 (#2, Jan 8 2005, 20:18:03)
[GCC 3.3.5 (Debian 1:3.3.5-5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> class Foo(object):
...     def __radd__(self, other):
...         if isinstance(other, basestring):
...             return other + str(self)
...     def __str__(self):
...         return 'Foo()'
...
>>> 'foo:' + Foo()
'foo:Foo()'
>>> ''.join(['foo', Foo()])
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: sequence item 1: expected string, Foo found
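
The same contrast can be sketched as a script for Python 3, assuming str in place of basestring (which no longer exists there). The added return of NotImplemented for non-string operands is an assumption about good practice, not part of the session above.

```python
class Foo:
    def __radd__(self, other):
        # Called for `other + Foo()` when `other` cannot add a Foo itself.
        if isinstance(other, str):
            return other + str(self)
        return NotImplemented

    def __str__(self):
        return "Foo()"


# The + operator falls back to Foo.__radd__.
print("foo:" + Foo())  # prints foo:Foo()

# join() never consults __radd__; it requires real strings.
join_error = None
try:
    "".join(["foo", Foo()])
except TypeError as exc:
    join_error = exc
    print("join failed:", exc)
```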
 
Nick Vargish

news.sydney.pipenetworks.com said:
Really ? Then why are you using python.

Try "import this" at a Python prompt. I didn't invent "Explicit is
better than implicit."

news.sydney.pipenetworks.com said:
Python or most dynamic languages are so great because of their
common sense towards the "implicit".

Python is not "most dynamic languages", and does not seem to
implicitly "cast" objects into other types. Python may be "dynamic",
but it's also "strongly typed", a feature I consider a benefit, though
you are of course free to disagree.

news.sydney.pipenetworks.com said:
c = "%s%s" % (a, b)
There is an implicit str(b) here.

Not if you read the docs, as another poster has pointed out.

news.sydney.pipenetworks.com said:
''.join(["string", 2]) to me is no different than the example above.

TypeError: sequence item 1: expected string, int found

Which pretty much supports my initial argument -- if a non-string got
into the list, something needs to be fixed, and it isn't the behavior
of the join() method!

Nick
 
news.sydney.pipenetworks.com

Fredrik said:
a certain "princess bride" quote would fit here, I think.

I'm not really familiar with it, can you enlighten please.

Fredrik said:
nope. it's explicit: %s means "convert using str()".

ok you got me there, although it must be bad practice compared to

c = "%s%d" % (a, b)

because this is much more explicit and will tell you if b is ever
anything other than an integer even though you may not care.

Fredrik said:
from the documentation:

%s String (converts any python object using str()).

news.sydney.pipenetworks.com said:
''.join(["string", 2]) to me is no different than the example above.

so where's the "%s" in your second example?

</F>

I'm not sure if this has been raised in the thread but I sure as heck
always convert my join arguments using str(). When does someone use
.join() and not want all arguments to be strings ? Any examples ?

Regards,

Huy
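
A small sketch of the %s versus %d distinction discussed above, assuming Python 3 and made-up values for a and b. One caveat it illustrates: %d rejects strings, but it accepts a float and truncates it, so it is not a strict "integers only" check.

```python
a, b = "count: ", 2

print("%s%s" % (a, b))  # %s converts b with str(): count: 2
print("%s%d" % (a, b))  # %d formats b as an integer: count: 2

# %d rejects a string argument...
try:
    "%s%d" % (a, "two")
except TypeError as exc:
    print("TypeError:", exc)

# ...but it accepts a float and silently truncates it.
print("%d" % 2.7)  # prints 2

# join() has no format code that requests conversion, so it raises.
try:
    "".join(["string", 2])
except TypeError as exc:
    print("TypeError:", exc)
```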
 
news.sydney.pipenetworks.com

Nick said:
Try "import this" at a Python prompt. I didn't invent "Explicit is
better than implicit."

Thanks for the pointer. Let's see how many zen points are for the OP's
idea vs against

Against
Explicit is better than implicit.
Special cases aren't special enough to break the rules.

On the wall
Errors should never pass silently.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
Namespaces are one honking great idea -- let's do more of those!

For
Beautiful is better than ugly.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Although practicality beats purity. Unless explicitly silenced.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.

Well this is clearly a goer ;-)
 
Nick Coghlan

news.sydney.pipenetworks.com said:
I'm not sure if this has been raised in the thread but I sure as heck
always convert my join arguments using str(). When does someone use
.join() and not want all arguments to be strings ? Any examples ?

When the list argument already contains only strings, conversion is redundant.

Anyway, automatic conversion of the argument list elements *was* tried around
August of last year, despite some concerns about it being too magical. However,
the interaction between string, unicode, subclasses of same, __str__, __repr__,
__unicode__ and everything else made it impossible to come up with behaviour
that was clearly 'better' than the status quo (every idea we considered ended up
resulting in quirky behaviour at some point), so things never progressed to a
formal patch.

The explicit use of map() or LC was kept as the least bad of the available options.
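
For reference, the explicit spellings mentioned here can be sketched as follows, assuming Python 3 and a made-up sample list. The generator-expression form is a third variant of the same idea.

```python
items = [1, 2.5, "three", None]

# Three equivalent spellings of the explicit conversion.
with_map = ", ".join(map(str, items))
with_listcomp = ", ".join([str(item) for item in items])
with_genexp = ", ".join(str(item) for item in items)

print(with_map)  # prints 1, 2.5, three, None
```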

Cheers,
Nick.
 
Sion Arrowsmith

Nick Vargish said:
If a non-string-type has managed to
get into my list-of-strings, then something has gone wrong and I would
like to know about this potential problem.

Thinking about where I use join(), I agree. If there's something
other than a string in my list, either I know about it and can
explicitly convert it ("Explicit is better than implicit.") or
it's an error, and "Errors should never pass silently."
 
Nick Coghlan

news.sydney.pipenetworks.com said:
Thanks for the pointer. Let's see how many zen points are for the OP's
idea vs against

Against
Explicit is better than implicit.
Special cases aren't special enough to break the rules.

On the wall
Errors should never pass silently.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
Namespaces are one honking great idea -- let's do more of those!

For
Beautiful is better than ugly.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Although practicality beats purity. Unless explicitly silenced.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.

From the point of view of someone who actually *tried* doing this to the
relevant method, while maintaining backward compatibility. . .

Against:

Explicit is better than implicit
Special cases aren't special enough to break the rules

In the face of ambiguity, refuse the temptation to guess
- there are plenty of ambiguous cases to handle
Simple is better than complex (Semantics & C code)
- and much complexity in dealing with the ambiguities
Errors should never pass silently. Unless explicitly silenced.
- and a bunch of them are likely to indicate errors. Maybe.
Readability counts (C code)
- This makes the implementation hard to read
If the implementation is hard to explain, it's a bad idea.
- or explain to anyone, too

For:

Beautiful is better than ugly.
- the explicit calls to str() aren't that clean.

There should be one-- and preferably only one --obvious way to do it.
- Currently listcomp, genexp, map, string interpolation, etc

IMO, the rest can be used to argue either side, or simply don't weigh heavily
either way.

The things that make it a real cow are the current behaviour of auto-promotion
to the Unicode version when there are any Unicode strings in the list, and the
existence of __str__ and __repr__ methods which actually return instances of
unicode rather than str.

Calling str() explicitly makes the different behaviours unsurprising. Trying to
do the same thing behind the scenes has the potential to make the method behave
*very* surprisingly depending on the objects involved.

Cheers,
Nick.
 
Duncan Booth

news.sydney.pipenetworks.com said:
I'm not sure if this has been raised in the thread but I sure as heck
always convert my join arguments using str(). When does someone use
.join() and not want all arguments to be strings ? Any examples ?

This has already been raised, but maybe not in exactly this context. You
don't want to convert your arguments to be strings if some of them are
unicode.

If I do:

res = str.join(', ', [a, b, c])

then res is of type str if, and only if, a, b, and c are of type str.
If any of a, b, or c are of type unicode, then res is unicode.

In effect, this means you don't have to worry about whether you are
manipulating str or unicode at this point in your program (kind of
comparable to not caring whether the integer you are using is int or long).

When you come to output the string you do need to care, as when it is
unicode you may have to encode it, but at least the internal manipulations
can ignore this possibility.

Of course you could just call unicode on everything but for simple
applications you might not want to handle unicode at all. That's not a
decision Python can make for you so it shouldn't guess.
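
The concern can be sketched by analogy in Python 3, where the old str/unicode pair has become bytes/str. This is not the Python 2 promotion behaviour described above: in Python 3, join() refuses to mix the two types rather than promoting. The sample values are made up.

```python
parts = ["abc", b"def"]

# join() refuses to guess how the bytes should be decoded.
try:
    ", ".join(parts)
except TypeError as exc:
    print("TypeError:", exc)

# A blanket str() would "work", but it yields the repr of the bytes object.
guessed = ", ".join(str(p) for p in parts)
print(guessed)  # prints abc, b'def'

# Decoding explicitly states the intended encoding.
explicit = ", ".join(
    p.decode("ascii") if isinstance(p, bytes) else p for p in parts
)
print(explicit)  # prints abc, def
```

An automatic str() inside join() would have produced the second result silently, which is the kind of guess the post argues against.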
 
Jeremy Bowers

news.sydney.pipenetworks.com said:
Thanks for the pointer. Let's see how many zen points are for the OP's
idea vs against

Along with the fact that I agree with Nick that you've seriously
miscounted (most of your "fors" are simply irrelevant and I think you
added them to bolster your point, at least I *hope* you don't think they
are all relevant... for instance if you really think "Flat is better than
nested" applies here, you don't understand what that one is saying...),
I'd point out that the Zen that can be comprehended by checking off items
in a list is not the true Zen.
 
Jeff Shannon

news.sydney.pipenetworks.com said:
I'm not really familiar with it, can you enlighten please.

(Taking a guess at which quote /F had in mind...)


Vezzini: "Inconceivable!"
Inigo: "You keep using that word. I do not think that it means
what you think it means."

Jeff Shannon
 
Terry Reedy

Nick Coghlan said:
This seems to be the last relevant message in the thread:
http://mail.python.org/pipermail/python-dev/2004-August/048516.html

So it was tried, but we found too many weird corner cases we weren't quite sure
what to do with. At that point, "explicit is better than implicit" kicked in :)

A shame, since it was both faster and more convenient than using a list comp.
But the convenience wasn't worth the ambiguity of the semantics.

Your experience, where you made an honest go of implementation, and the
error catching argument, convince me that explicit is better for this case.

I was thinking, 'Well, print autoconverts' (as documented). But it
requires that each item be listed. And if there is any ambiguity, no harm
since it is only meant for quick convenience anyway and not exact
char-by-char control. Join, on the other hand, often gets a pre-existing
list. If that is known (thought to be) all strings, then mapping str is a
waste as well as an error mask. If not, map(str,...) is only 9 extra
chars.

Terry J. Reedy
 
Dave Benjamin

Jeremy said:
I'd point out that the Zen that can be comprehended by checking off items
in a list is not the true Zen.

The Zen that can be imported is not the eternal Zen. =)
 
news.sydney.pipenetworks.com

Duncan said:
news.sydney.pipenetworks.com wrote:

I'm not sure if this has been raised in the thread but I sure as heck
always convert my join arguments using str(). When does someone use
.join() and not want all arguments to be strings ? Any examples ?


This has already been raised, but maybe not in exactly this context. You
don't want to convert your arguments to be strings if some of them are
unicode.

If I do:

res = str.join(', ', [a, b, c])

then res is of type str if, and only if, a, b, and c are of type str.
If any of a, b, or c are of type unicode, then res is unicode.

In effect, this means you don't have to worry about whether you are
manipulating str or unicode at this point in your program (kind of
comparable to not caring whether the integer you are using is int or long).

When you come to output the string you do need to care, as when it is
unicode you may have to encode it, but at least the internal manipulations
can ignore this possibility.

Of course you could just call unicode on everything but for simple
applications you might not want to handle unicode at all. That's not a
decision Python can make for you so it shouldn't guess.

I see your point, but I'm not totally convinced. I don't understand
unicode that well, so I'll just be quiet now.

Your point about int and long vs str and unicode is interesting though.
Does it mean str and unicode will some time in the future be unified
once all the differences are sorted out ?

Regards,

Huy
 
news.sydney.pipenetworks.com

Jeremy said:
Along with the fact that I agree with Nick that you've seriously
miscounted (most of your "fors" are simply irrelevant and I think you
added them to bolster your point, at least I *hope* you don't think they
are all relevant... for instance if you really think "Flat is better than
nested" applies here, you don't understand what that one is saying...),

You're right there. It's my own interpretation :)

Jeremy said:
I'd point out that the Zen that can be comprehended by checking off items
in a list is not the true Zen.

Well I didn't create or bring up the list of items originally. I was
"zenning" it out until someone pointed me to the "Python" commandments.

I always wished computer science was more engineering than philosophy.
That way there'd always be an obvious answer.

Regards,

Huy
 
