Differences creating tuples and collections.namedtuples

Steven D'Aprano

On Mon, 18 Feb 2013 23:48:46 -0800, raymond.hettinger wrote:

[...]
If your starting point is an existing iterable such as s=['Guido',
'BDFL', 1], you have a couple of choices: p=Person(*s) or
p=Person._make(s). The latter form was put in to help avoid unpacking
and repacking the arguments.


It might not be obvious to the casual reader, but despite the leading
underscore, _make is part of the public API for namedtuple:

http://docs.python.org/2/library/collections.html#collections.namedtuple
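
A minimal sketch of the two construction routes described above. The thread never gives the field names of Person, so the three used here (name, title, rank) are assumed purely for illustration.

```python
from collections import namedtuple

# Field names are assumed for illustration; the thread does not state them.
Person = namedtuple('Person', 'name title rank')

s = ['Guido', 'BDFL', 1]

p1 = Person(*s)        # unpack the list into positional arguments
p2 = Person._make(s)   # pass the iterable directly, no unpacking needed

print(p1 == p2)
print(p2.title)
```

Both calls should produce equal instances; _make simply skips the unpack-and-repack step.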
 
Gregory Ewing

Steven said:
py> class MyDict(dict):
...     @classmethod
...     def fromkeys(cls, func):
...         # Expects a callback function that gets called with no arguments
...         # and returns two items, a list of keys and a default value.
...         return super(MyDict, cls).fromkeys(*func())

Here you've overridden a method with one having a
different signature. That's not something you'd
normally do, because, being a method, it's likely
to get invoked polymorphically.

Constructors, on the other hand, are usually *not*
invoked polymorphically. Most of the time we know
exactly which constructor we're calling, because we
write the class name explicitly at the point of call.

Consequently, we have a different attitude when it
comes to constructors. We choose not to require LSP
for constructors, because it turns out to be very
useful not to be bound by that constraint.
Practicality beats purity here.

The reason IPython gets into trouble is that it tries
to make a polymorphic call to something that nobody
expects to need to be polymorphic.
 
John Reid

Terry said:
This is a mistake in the following two senses. First, tuple is a class
with instances while namedtuple is a class factory that produces
classes. (One could think of namedtuple as a metaclass, but it was not
implemented that way.)


Steven said:
I think you have misunderstood. I don't believe that John wants to use the
namedtuple factory instead of tuple. He wants to use a namedtuple type
instead of tuple.

That is, given:

Point3D = namedtuple('Point3D', 'x y z')

he wants to use a Point3D instead of a tuple. Since:

issubclass(Point3D, tuple)

holds true, the Liskov Substitution Principle (LSP) tells us that anything
that is true for a tuple should also be true for a Point3D. That is, given
that instance x might be either a builtin tuple or a Point3D, all of the
following hold:

- isinstance(x, tuple) returns True
- len(x) returns the length of x
- hash(x) returns the hash of x
- x[i] returns item i of x, or raises IndexError
- del x[i] raises TypeError
- x + a_tuple returns a new tuple
- x.count(y) returns the number of items equal to y

etc. Basically, any code expecting a tuple should continue to work if you
pass it a Point3D instead (or any other namedtuple).

There is one conspicuous exception to this: the constructor:

type(x)(args)

behaves differently depending on whether x is a builtin tuple, or a Point3D.
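
A small sketch of that exception, assuming a hypothetical helper named rebuild that recreates a value through its own type, the way generic code sometimes does:

```python
from collections import namedtuple

Point3D = namedtuple('Point3D', 'x y z')

def rebuild(x):
    """Hypothetical helper: recreate x by calling its own type on an iterable."""
    return type(x)(list(x))

plain = (1, 2, 3)
point = Point3D(1, 2, 3)

print(rebuild(plain))   # tuple accepts a single iterable

try:
    rebuild(point)      # Point3D expects three positional arguments instead
except TypeError as exc:
    print('TypeError:', exc)

# Point3D._make accepts an iterable, as the tuple constructor does.
print(Point3D._make(list(point)))
```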


Exactly and thank you Steven for explaining it much more clearly.
 
John Reid

One quick workaround would be to use a tuple where required and then
coerce it back to Result when needed as such:

# Assumes parallel (IPython's parallel module) and parallel_helper are
# already imported at this level; the original post does not show those imports.
def sleep(secs):
    import os, time, parallel_helper
    start = time.time()
    time.sleep(secs)
    return tuple(parallel_helper.Result(os.getpid(), time.time() - start))

rc = parallel.Client()
v = rc.load_balanced_view()
async_result = v.map_async(sleep, range(3, 0, -1), ordered=False)
for ar in async_result:
    print(parallel_helper.Result(*ar))

You can of course skip the creation of Result in sleep and only turn
it into one in the display loop, but it all depends on additional
requirements (and adds some clarity to what is happening, I think).

Thanks. All I really need is a quick workaround, but it is always nice to
discuss these things. Also, this class decorator seems to do the job for
ipython, although it does change the construction syntax a little and is
probably overkill. No doubt the readers of this list can improve it
somewhat as well.


import logging
_logger = logging.getLogger(__name__)
from collections import namedtuple

def make_ipython_friendly(namedtuple_class):
    """A class decorator to make namedtuples more ipython friendly.
    """
    _logger.debug('Making %s ipython friendly.', namedtuple_class.__name__)

    # Preserve original new to use if needed with keyword arguments
    original_new = namedtuple_class.__new__

    def __new__(cls, *args, **kwds):
        _logger.debug('In decorator __new__, cls=%s', cls)
        if args:
            if kwds:
                raise TypeError(
                    'Cannot construct %s from positional and keyword arguments.'
                    % namedtuple_class.__name__)
            _logger.debug('Assuming construction from an iterable.')
            # args is expected to hold exactly one iterable of field values
            return namedtuple_class._make(*args)
        else:
            _logger.debug('Assuming construction from keyword arguments.')
            return original_new(namedtuple_class, **kwds)

    # set the class' __new__ to the new one
    namedtuple_class.__new__ = staticmethod(__new__)
    del namedtuple_class.__getnewargs__  # get rid of getnewargs

    return namedtuple_class

Result = make_ipython_friendly(namedtuple('Result', 'pid duration'))
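
For comparison, a smaller sketch of the same idea that subclasses the namedtuple rather than patching it. This is not John's decorator: it assumes construction from a single iterable is all that is needed (keyword construction is dropped), and it overrides __getnewargs__ instead of deleting it so that copying still goes through the one-argument constructor.

```python
import copy
from collections import namedtuple

class Result(namedtuple('Result', 'pid duration')):
    """Result built from one iterable, matching the tuple constructor."""
    __slots__ = ()

    def __new__(cls, iterable):
        # _make builds the instance via tuple.__new__, so this does not recurse.
        return cls._make(iterable)

    def __getnewargs__(self):
        # copy and pickle call cls.__new__(cls, *args); hand back one iterable.
        return (tuple(self),)

r = Result((1234, 0.5))
print(r.pid, r.duration)
print(type(r)(r) == r)      # generic reconstruction through type(x)(x)
print(copy.copy(r) == r)
```

The trade-off is the same one John mentions: Result(1234, 0.5) no longer works, so existing call sites would need to change.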
 
Terry Reedy

I think you have misunderstood.

Wrong, which should be evident to anyone who reads the entire paragraph
as the complete thought exposition it was meant to be. Besides which,
this negative ad hominem comment is irrelevant to the rest of your post
about the Liskov Substitution Principle.

The rest of the paragraph, in two more pieces:

In other words, neither the namedtuple object nor any namedtuple class
object can fully substitute for the tuple class object. Nor can
instances of any namedtuple class fully substitute for instances of the
tuple class. Therefore, I claim, the hope that "namedtuples could be
used as replacements for tuples in all instances" is a futile hope,
however one interprets that hope.

Part of the effect is independent of initialization. Even if namedtuples
were initialized by iterator, there would still be glitches. In
particular, even if John's named tuple class B *could* be initialized as
B((1,2,3)), it still could not be substituted for t in the code below.
>>> t = (1,2,3)
>>> type(t) is type(t[1:])
True
>>> type(t)(t[1:])
(2, 3)

As far as read access goes, B effectively is a tuple. As soon as one
uses type() directly or indirectly (by creating new objects), there may
be problems. That is because the addition of field names *also* adds a
length constraint, which is a subtraction of flexibility.
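
A short sketch of the glitch described above. B stands in for John's named tuple class; its field names are assumed for illustration.

```python
from collections import namedtuple

# B stands in for John's named tuple class; the field names are assumed.
B = namedtuple('B', 'a b c')

t = (1, 2, 3)
b = B(1, 2, 3)

print(type(t) is type(t[1:]))   # a slice of a tuple is a tuple
print(type(b) is type(b[1:]))   # a slice of a B is a plain tuple, not a B
print(b[1:])

# Even through an iterable-accepting constructor (here _make), the shorter
# slice cannot become a B, because B is fixed at three fields.
try:
    B._make(b[1:])
except TypeError as exc:
    print('TypeError:', exc)
```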

---
Liskov Substitution Principle (LSP): I met this over 15 years ago
reading debates among OOP enthusiasts about whether Rectangle should be
a subclass of Square or Square a subclass of Rectangle, and similarly,
whether Ostrich can be a legitimate subclass of Bird.

The problem I see with the LSP for modeling either abstract or concrete
entities is that we in fact do define subclasses by subtraction or
limitation, as well as by augmentation, while the LSP only allows the
latter.

One answer to the conundrums above is to add Parallelepiped as a
superclass for both Square and Rectangle and Flying_bird as an
additional subclass of Bird. But then the question becomes: Does obeying
the LSP count as 'necessity' when one is trying to follow Ockham's
principle of not multiplying classes without necessity?
 
Chris Angelico

Liskov Substitution Principle (LSP): I met this over 15 years ago reading
debates among OOP enthusiasts about whether Rectangle should be a subclass
of Square or Square a subclass of Rectangle, and similarly, whether Ostrich
can be a legitimate subclass of Bird.

The problem I see with the LSP for modeling either abstract or concrete
entities is that we in fact do define subclasses by subtraction or
limitation, as well as by augmentation, while the LSP only allows the
latter.

A plausible compromise is to demand LSP in terms of programming, but
not necessarily functionality. So an Ostrich would have a fly() method
that returns some kind of failure, in the same way that any instance
of any flying-bird could have injury or exhaustion that prevents it
from flying. It still makes sense to attempt to fly - an ostrich IS a
bird - but it just won't succeed.

ChrisA
 
Roy Smith

Chris Angelico said:
A plausible compromise is to demand LSP in terms of programming, but
not necessarily functionality. So an Ostrich would have a fly() method
that returns some kind of failure, in the same way that any instance
of any flying-bird could have injury or exhaustion that prevents it
from flying. It still makes sense to attempt to fly - an ostrich IS a
bird - but it just won't succeed.

ChrisA

I would think Ostrich.fly() should raise NotImplementedError. Whether
this violates LSP or not depends on how you define Bird.

class Bird:
    def fly(self):
        """Commence aviation.

        Note: not all birds can fly. Calling fly() on
        a flightless bird will raise NotImplementedError.

        """

class Ostrich(Bird):
    def fly(self):
        raise NotImplementedError("ostriches can't fly")

class Sheep(Bird):
    def fly(self):
        self.plummet()
 
Steven D'Aprano

Wrong, which should be evident to anyone who reads the entire paragraph
as the complete thought exposition it was meant to be. Besides which,
this negative ad hominem comment is irrelevant to the rest of your post
about the Liskov Substitution Principle.

Terry, I'm sorry that I have stood on your toes here; no offense was
intended. It seemed to me, based on the entire paragraph that you wrote,
that you may have misunderstood the OP's question. The difference in
signatures between the namedtuple class factory and tuple is irrelevant,
as I can now see you understand, but by raising it in the first place
you gave me the impression that you may have misunderstood what the OP
was attempting to do.

The rest of the paragraph, in two more pieces:


In other words, neither the namedtuple object nor any namedtuple class
object can fully substitute for the tuple class object. Nor can
instances of any namedtuple class fully substitute for instances of the
tuple class. Therefore, I claim, the hope that "namedtuples could be
used as replacements for tuples in all instances" is a futile hope,
however one interprets that hope.

I did discuss the fixed length issue directly, and agreed with you that
if your contract is to construct variable-length tuples, then a
fixed-length namedtuple is not substitutable.

But in practice, one common use-case for tuples (whether named or not)
is for fixed-length records, and in that use-case, a namedtuple of
length N should be substitutable for a tuple of length N.
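
As a sketch of that fixed-length use-case, assume two hypothetical consumers (norm_squared and first_coordinate) that rely only on unpacking, indexing and length. Under that assumption a Point3D and a plain 3-tuple are interchangeable:

```python
from collections import namedtuple

Point3D = namedtuple('Point3D', 'x y z')

def norm_squared(point):
    """Hypothetical consumer that only relies on a 3-item record."""
    x, y, z = point            # unpacking
    return x * x + y * y + z * z

def first_coordinate(point):
    """Hypothetical consumer that only relies on indexing."""
    return point[0]

for record in [(1, 2, 3), Point3D(1, 2, 3)]:
    print(norm_squared(record), first_coordinate(record), len(record))
```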

Part of the effect is independent of initialization. Even if namedtuples
were initialized by iterator, there would still be glitches. In
particular, even if John's named tuple class B *could* be initialized as
B((1,2,3)), it still could not be substituted for t in the code below.
>>> t = (1,2,3)
>>> type(t) is type(t[1:])
True

Agreed. There are other differences as well, e.g. repr(t) will differ
between builtin tuples and namedtuples. The only type which is identical
in every conceivable aspect to tuple is tuple itself. Any subclass or
subtype[1] must by definition differ in at least one aspect from tuple:

type(some_tuple) is type(())

and in practice will differ in other aspects as well.


Footnote: [1] Subclass meaning it inherits from tuple; subtype in the
sense that it duck-types as a tuple, but may or may not share any
implementation.


LSP cannot be interpreted in isolation. Any non-trivial modification of
a class will change *something* about the class; after all, that's why we
subclassed it in the first place. Either the interface will be
different, or the semantics will be different, or both. LSP must always
be interpreted in the intersection between the promises made by the
class and the promises your application cares about.

Some promises are more important than others, hence some violations are
more serious than others. For instance, I think that tuple indexing is a
critical promise: a "tuple" that cannot be indexed is surely not a tuple.
The exact form of the repr() of a tuple is generally not important at
all: a tuple that prints as MyBunchOStuff(...) is still a tuple. In my
experience, the constructor signature is of moderate importance. But of
course that depends on what promises you rely on: if you are relying on
the tuple constructor, then it is critical *to you*.

The problem I see with the LSP for modeling either abstract or concrete
entities is that we in fact do define subclasses by subtraction or
limitation, as well as by augmentation, while the LSP only allows the
latter.

People do all sorts of things. They write code that is O(N**2) or worse,
they call eval() on untrusted data, they use isinstance() and break
duck-typing, etc. That they break LSP does not necessarily mean that
they should. LSP is one of the five fundamental best-practices for
object-oriented code, "SOLID":

http://en.wikipedia.org/wiki/SOLID_(object-oriented_design)

Breaking any of the SOLID principles is a code-smell. That does not mean
that there is never a good reason to do so, but SOLID is a set of
principles that have stood the test of time and practice. Any code that
breaks one of those principles should be considered smelly, or
worse, until justified.

(And for the avoidance of doubt, I am more than satisfied with the
justification given for the difference in signature between tuples and
namedtuples.)
 
