duck-type-checking?

Joe Strout

Let me preface this by saying that I think I "get" the concept of duck-
typing.

However, I still want to sprinkle my code with assertions that, for
example, my parameters are what they're supposed to be -- too often I
mistakenly pass in something I didn't intend, and when that happens, I
want the code to fail as early as possible, so I have the shortest
possible path to track down the real bug. Also, a sufficiently clever
IDE could use my assertions to know the type of my identifiers, and so
support me better with autocompletion and method tips.

So I need functions to assert that a given identifier quacks like a
string, or a number, or a sequence, or a mutable sequence, or a
certain class, or so on. (On the class check: I know about
isinstance, but that's contrary to duck-typing -- what I would want
that check to do instead is verify that whatever object I have, it has
the same public (non-underscore) methods as the class I'm claiming.)

Are there any standard methods or idioms for doing that?

Thanks,
- Joe
 
J Kenneth King

Joe Strout said:
Let me preface this by saying that I think I "get" the concept of
duck-typing.

However, I still want to sprinkle my code with assertions that, for
example, my parameters are what they're supposed to be -- too often I
mistakenly pass in something I didn't intend, and when that happens, I
want the code to fail as early as possible, so I have the shortest
possible path to track down the real bug. Also, a sufficiently clever
IDE could use my assertions to know the type of my identifiers, and so
support me better with autocompletion and method tips.

So I need functions to assert that a given identifier quacks like a
string, or a number, or a sequence, or a mutable sequence, or a
certain class, or so on. (On the class check: I know about
isinstance, but that's contrary to duck-typing -- what I would want
that check to do instead is verify that whatever object I have, it has
the same public (non-underscore) methods as the class I'm claiming.)

Are there any standard methods or idioms for doing that?

Thanks,
- Joe

I generally use the 'assert' statement when I'm in a rush; otherwise,
unit tests generally catch this kind of thing.
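
A minimal sketch of the assert approach, for illustration only. The function greet and its parameter are hypothetical names, not something from the original posts. Note that assert statements are skipped entirely when Python runs with the -O option, so they suit debugging checks rather than input validation.

```python
def greet(name):
    # Fail right at the call boundary if the argument is not a string.
    # (Hypothetical example; skipped entirely under "python -O".)
    assert isinstance(name, str), "name must be a string"
    return "Hello, " + name
```

Calling greet("Joe") returns the greeting, while greet(5) raises AssertionError at the first line of the function instead of failing later inside the concatenation.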
 
Matimus

Let me preface this by saying that I think I "get" the concept of duck-
typing.

However, I still want to sprinkle my code with assertions that, for  
example, my parameters are what they're supposed to be -- too often I  
mistakenly pass in something I didn't intend, and when that happens, I  
want the code to fail as early as possible, so I have the shortest  
possible path to track down the real bug.  Also, a sufficiently clever  
IDE could use my assertions to know the type of my identifiers, and so  
support me better with autocompletion and method tips.

So I need functions to assert that a given identifier quacks like a  
string, or a number, or a sequence, or a mutable sequence, or a  
certain class, or so on.  (On the class check: I know about  
isinstance, but that's contrary to duck-typing -- what I would want  
that check to do instead is verify that whatever object I have, it has  
the same public (non-underscore) methods as the class I'm claiming.)

Are there any standard methods or idioms for doing that?

Thanks,
- Joe

The only thing that comes to mind is to use explicit checking
(isinstance). The new 'abc' module in 2.6 is worth a look. It seems
like this sort of thing might become a _little_ more popular.

I think duck-typing is great, but there are instances where you really
want the code to be better at documenting your intention about what
interface you expect an object to have, and also to catch the problems
that it might cause early and in a place where you might actually be
able to catch a meaningful exception. I've been looking and haven't
found any clear rules here. It is a trade off. The more complex the
interface, the more you will likely want to do an explicit check. On
the other hand, the more complex the interface the more likely it is
that you are violating the 'Interface Segregation Principle'. That is,
you probably want to consider breaking the functionality down into
smaller interfaces, or even separate classes.

IMO explicit checking, like global variables or using a 'goto' in C,
is evil. That isn't to say that you should _never_ use it. Heck,
there are plenty of gotos in CPython. You want to minimize, and
clearly document, the places where your code is evil. There are times
where it is necessary.

Matt
 
Steven D'Aprano

Let me preface this by saying that I think I "get" the concept of duck-
typing.

However, I still want to sprinkle my code with assertions that, for
example, my parameters are what they're supposed to be -- too often I
mistakenly pass in something I didn't intend, and when that happens, I
want the code to fail as early as possible, so I have the shortest
possible path to track down the real bug.


I'm surprised nobody has pointed you at Alex Martelli's recipe here:

http://code.activestate.com/recipes/52291/

While the recipe is great, it can be tiresome to apply all the time. I
would factor out the checks into a function, something like this:

def isstringlike(obj, methods=None):
    """Return True if obj is sufficiently string-like."""
    if isinstance(obj, basestring):
        return True
    if methods is None:
        methods = ['upper', 'lower', '__len__', '__getitem__']
    for method in methods:
        if not hasattr(obj, method):
            return False
    # To really be string-like, the following test should pass.
    if len(obj) > 0:
        s = obj[0]
        if s[0] != s:
            return False
    return True
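
The function as posted targets Python 2 (it relies on basestring). What follows is a hedged Python 3 adaptation, not the author's code: str stands in for basestring, and a try/except is added on the assumption that an object whose first element cannot be indexed should count as "not string-like" rather than raise.

```python
def isstringlike(obj, methods=None):
    """Return True if obj is sufficiently string-like (Python 3 sketch)."""
    if isinstance(obj, str):  # str replaces Python 2's basestring
        return True
    if methods is None:
        methods = ['upper', 'lower', '__len__', '__getitem__']
    if not all(hasattr(obj, name) for name in methods):
        return False
    # Indexing a non-empty string gives a one-character string, and
    # indexing that again gives the same thing back.
    if len(obj) > 0:
        s = obj[0]
        try:
            return bool(s[0] == s)
        except (TypeError, IndexError, KeyError):
            # Added assumption: an un-indexable element means "not a string".
            return False
    return True
```

With this version a collections.UserString passes, while a bytes object fails the final test because indexing bytes yields an int.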
 
Joe Strout

I'm surprised nobody has pointed you at Alex Martelli's recipe here:

http://code.activestate.com/recipes/52291/

Thanks for that -- it's clever how he combines binding the methods
he'll use with doing the checking.
While the recipe is great, it can be tiresome to apply all the time. I
would factor out the checks into a function, something like this:

def isstringlike(obj, methods=None):
    """Return True if obj is sufficiently string-like."""
    if isinstance(obj, basestring):
        return True
    if methods is None:
        methods = ['upper', 'lower', '__len__', '__getitem__']
    for method in methods:
        if not hasattr(obj, method):
            return False
    # To really be string-like, the following test should pass.
    if len(obj) > 0:
        s = obj[0]
        if s[0] != s:
            return False
    return True

Thanks for this, too; that's the sort of method I had in mind. That
last test for string-likeness is particularly clever. I'll need to
think more deeply about the implications.

Best,
- Joe
 
George Sakkis

While the recipe is great, it can be tiresome to apply all the time. I
would factor out the checks into a function, something like this:
def isstringlike(obj, methods=None):
   """Return True if obj is sufficiently string-like."""
   if isinstance(obj, basestring):
       return True
   if methods is None:
       methods = ['upper', 'lower', '__len__', '__getitem__']
   for method in methods:
       if not hasattr(obj, method):
           return False
   # To really be string-like, the following test should pass.
   if len(obj) > 0:
       s = obj[0]
       if s[0] != s:
           return False
   return True

Thanks for this, too; that's the sort of method I had in mind.  That  
last test for string-likeness is particularly clever.  I'll need to  
think more deeply about the implications.

To me this seems to combine the worst of both worlds: the
explicitness of LBYL with the heuristic nature of duck typing. Might
as well call it "doyoufeellucky typing". If you are going to Look
Before You Leap, try to stick to isinstance/issubclass checks
(especially in 2.6+, where they can be overridden) instead of crafting
ad hoc rules of what makes an object X-like.

George
 
Steven D'Aprano

While the recipe is great, it can be tiresome to apply all the time.
I would factor out the checks into a function, something like this:
def isstringlike(obj, methods=None):
    """Return True if obj is sufficiently string-like."""
    if isinstance(obj, basestring):
        return True
    if methods is None:
        methods = ['upper', 'lower', '__len__', '__getitem__']
    for method in methods:
        if not hasattr(obj, method):
            return False
    # To really be string-like, the following test should pass.
    if len(obj) > 0:
        s = obj[0]
        if s[0] != s:
            return False
    return True

Thanks for this, too; that's the sort of method I had in mind.  That
last test for string-likeness is particularly clever.  I'll need to
think more deeply about the implications.

To me this seems to combine the worst of both worlds: the explicitness
of LBYL with the heuristic nature of duck typing. Might as well call it
"doyoufeellucky typing". If you are going to Look Before You Leap, try
to stick to isinstance/issubclass checks (especially in 2.6+, where they
can be overridden) instead of crafting ad hoc rules of what makes an
object X-like.

That's crazy talk. Duck-typing is, by its very nature, ad hoc. You want
something that is just duck-like enough for your application, without
caring if it is an honest-to-goodness duck. "Duck-like" depends on the
specific application, in fact the specific *function*. You can't get any
more ad hoc than that.

What I posted, taken from Alex Martelli, is duck-typing. It's just that
the duck-type checks are performed before any other work is done. The
isinstance check at the start of the function was merely an optimization.
I didn't think I needed to say so explicitly; it should have been obvious.

Take this example:

def foo(alist):
    alist.sort()
    alist.append(5)


The argument can be any object with sort and append methods (assumed to
act in place). But what happens if you pass it an object with a sort
method but no append? The exception doesn't occur until *after* the
object is sorted, which leaves it in an inconsistent state. This can be
undesirable: you might need the function foo to be atomic, so that either
the entire function succeeds or none of it does.

Duck-typing is great, but sometimes "if it walks like a duck and quacks
like a duck it might as well be a duck" is not enough. Once you've built
an expensive gold-plated duck pond, you *don't* want your "duck" to sink
straight to the bottom of the pond and drown the first time you put it on
the water. You want to find out that it can swim like a duck *before*
building the pond.
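
A minimal sketch of checking before mutating, in the spirit of the method-binding idea from the recipe mentioned earlier in the thread. The SortOnly class is a hypothetical stand-in for an object that has sort but no append. This only guards against missing attributes; it does not make sort itself atomic.

```python
def foo(alist):
    # Look up every method we need before changing anything. A missing
    # attribute raises AttributeError here, while alist is still untouched.
    sort, append = alist.sort, alist.append
    sort()
    append(5)


class SortOnly:
    """Hypothetical object with sort() but no append()."""

    def __init__(self, items):
        self.items = list(items)

    def sort(self):
        self.items.sort()


broken = SortOnly([3, 1, 2])
try:
    foo(broken)
except AttributeError:
    pass
# broken.items is still in its original order, because sort() never ran.
```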
 
George Sakkis

Take this example:

def foo(alist):
    alist.sort()
    alist.append(5)

The argument can be any object with sort and append methods (assumed to
act in place). But what happens if you pass it an object with a sort
method but no append? The exception doesn't occur until *after* the
object is sorted, which leaves it in an inconsistent state. This can be
undesirable: you might need the function foo to be atomic, either the
entire function succeeds, or none of it.

In this example atomicity is not guaranteed even if alist is a builtin
list (if it contains a complex number or other unorderable object),
let alone if not isinstance(alist, list). It gets worse: false
positives are less likely for fully spelled-out methods with well-known
names such as "sort" and "append" (i.e. if hasattr(alist, 'append'),
alist.append *probably* does what you think it does), but good luck
with that when testing for __getitem__, __iter__ for more than one
pass, __call__, and other special methods with ambiguous or undefined
semantics.

Let's face it, duck typing is great for small to medium complexity
projects, but it doesn't scale without additional support in the form
of ABCs/interfaces, explicit type checking (at compile and/or run
time), design by contract, etc. It cannot be a coincidence that both
Zope and Twisted had to come up with interfaces to manage their
massive (for Python at least) complexity.

George
 
pruebauno

In this example atomicity is not guaranteed even if alist is a builtin
list (if it contains a complex number or other unorderable object),
let alone if not isinstance(alist, list). It gets worse: false
positives are less likely for full-spelled methods with well-known
names such as "sort" and "append" (i.e. if hasattr(alist, 'append'),
alist.append *probably* does what you think it does), but good luck
with that when testing for __getitem__, __iter__ for more than one
pass, __call__, and other special methods with ambiguous or undefined
semantics.

Let's face it, duck typing is great for small to medium complexity
projects but it doesn't scale without additional support in the form
of ABCs/interfaces, explicit type checking (at compile and/or run
time), design by contract, etc. It must not be a coincidence that both
Zope and Twisted had to come up with interfaces to manage their
massive (for Python at least) complexity.

George

What would actually be interesting is a switch to the Python
interpreter that internally annotated function parameters with how
they are used in the function and raised an exception as soon as the
function is called, instead of later: failing earlier rather than
later. Example:


def sub(x,y):
    # ...run some stuff
    print x[2]
    return y.strip().replace('a','b')

internally python generates:

def sub(x: must have getitem, y: must have strip and replace)



sub([1,2,3,4],5)
Error calling sub(x,y): y has to have strip() method.
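
No such interpreter switch exists as far as this thread establishes. As a rough illustration of the same fail-at-call-time idea, here is a hand-written decorator sketch. The name requires and its keyword format are hypothetical, and the requirements are declared manually rather than inferred from the function body.

```python
import functools
import inspect


def requires(**needed):
    """Hypothetical decorator: map parameter names to required attribute names."""
    def decorator(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            for param, attrs in needed.items():
                if param not in bound.arguments:
                    continue
                value = bound.arguments[param]
                for attr in attrs:
                    if not hasattr(value, attr):
                        raise TypeError(
                            "Error calling %s: %s has to have %s"
                            % (func.__name__, param, attr))
            # All declared attributes are present; run the real function.
            return func(*args, **kwargs)
        return wrapper
    return decorator


@requires(x=['__getitem__'], y=['strip'])
def sub(x, y):
    return x[2], y.strip().replace('a', 'b')
```

Here sub([1, 2, 3, 4], 5) raises TypeError before any of the body runs. The check is shallow: it looks only at attributes of the arguments themselves and says nothing about what y.strip() returns.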
 
Joe Strout

What would be actually interesting would be a switch to the python
interpreter that internally annotated function parameters with how
they are used in the function and raised an exception as soon as the
function is called instead of later. Failing earlier rather than
later.

That would be interesting, but it wouldn't have helped in the case I
had last week, where the method being called does little more than
stuff the argument into a container inside the class -- only to blow
up much later, when that data was accessed in a certain way.

The basic problem was that the data being stored was violating the
assumptions of the class itself. Sometimes in the past I've used a
"check invariants" method on a class with complex data, and call this
after mutating operations to ensure that all the class invariants are
still true. But this class wasn't really that complex; it's just that
it assumed all the stuff it's being fed were strings (or could be
treated as strings), and I inadvertently fed it an NLTK.Tree node
instead (not realizing that a library method I was calling could
return such a thing sometimes).

So, in this case, the simplest solution was to have the method that
initially accepts and stores the data check to make sure that data
satisfies the assumptions of the class.
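
A minimal sketch of that kind of check at the point of storage. WordBag and its methods are hypothetical names chosen for illustration, and a plain isinstance test stands in for whatever "can be treated as a string" means for the real class.

```python
class WordBag:
    """Hypothetical container that assumes every stored item is a string."""

    def __init__(self):
        self._words = []

    def add(self, word):
        # Fail here, where the bad value enters, not later when it is used.
        if not isinstance(word, str):
            raise TypeError("expected a string, got %s" % type(word).__name__)
        self._words.append(word)

    def shout(self):
        # Relies on the invariant enforced in add().
        return [w.upper() for w in self._words]
```

Passing a non-string to add raises TypeError immediately, so the traceback points at the caller that supplied the bad value rather than at shout.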

Best,
- Joe
 
Paul McGuire

In this example atomicity is not guaranteed even if alist is a builtin
list (if it contains a complex number or other unorderable object),
let alone if not isinstance(alist, list). It gets worse: false
positives are less likely for full-spelled methods with well-known
names such as "sort" and "append"  (i.e. if hasattr(alist, 'append'),
alist.append *probably* does what you think it does), but good luck
with that when testing for __getitem__, __iter__ for more than one
pass, __call__, and other special methods with ambiguous or undefined
semantics.
Let's face it, duck typing is great for small to medium complexity
projects but it doesn't scale without additional support in the form
of ABCs/interfaces, explicit type checking (at compile and/or run
time), design by contract, etc. It must not be a coincidence that both
Zope and Twisted had to come up with interfaces to manage their
massive (for Python at least) complexity.

What would be actually interesting would be a switch to the python
interpreter that internally annotated function parameters with how
they are used in the function and raised an exception as soon as the
function is called instead of later. Failing earlier rather than
later. Example:

def sub(x,y):
    # ...run some stuff
    print x[2]
    return y.strip().replace('a','b')

internally python generates:

def sub(x: must have getitem, y: must have strip and replace)

sub([1,2,3,4],5)
Error calling sub(x,y): y has to have strip() method.

No, this would mean:
def sub(x: must have getitem, y: must have strip, and y.strip must
return something that has replace)

Or to be even more thorough:
def sub(x: must have getitem, y: must have strip and strip must be
callable, and y.strip must return something that has replace and
replace must be callable)

So even this simple example gets nasty in a hurry, let alone the OP's
case where he stuffs y into a list in order to access it much later,
in a completely different chunk of code, only to find out that y
doesn't support the complete string interface as he expected.

-- Paul
 
Joe Strout

Or to be even more thorough:
def sub(x: must have getitem, y: must have strip and strip must be
callable, and y.strip must return something that has replace and
replace must be callable)

So even this simple example gets nasty in a hurry, let alone the OP's
case where he stuffs y into a list in order to access it much later,
in a completely different chunk of code, only to find out that y
doesn't support the complete string interface as he expected.

Very true. That's why I think it's not worth trying to be too pure
about it. Most of the time, if a method wants a Duck, you're going to
just give it a Duck.

However, I would like to also handle the occasional case where I can't
give it a Duck, but I can give it something that is a drop-in
substitute for a Duck (really, truly, I promise, and if it blows up
I'll take responsibility for it).

A real-world example from another language (sorry for that, I've been
away from Python for ten years): in REALbasic, there is a Database
base class, and a subclass for each particular database backend
(Postgres, MySQL, whatever). This works fine most of the time, in
that you can write general code that takes a Database object and Does
Stuff with it.

However, all of those database backends are shipped by the vendor, or
by plugin authors -- you can't create a useful Database subclass
yourself, in RB code, because it has a private constructor. So you
end up making your own database class, but that can't be used with all
the code that expects a real Database object.

Of course, the framework design there is seriously flawed (Database
should have been an interface, or at the very least, had a protected
rather than private constructor). And in Python, there's no way to
prevent subclassing AFAIK, so this particular issue wouldn't come up.
But I still suspect that there may be times when I don't want to
subclass for some reason (maybe I'm using the Decorator or Adapter or
Bridge pattern). Yet I'm willing to guarantee that I've adhered to
the interface of another class, and will behave like it in any way
that matters.

So, the level of assertion that I want to make in a method that
expects a Duck is just that its parameter is either a Duck, or
something that the caller is claiming is just as good as a Duck. I'm
not trying to prevent any possible error; I'm trying to catch the
stupid errors where I inadvertently pass in something completely
different, not duck-like at all (probably because some other method
gave me a result I didn't realize it could produce).

So things like this should suffice:

# simple element
assert(is_stringlike(foo))
assert(is_numeric(foo))
assert(is_like(foo, Duck))

# sequence of elements
assert(seqof_stringlike(foo))
assert(seqof_numeric(foo))
assert(seqof_like(foo, Duck))
# (also "listof_" variants for asserting mutable sequence of whatever)

# dictionary of elements
assert(dictof_like(foo, str, int))

Hmm, I was already forced to change my approach by the time I got to
checking dictionaries. Perhaps a better formalism would be a "like"
method that takes an argument, and something that indicates the
desired type. This could be a tree if you want to check deeper into a
container. Maybe something like:

assert(fits(foo, dictlike(strlike, seqlike(intlike))))

which asserts that foo is something dictionary-like that maps string-
like things to something like a sequence of integer-like things. Most
cases would not be this complex, of course, but would be closer to

assert(fits(foo, strlike))

But this is still pretty ugly. Hmm. Maybe I'd better wait for
ABCs. :)
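
One way the fits idea could be sketched, using composable predicate functions. The names fits, strlike, intlike, seqlike and dictlike come from the post above, but everything about their implementation here is an assumption; in particular, "like" is approximated with isinstance checks against the standard ABCs rather than by probing for methods.

```python
from collections.abc import Mapping, Sequence
from numbers import Integral


def strlike(obj):
    return isinstance(obj, str)


def intlike(obj):
    # bool is an Integral subclass; excluding it is an arbitrary choice here.
    return isinstance(obj, Integral) and not isinstance(obj, bool)


def seqlike(item_check):
    """Build a check for a non-string sequence whose items all pass item_check."""
    def check(obj):
        return (isinstance(obj, Sequence) and not isinstance(obj, str)
                and all(item_check(item) for item in obj))
    return check


def dictlike(key_check, value_check):
    """Build a check for a mapping whose keys and values pass the given checks."""
    def check(obj):
        return (isinstance(obj, Mapping)
                and all(key_check(k) and value_check(v)
                        for k, v in obj.items()))
    return check


def fits(obj, check):
    return bool(check(obj))
```

With these definitions, assert fits(foo, dictlike(strlike, seqlike(intlike))) passes for {"a": [1, 2]} and fails for {"a": ["x"]}.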

Cheers,
- Joe
 
George Sakkis

So things like this should suffice:

        # simple element
        assert(is_stringlike(foo))
        assert(is_numeric(foo))
        assert(is_like(foo, Duck))

        # sequence of elements
        assert(seqof_stringlike(foo))
        assert(seqof_numeric(foo))
        assert(seqof_like(foo, Duck))
        # (also "listof_" variants for asserting mutable sequence of whatever)

        # dictionary of elements
        assert(dictof_like(foo, str, int))

Hmm, I was already forced to change my approach by the time I got to  
checking dictionaries.  Perhaps a better formalism would be a "like"  
method that takes an argument, and something that indicates the  
desired type.  This could be a tree if you want to check deeper into a  
container.  Maybe something like:

        assert(fits(foo, dictlike(strlike, seqlike(intlike))))

which asserts that foo is something dictionary-like that maps string-
like things to something like a sequence of integer-like things.  Most  
cases would not be this complex, of course, but would be closer to

        assert(fits(foo, strlike))

But this is still pretty ugly.  Hmm.  Maybe I'd better wait for  
ABCs.  :)

You might also be interested in the typecheck module whose syntax
looks nicer, at least for the common cases: http://oakwinter.com/code/typecheck/dev/

George
 
Steve Holden

Joe said:
Or to be even more thorough:
def sub(x: must have getitem, y: must have strip and strip must be
callable, and y.strip must return something that has replace and
replace must be callable)

So even this simple example gets nasty in a hurry, let alone the OP's
case where he stuffs y into a list in order to access it much later,
in a completely different chunk of code, only to find out that y
doesn't support the complete string interface as he expected.

Very true. That's why I think it's not worth trying to be too pure
about it. Most of the time, if a method wants a Duck, you're going to
just give it a Duck.

However, I would like to also handle the occasional case where I can't
give it a Duck, but I can give it something that is a drop-in substitute
for a Duck (really, truly, I promise, and if it blows up I'll take
responsibility for it).
[...]

I suspect this reduces our difference to a disagreement about the
meaning of the word "occasional" ...

regards
Steve
 
Steven D'Aprano

But this class wasn't really that complex; it's just that it assumed all
the stuff it's being fed were strings (or could be treated as strings),
and I inadvertently fed it an NLTK.Tree node instead (not realizing that
a library method I was calling could return such a thing sometimes).

Guido has published a couple of metaclasses to get Eiffel-style pre- and
post-condition tests that may be useful for you:

http://www.python.org/doc/essays/metaclasses/

If you're interested in reading more about metaclasses, this is more
current:

http://www.python.org/download/releases/2.2.3/descrintro/


By the way, even Guido himself isn't immune to the tendency among Python
users to flame at anyone suggesting change to Python's model:

http://www.artima.com/weblogs/viewpost.jsp?thread=87182

(And that's a good thing. It would be really bad if the Python community
were slavishly and mindlessly Guido-fanboys as we're sometimes accused of
being.)
 
greg

Joe said:
So, in this case, the simplest solution was to have the method that
initially accepts and stores the data check to make sure that data
satisfies the assumptions of the class.

In hindsight, yes, but the trouble is that you can't
tell ahead of time which of the gazillion places in the
code where you store things away in containers are
likely to cause a problem later.

I can't imagine myself writing code to check every
argument to every method to guard against this sort of
thing. If you're willing to do that, it's up to you,
but it's far from common practice in Python programming.
 
greg

Joe said:
So, the level of assertion that I want to make in a method that expects
a Duck is just that its parameter is either a Duck, or something that
the caller is claiming is just as good as a Duck.

I'm not sure, but I think the new ABC stuff in Py3 is
going to provide something like this, in that there will
be a way to declare that a class conforms to the Duck
interface even if it doesn't inherit from it. Then
you can just test isinstance(x, Duck).
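
A short sketch of what that looks like with the abc module, using the Python 3 spelling. Duck and Robot are hypothetical classes. Note that register is purely a declaration: it does not verify that the registered class actually implements the methods.

```python
from abc import ABC, abstractmethod


class Duck(ABC):
    @abstractmethod
    def quack(self):
        ...


class Robot:
    """Does not inherit from Duck, but provides the same method."""

    def quack(self):
        return "beep"


# Declare that Robot conforms to the Duck interface without subclassing it.
Duck.register(Robot)
```

After the register call, isinstance(Robot(), Duck) is true, which matches the "caller is claiming it is just as good as a Duck" level of assertion described in the quoted post.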
 
Arnaud Delobelle

Duck typing...

For a while I thought the word _duck_ was used in the sense of _dodge_.
 
Steven D'Aprano

In hindsight, yes, but the trouble is that you can't tell ahead of time
which of the gazillion places in the code where you store things
away in containers are likely to cause a problem later.

I can't imagine myself writing code to check every argument to every
method to guard against this sort of thing.

Which is, of course, the weakness of dynamically typed languages like
Python. With statically typed languages like Pascal and C, you can get the
compiler to check that for you (often at compile time), but at the cost
of a lot more effort up front. And with languages like Haskell, the type
inference engine can do much of that type checking without you needing to
make explicit type declarations.

If you're willing to do that, it's up to you, but it's far from common
practice in Python programming.

True. It's generally more efficient for the programmer's time to let the
function or method fail wherever it happens to fail, rather than trying
to get it to fail up front. But the cost of this is that sometimes it's
*less* efficient for the programmer, because he has no idea where the
offending object was injected into the code.

I wonder whether the best solution is to include all your type checks
(isinstance or duck-typing) in the unit tests, so you can avoid paying
the cost of those tests at runtime? If your code passes the unit tests,
but fails with real data, then your unit tests aren't extensive enough.
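
A minimal sketch of keeping the type checks in the test suite instead of in the production code path. The tokenize function is a hypothetical stand-in for whatever code produces the values that later get stored.

```python
import unittest


def tokenize(text):
    """Hypothetical function under test: split text into words."""
    return text.split()


class TokenizeTypeTests(unittest.TestCase):
    def test_returns_list_of_strings(self):
        result = tokenize("a quick duck")
        # The type checks live here, so tokenize() itself pays no runtime cost.
        self.assertIsInstance(result, list)
        for word in result:
            self.assertIsInstance(word, str)
```

The trade-off is the one described above: this only catches the bad type if some test actually exercises the input that produces it.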
 
