Is isinstance always "considered harmful"?

J

Jordan Rastrick

Hi everyone,

Just a little issue that I've come across in Python where I'd be
interested to read the thoughts and opinions of the great thinkers and
programmers who frequent this newsgroup.

I've read arguments, here and elsewhere, to the effect that in Python
isinstance should be avoided like the plague, except in a few very
specific and narrow circumstances. Roughly speaking, due in part to
Python's dynamic nature its better to concern yourself only with the
interface an object provides, and not whether it happens to inherit
from a given base class.

The problem is, sometimes theres more thats important to what a method
does than just correct behaviour.

Specifically, consider the case of Linked Lists vs the array based
random access model Python uses for its lists. Given the ease of using
Python's inbuilt list type, I'm not sure if theres much call for Linked
Lists in the language - I don't even know if they are used anywhere in
the standard library (I had assumed the deque class was implemented
with a linked list, but a friend of mine pointed out circular arrays
are an equally likely possibility, and I even saw an elegant cookbook
recipe the other day that uses a Dictionary.)

Well, whats the difference between the two data structures? If you
consider only the interface, nothing - they provide exactly the same
service, sequential storage and access to data elements. However, any
programmer who passed Data Structures and Algorithms 101 knows theres a
critical difference between the performance of the various operations a
programmer might want to carry out, and you should choose one over
the other based on which operations you expect to be most common in
whatever problem youre solving.

Hiding the implementation is a great principle, but this is a case
where the underlying implementation really matters. The difference
between an 0(1) and 0(n) algorithm is not a matter of 'fine tuning' or
an issue of 'premature optimization'; its a critical design parameter.

So to get back to the original point of the post ;) Say you're writing,
in Python, the extend() method for a Linked List version of python's
builtin list. Its really important to know what kind of iterable youre
being passed in - because if its another Linked list, and you know it,
you can connect the two in 0(1) time; whereas any other arbitrary
iterable is going to take 0(n), as you're just going to have to append
the items one by one. Is this a case where use of isinstance, to see
exactly what kind of Iterable you have, can be justified?

def extend(self, elems):
if isinstance(elems, LinkedList):
# Point last to first
else:
for elem in elems: self.append(elem)
From memory this is the way its done in Java's collection API, or at
least it was the last time I looked at the source.

There are other solutions I can think of - perhaps the least hideous is
factoring out the 0(1), "point last to first" code in a seperated
method, __linkedExtend() or something, and then do something similar to
the above by using an exception, like this:

def extend(self, elems):
try:
self.__linkedExtend(elems)
catch NotALinkedListError:
for elem in elems: self.append(elem)

I dont know, I don't really like this (although it is more BAFP than
the first version, so maybe that makes it more Pythonic?). To me,
instanceof seems like the infimum of all possible evils in this case.

It'd be nice if I'd seen the source code for Python's builtin list to
see if any of these kind of considerations are taken into account there
(ultra fast array copying in C when extend is called on another list,
perhaps)? Luckily, one of the great gifts of Python is I can indeed
look at the source for the entire langauge at any time I want. So I'm
going to go read it (my first time, how exciting!), and in the
meantime, I'll let replies start accumulating froma whole lot of
people who are a lot smarter and more experience in Python than myself
:)

Several-weeks-in-and-still-liking-Python-better-than-any-other-previously-learned-language-ly
yours,
Jordan Rastrick
 
T

Terry Reedy

I've read arguments, here and elsewhere, to the effect that in Python
isinstance should be avoided like the plague, except in a few very
specific and narrow circumstances.

Putting it like this is rather extreme.
Roughly speaking, due in part to
Python's dynamic nature its better to concern yourself only with the
interface an object provides, and not whether it happens to inherit
from a given base class.

To my mind, your example of using isinstance to select a better (such as
speedier) subalgorithm for a special case is not just fine, but good
programming. (Selecting a subalgorithm that works more robustly is also a
good reason for special casing.) It is an internal matter whose externally
visible effect is to improve performance.

Using isinstance to unnecessarily narrow the domain is quite different. It
has the externally visible effect of degrading performance (to a nullity)
for arguments that the user might reasonably want to work.

Terry J. Reedy
 
L

Leif K-Brooks

Jordan said:
Say you're writing, in Python, the extend() method for a Linked List
version of python's builtin list. Its really important to know what
kind of iterable youre being passed in - because if its another
Linked list, and you know it, you can connect the two in 0(1) time;
whereas any other arbitrary iterable is going to take 0(n), as you're
just going to have to append the items one by one. Is this a case
where use of isinstance, to see exactly what kind of Iterable you
have, can be justified?

def extend(self, elems):
if isinstance(elems, LinkedList):
# Point last to first
else:
for elem in elems: self.append(elem)

Regardless of the various issues surrounding isinstance(), you have a
difference in functionality. Since you're just storing a reference in
the case of another LinkedList instead of copying it, mutating the
LinkedList will be different from mutating another iterable type which
has been passed to extend:
linkedlist1 = LinkedList()
list1 = [1, 2, 3]
linkedlist2 = LinkedList([4, 5, 6])
linkedlist1.extend(list1)
linkedlist1.extend(linkedlist2)
linkedlist1 LinkedList([1, 2, 3, 4, 5, 6])
list1.append(4)
linkedlist1 # Notice how there's still only one 4 LinkedList([1, 2, 3, 4, 5, 6])
linkedlist2.append(7)
linkedlist1 # Notice how there's now a 7
LinkedList([1, 2, 3, 4, 5, 6, 7])
 
B

Bengt Richter

Putting it like this is rather extreme.


To my mind, your example of using isinstance to select a better (such as
speedier) subalgorithm for a special case is not just fine, but good
programming. (Selecting a subalgorithm that works more robustly is also a
good reason for special casing.) It is an internal matter whose externally
visible effect is to improve performance.
I agree, but I am also a little uncomfortable about such performance tuning,
unless the assumptions it depends on are prominently documented or even
validated with an assert or explicit warning. Otherwise the next version
of the interpreter or a library module could change the optimal decision,
and a bad optimization decision could be locked in for the new version.

Maybe there should be another testable condition like __debug__ except for
testing (e.g. __testing__ ;-) which could be used to introduce temporary
alternative code (such as alternate optimization decisions) so that system
tests could be used to validate locking in one decision or another for a
new system version being tested.

For trivial personally maintained code, a one-line version check with a
reminder exception to re-visit the optimization or whatever decision
(and revise the version check for next time) could cheaply prevent hidden
lock-in of bad optimization etc.
Using isinstance to unnecessarily narrow the domain is quite different. It
has the externally visible effect of degrading performance (to a nullity)
for arguments that the user might reasonably want to work.
Agreed, but the key thing there is to define "unnecessarily" ;-)

Regards,
Bengt Richter
 
J

Jordan Rastrick

Leif said:
Regardless of the various issues surrounding isinstance(), you have a
difference in functionality. Since you're just storing a reference in
the case of another LinkedList instead of copying it, mutating the
LinkedList will be different from mutating another iterable type which
has been passed to extend:

Oops, didn't think of that. Good point.
linkedlist1 = LinkedList()
list1 = [1, 2, 3]
linkedlist2 = LinkedList([4, 5, 6])
linkedlist1.extend(list1)
linkedlist1.extend(linkedlist2)
linkedlist1 LinkedList([1, 2, 3, 4, 5, 6])
list1.append(4)
linkedlist1 # Notice how there's still only one 4 LinkedList([1, 2, 3, 4, 5, 6])
linkedlist2.append(7)
linkedlist1 # Notice how there's now a 7
LinkedList([1, 2, 3, 4, 5, 6, 7])

Out of curiousity, is this code real? Where does the LinkedList() class
come from?
 
J

Jordan Rastrick

Thanks for the replies, everyone. As is usual when reading
comp.lang.python, I got some invaluable exposure to new ideas (multiple
dispatching in particular seems to fill a gap I've felt in OO
programming in the past.)

I'm starting to think this newsgroup is in its own right an excellent
reason to get into Python - the fact its an awesome language is just an
added bouns ;-)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,982
Messages
2,570,186
Members
46,739
Latest member
Clint8040

Latest Threads

Top