David MacQuigg said:
The setting of __self__ happens automatically, just like the setting
of the first argument in a call from an instance. The user doesn't
have to worry about it.
Explicit is better than implicit, especially in getting people to
understand things.
In fact, I can't think of a circumstance
where the user would need to explicitly set __self__. Maybe some
diagnostic code, in which case having available a system variable like
__self__ is a plus. You can, without any loss of functionality in a
normal program, never mention __self__ in an introductory course. The
user doesn't need to know what it is called.
I can think of a time where I'd both want to do it and couldn't;
map(str.strip, list_of_str)
See what I mean about it breaking the functional side of things?
Without some definition about what's a static method the interpreter
can't resolve things like this. In practice, map and a whole bunch of
functions would have to be modified to explicitly set __self__.
An implicit self is what leads to the whole
mem_fun(&std::vector<int>::push_back) (or whatever the exact syntax
is, it's too unusable to bother learning) in C++.
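To make the map case concrete, here is a small sketch in present-day Python 3 syntax (the sample list is made up for illustration). Because str.strip takes its instance as an explicit first argument, it can be handed to map like any other function:

```python
# Hypothetical sample data, for illustration only.
list_of_str = ["  spam ", "eggs\n", "\tham"]

# str.strip looked up on the class is an ordinary callable whose
# first argument is the instance, so map can supply it explicitly.
stripped = list(map(str.strip, list_of_str))

# The same call spelled out by hand for one element.
by_hand = str.strip(list_of_str[0])

print(stripped)
print(by_hand)
```

With an implicit __self__ there would be no argument slot for map to fill, which is the difficulty described above.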
My preference is to give it a name and highlight it with double
underscores. To me that makes the discussion more concrete and
explicit, and builds on concepts already understood. Don't forget,
the students already understand global variables at this point in the
course. The "magic" of setting a particular global variable to an
instance is about the same as the magic of inserting that instance as
a first argument in a function call. The problem in either syntax is
not the magic of setting 'self' or '__self__'.
The specialness of the first argument isn't much, I agree, but it is
enough to make the calling sequence different from a normal function
or a static method. It is these differences that the new syntax gets
rid of, thereby enabling the unification of all methods and functions,
and simplifying the presentation of OOP. Methods in the new syntax
are identical to functions (which the students already understand),
except for the presence of instance variables.
Instance variables are the one fundamental difference between
functions and methods, and one that we wish to focus our entire
attention on in the presentation.
Except that that distinction doesn't exist in Python, since calling an
instance variable is an explicit call to a member of an instance. If
you are trying to focus your presentation on something which doesn't
exist in Python then things are naturally going to be awkward. I
would suggest that it's not a problem with the language, though.
Any new and unnecessary syntactic
clutter is a distraction, particularly if the new syntax is used in
some cases (normal methods) but not others (static methods).
If you really want to get something like this accepted then the
closest thing which *might* have a chance would be to redefine the
calling sequence of some_class.method() so that it doesn't seek a self
argument so long as ``method'' was defined without one. I don't think
this would break anyone's code (I may well be wrong of course) since
the sequence isn't currently valid in execution (though IIRC it will
compile to bytecode).
This makes the distinction between static and normal methods about as
simple as it can be; methods are just class methods which operate on
an instance. You can show how it works in a few lines of an example
(though, currently, it should only take a few more.)
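As a rough sketch of how few lines that takes with the current syntax (the method names and strings are invented for the example, not taken from Animals.py): a static method is a plain function stored on the class, and a normal method is the same kind of function that also receives an instance.

```python
class Mammal:
    def __init__(self, name):
        self.name = name

    # Normal method: operates on an instance, passed explicitly as self.
    def describe(self):
        return "%s is a mammal" % self.name

    # Static method: no instance involved, so no self argument.
    def kind():
        return "warm-blooded vertebrate"
    kind = staticmethod(kind)   # the explicit wrapping step


rex = Mammal("Rex")
print(rex.describe())          # called through an instance
print(Mammal.describe(rex))    # same call with the instance made explicit
print(Mammal.kind())           # called on the class, no instance needed
```

The only extra line over the suggested change is the staticmethod() wrapping.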
The Mammal.show() function *is* specific to Mammal. I think what you
are saying is that calling Mammal.show() results in a display of
characteristics of both Mammal and its ancestor Animal.
No, it's not. Let me try to be totally clear here: the numMammals
data member contains data not just about Mammal, but also about
instances of its subclasses. This is the problem. The fact that
it's accessed through the show method is really just a detail, though
the presence of show in other subclasses compounds the problem.
That is a
requirement of the problem we are solving, not a result of bad
programming. We want to see *all* the characteristics of Mammal,
including those it inherited from Animal.
You are not solving a problem; that's the problem.
If there were a
real programming task then it would be more trivial to show why your
object model is broken.
Leave out the call to Animal.show() if you don't want to also see the
ancestor's data.
The proposed Inventory() function is a general function that *would*
be appropriate outside a class. The existing class-specific functions
like Mammal.show() are unique to each class. I tried to make that
clear in a short example by giving each data item a different text
label. I've now added some unique data to the example just so we can
get past this stumbling block. A real program would have a multi-line
display for each class, and there would be *no way* you could come up
with some general function to produce that display for any class.
Learning Python, 2nd ed. would be appropriate for a one-semester
course. My problem is that I have only a fraction of a semester in a
circuit-design course. So I don't cover OOP at all. I would include
OOP if I could do it with four more hours. Currently Python is a
little over the top. I don't think it is a problem with Lutz's book.
He covers what he needs to, and at an appropriate pace.
If you can't take it below 70 pages and you only have 4 hours... maybe
it's not such a great idea to try this? I can't see your students
benefiting from what you're proposing to do, if you have so little
time.
These are concepts that design engineers understand very well. I
wouldn't spend any time teaching them about modularity, but I would
point out how different program structures facilitate modular design,
and how syntax can sometimes restrict your ability to modularize as
you see fit. Case in point: The need for static methods to put the
show() functions where we want them.
What data are we talking about? numMammals is specific to Mammal.
genus is specific to Feline, but *inherited* by instances of a
subclass like Cat.
The numAnimals etc... data, which is stored in Animals but gets
arbitrarily altered by the actions of subclasses of Animal, and
therefore is not specific to animal; it doesn't represent the state of
the Animal class or of Animal objects, but of a whole bunch of
subclasses of Animal.
These are normal programming errors that can occur in any program, no
matter how well structured. I don't see how the specific structure of
Animals.py encourages these errors.
Imagine if your structure had been implemented as one of the basic
structures of, say, Java. That is, some static data in the Object
class stores state for all the subclasses of Object. Now, someone
coming along and innocently creating a class can break Object -
meaning that may break anything with a dependency on Object, which is
the entire system. So I write a nice GUI widget and bang! by some
bizarre twist it breaks my program somewhere else because of an error
in, say, the StringBuffer class. This is analogous to what you are
implementing here.
While errors are always going to happen, OOP calls on some conventions
to minimize them. The most absolutely vital of these is that it's
clear what can break what. Generally I should never be able to break
a subsystem by breaking its wrapper; definitely I should never be
able to break a superclass by breaking its subclass; and I
*certainly* shouldn't be able to break a part of the system by
changing something unconnected to it. The whole of OOP derives, more
or less directly, from these principles. Expressions like 'A is a
part/type of B' derive from this philosophy, not the other way around.
Your program breaks with this concept. It allows an event in Cat to
affect data in Mammal and in Animal, which also has knock-on effects
for every other subclass of these. Therefore it is bad object
oriented programming.
It takes us back to the days before even structured programming, when
no-one ever had any idea what the effects of altering or adding a
piece of code would be.
It is therefore not a good teaching example.
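A minimal sketch of the failure mode, assuming Animals.py keeps its counters roughly like this (the class bodies and the Dog class are my reconstruction for illustration, not the original code):

```python
class Animal:
    numAnimals = 0
    def __init__(self):
        Animal.numAnimals += 1

class Mammal(Animal):
    numMammals = 0
    def __init__(self):
        Animal.__init__(self)
        Mammal.numMammals += 1

class Cat(Mammal):
    numCats = 0
    def __init__(self):
        Mammal.__init__(self)
        Cat.numCats += 1

class Dog(Mammal):
    # Hypothetical later addition whose author forgot to chain up
    # to Mammal.__init__, so no ancestor counter is touched.
    def __init__(self):
        pass

pets = [Cat(), Dog()]
print(len(pets), Animal.numAnimals, Mammal.numMammals)
```

Two animals exist, yet Animal and Mammal each report one. Nothing in Animal or Mammal changed; the mistake is entirely in Dog, but the wrong numbers show up in its ancestors.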
The "inventory data" actually consists of independent pieces of data
from each class. ( numCats is a piece of inventory data from the Cat
class.) I'm sorry I just can't follow this.
numMammals OTOH is not just a piece of data from one class - it's a
piece of data stored in one class, but which stores data about events
in many different classes, all of which are outside its scope.
Seems like this is the way it has to be if you want to increment the
counts for Cat and all its ancestors whenever you create a new
instance of Cat. Again, I'm not understanding the problem you are
seeing. You seem to be saying there should be only methods, not data,
stored in each class.
That's the way it has to be, if you want to write it like that.
However there is nothing to say that a given problem must use a
certain class structure. If you come up with a solution like this
then it's near-guaranteed that there was something badly wrong with
the way you modelled the domain. Either the program shouldn't need to
know the number of instances which ever existed of subclasses of
mammal or else your class structure is wrong.
And, as a general rule, you should think carefully before using classes
to store data; that's typically what objects are for. I used static
data in programs quite a lot before I realised that it too often bit
me later on.
To try and get to the bottom of this, I re-wrote the Animals.py
example, following what I think are your recommendations on moving the
static methods to module-level functions. I did not move the data out
of the classes, because that makes no sense to me at all.
*Sigh* No, I must say that doesn't help much. :-\
As I said, there is something wrong with the whole idea behind it; the
design needs refactoring, not individual lines of code.
Having said that, I'll try to redact the issues as best I can, on the
basis that it may illustrate what I mean.
OK: start with the basics. We need iterative counting data about the
individual elements of the hierarchy.
The first thing is that we need to factor out the print statements.
Your back-end data manipulation modules should never have UI elements
in them. So, whatever form the data manipulation comes in, it should
be abstract.
Secondly, we want to keep the data stored in each class local to that
class. So, Mammal can store the number of Mammals, if that turns out
to be a good solution, but not the number of its subclasses. OTOH we
could remove the data from the classes altogether.
Thirdly, it would probably be nice if we had the ability to implement
the whole thing in multiple independent systems. Currently the design
only allows one of "whatever-we're-doing" at a time, which is almost
certainly bad.
After a bit of brainstorming this is what I came up with. It's not a
specific solution to your problem; instead it's a general one. The
following class may be sub-classed and an entire class hierarchy can
be placed inside it. It will then automatically generate the code to
keep track of and count the elements of the class hierarchy,
returning the data you want at a method call.
This is done with a standard OO tool, the Decorator pattern, but
ramped up with the awesome power of the Python class system.
class Collective:
    class base: pass

    def startup(self, coll, root):
        #wrapper class to count creations of classes
        self.root = root
        class wrapper:
            def __init__(self, name, c):
                self.mycount = 0
                self.c = c
                self.name = name
            def __call__(self, *arg):
                #create the instance once, then count it
                tmp = self.c(*arg)
                self.mycount += 1
                return tmp
        self.wrapper = wrapper
        #replace every class derived from root with a wrapper
        #plus build a table of the wrappers
        self.wrap_list = []
        for name, elem in coll.__dict__.items():
            try:
                if issubclass(elem, self.root):
                    tmp = wrapper(name, elem)
                    self.__dict__[name] = tmp
                    self.wrap_list.append(tmp)
            except TypeError: pass  #elem is not a class

    #when subclassing, override this
    #call startup with the class name
    #and the root of the class hierarchy
    def __init__(self):
        self.startup(Collective, self.base)

    #here's the stuff to do the counting
    #this could be much faster with marginally more work
    #exercise for the reader...
    def get_counts(self, klass):
        counts = [ (x.c, (self.get_sub_count(x), x.name))
                   for x in self.super_classes(klass) ]
        counts.append( (klass.c, (self.get_sub_count(klass),
                                  klass.name)) )
        #sort ancestors first: a class's depth is the number of
        #listed classes it derives from (itself included)
        listed = [x[0] for x in counts]
        counts.sort(key=lambda x: len([c for c in listed
                                       if issubclass(x[0], c)]))
        return [x[-1] for x in counts]

    def get_sub_count(self, klass):
        count = klass.mycount
        for sub in self.sub_classes(klass):
            count += sub.mycount
        return count

    def super_classes(self, klass):
        return [x for x in self.wrap_list
                if issubclass(klass.c, x.c) and x.c is not klass.c]

    def sub_classes(self, klass):
        return [x for x in self.wrap_list
                if issubclass(x.c, klass.c) and x.c is not klass.c]
So we can now do:
class animal_farm(Collective):
    class Animal: pass
    class Mammal(Animal): pass
    class Bovine(Mammal): pass
    class Feline(Mammal): pass
    class Cat(Feline): pass
    def __init__(self):
        self.startup(animal_farm, self.Animal)

a_farm = animal_farm()
cat = a_farm.Cat()
mammal = a_farm.Mammal()
print(a_farm.get_counts(a_farm.Feline))
[(2, 'Animal'), (2, 'Mammal'), (1, 'Feline')]
The above code is 51 lines with about 10 lines of comments. For a
project of any size, this is a heck of an investment; I believe it
would take a fairly determined idiot to break the system, and *most
importantly*, they would be able to trace back the cause from the
effect fairly easily.
Admittedly the solution is on the complicated side, though perhaps
someone with more experience than me could simplify things.
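One possible simplification, offered as a sketch rather than a replacement for the class above: keep the tally in an ordinary object and walk the class's __mro__ whenever an instance is created. The Census name and its create method are invented here; it assumes new-style classes (Python 3, or classes deriving from object), and unlike Collective it only counts instances made through create.

```python
from collections import Counter

class Census:
    """Counts instances per class, crediting every ancestor class too."""

    def __init__(self, root):
        self.root = root
        self.counts = Counter()

    def create(self, cls, *args):
        # Build the instance, then credit cls and each ancestor that
        # belongs to the hierarchy under root.
        obj = cls(*args)
        for klass in cls.__mro__:
            if issubclass(klass, self.root):
                self.counts[klass] += 1
        return obj

    def get_counts(self, cls):
        # Ancestors first: reverse the MRO, skipping classes outside root.
        chain = [k for k in reversed(cls.__mro__)
                 if issubclass(k, self.root)]
        return [(self.counts[k], k.__name__) for k in chain]


class Animal: pass
class Mammal(Animal): pass
class Bovine(Mammal): pass
class Feline(Mammal): pass
class Cat(Feline): pass

farm = Census(Animal)
cat = farm.create(Cat)
mammal = farm.create(Mammal)
print(farm.get_counts(Feline))

other_farm = Census(Animal)      # a second, independent tally
print(other_farm.get_counts(Feline))
```

The classes themselves hold no counting data at all, and two Census objects never interfere with each other, which covers the second and third points in the list of requirements.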
Unfortunately, a certain amount of complexity is just a reflection of
the fact that your demands strain the OO paradigm right to its limit.
You could possibly implement the same thing in Java with a Factory
pattern, and perhaps the reflection API.
(Of course I'm none too sure I could do that after many years of
hacking Java vs a few weeks of Python!)