Unification of Methods and Functions

Greg Ewing

David said:
We disagree on what is "waffle" and what are beneficial extra words.
Your explanation, to me, seems more like a "man page" than a textbook
explanation.

The number of words used isn't really the issue here.
Any explanation can be expanded or contracted to adjust
the pace to the intended audience. What's important is
the structure of the explanation, and whether the structure
forms a logical flow of ideas that can be easily followed.

The original explanation you quoted, it seems to me, is
harder to follow than necessary because it takes a somewhat
tortuous path to explaining what is going on.

It starts out by saying "Some of the variables in
this function are prefixed by 'self.'...", as if something
very new and mysterious is being introduced. It then spends
the next couple of paragraphs explaining this new concept.

At the end of all this, the reader, if he/she is sharp
enough, will realise that it isn't really a new thing at all,
but just a particular application of something already
encountered, namely attribute access. If he/she isn't sharp
enough, he/she may be left thoroughly confused and still
thinking it's some new mysterious feature.

I was trying to show how the presentation could be made
easier to follow by starting from the other end, and
explaining how Python makes use of things already
encountered -- parameter passing and attribute access --
to provide instance variables.
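
A minimal sketch of that ordering, written for Python 3. The Counter class and its names are made up for illustration and are not from the tutorial being discussed:

```python
class Counter:  # hypothetical example class
    def __init__(self, start):
        # 'self' is an ordinary parameter; 'self.count' is ordinary
        # attribute access on whatever object was passed in.
        self.count = start

    def bump(self):
        self.count = self.count + 1
        return self.count


c = Counter(10)
# Calling through the instance passes 'c' as the first argument.
first = c.bump()
# The same call spelled out, with the instance passed explicitly.
second = Counter.bump(c)
print(first, second, c.count)
```

Nothing here is new machinery: the instance variable is just an attribute set on a parameter.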

In other words, your solution to badly-written tutorials
appears to be to redesign the language. Mine, on the other
hand, would be to re-write the tutorials so that they're
better.
 
Greg Ewing

David said:
Wow,
you mean staticmethods aren't fundamentally necessary, just a bandaid
to make up for Python's deficiencies?

In Python, staticmethods are not fundamentally necessary,
full stop. We keep trying to tell you that, but it seems
you have your fingers in your ears.

The only possible reason to use a staticmethod in Python
is if you want it to be in the class's namespace. But that's
not a need, it's only a want. There are other ways of
resolving namespace issues.

Python didn't even *have* staticmethods until very
recently, and programmers got on just fine without them.
The only reason they were added is that the new descriptor
mechanism made it easy to do so.

In the past, every now and then someone (coming from C++)
would ask how to create a static method in Python. They were
added simply to appease these people. Regular Python programmers
just carried on as usual and ignored staticmethods.

Even now, staticmethods could be removed from the language
entirely and very little would be affected.
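
To illustrate, here is a small Python 3 sketch. The Thermometer class and the conversion helper are invented for the example:

```python
def celsius_to_fahrenheit(c):
    # A plain module-level function: no class, no staticmethod wrapper.
    return c * 9 / 5 + 32


class Thermometer:  # hypothetical example class
    def __init__(self, celsius):
        self.celsius = celsius

    def fahrenheit(self):
        # The method simply calls the module-level helper.
        return celsius_to_fahrenheit(self.celsius)

    # The same helper placed in the class namespace. This is the "want",
    # not the "need": it only changes where the name lives.
    to_fahrenheit = staticmethod(celsius_to_fahrenheit)


boiling = Thermometer(100).fahrenheit()
freezing = Thermometer.to_fahrenheit(0)
print(boiling, freezing)
```

Both spellings compute the same thing. The staticmethod only moves the name into the class's namespace.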
 
Greg Ewing

David said:
Static methods
are necessary because it is a very natural thing to write a method in
a class that needs to work without being bound to any particular
instance.

I beg to differ. It might seem natural to someone who's
been exposed to C++ or Java, but I don't think it's
a priori a natural thing at all.

In Python, a function defined inside a class is a method.
If you want a function to be a method, you put it inside
a class. If you don't want it to be a method, you don't
put it inside a class. It's as simple as that.
 
Donn Cave

Quoth David MacQuigg <[email protected]>:
....
|> My own perspective is, present
|> the core language and leave them to solve problems in that model,
|> "bad programming" or not.
|
| This could be an argument for Perl. :>)

Yes, Perl is really aimed straight at this kind of person, and is
not a bad choice given a careful exposure.

Donn Cave, (e-mail address removed)
 
James Moughan

David MacQuigg said:
On 6 May 2004 08:37:50 -0700, (e-mail address removed) (James Moughan) wrote:

I don't want to argue implementation details, as I am no expert, but I
think you are saying something is wrong at the user level, and that
puzzles me.

A global function, if I understand your terminology correctly, is one
defined at the module level, outside of any class. Such a function
cannot have instance variables. If you were to reference that
function from within a class, it would just act as a normal function
(a static method in Python terminology). I can't see the problem.

Let me give an example:

def getLength(s): return s.length

class Foo:
    length = 5

Foo.getLength = getLength

foo = Foo()
print length(foo), foo.length()

A method in a class in Python is just like a global function; for a
global function to operate on an object, it must take it as an
argument. The prototype syntax would appear to break the above
example.
The difference in the proposed syntax is that it doesn't need the
staticmethod wrapper to tell the interpreter -- don't expect a special
first argument. In the new syntax all functions/methods will have the
same calling sequence.

If a method doesn't operate on the data from an object then as a rule
it should be global. There are exceptions, but they generally don't
occur in Python so much as in a 'true oo' language like Java.
I've looked at a few introductions to Python, and in my opinion
Learning Python, 2nd ed, by Mark Lutz and David Ascher is the best.
It strikes a good balance between the minimal presentation that
tersely covers the essentials for an experienced programmer vs the
long, windy introductions that put you to sleep with analogies to car
parts and other "objects". Lutz takes 95 pages to cover OOP. I think
I could do a little better, maybe 70 pages, but that may be just my
ego :>)

When you say ten pages, you must be making some very different
assumptions about the students or their prior background, or the
desired level of proficiency. The least I can imagine is about 30
pages, if we include exercises and examples. And that 30 assumes we
get rid of all the unnecessary complexity (static methods, lambdas,
etc.) that currently fills the pages of Learning Python.

I'm assuming they already know the general structures of programming in
Python, and that you can then just show them how to package data and
methods into a class with a clear example, by rewriting a program
you've shown them before. After that it's mainly a question of
explaining why you should do it, which is probably rather more
important than how.

I've never met anyone who had difficulty in understanding anything
about the syntax of OO other than the class/object distinction. It's
fundamentally very simple once you have a basis in all the other
programming techniques.

Unless you're talking about the entire programming course, 70 pages is
waaay too much - your students just will not read them, regardless of
how brilliant they are.
We have the usual dose of terminology problems here. The term 'static
method' in Python may be different than in other languages. In Python
it just means a method that has no instance variables, no special
first argument, and an extra 'staticmethod' line, to tell the
interpreter not to expect a special first argument.

OK; no particular difference from Java/C++/etc.
Python does have "encapsulation" but does not have "hiding", by my
understanding of these words. The idea is that __private variables
are easily identified to avoid accidents, but there is no attempt to
stop deliberate access to such variables. This is a design philosophy
that I like.

*Nods* That's why I said 'worth anything'; the idea of encapsulation,
in theory, is to prevent screw-ups, which is why hiding is key. Not
that it's necessarily a good theory. :) I also like Python's
philosophy here.
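
A small Python 3 sketch of what "identified but not hidden" means in practice. The Account class is a made-up example. The behaviour shown is the standard name mangling of double-underscore attributes:

```python
class Account:  # hypothetical example class
    def __init__(self, balance):
        # Inside the class body, __balance is rewritten to _Account__balance.
        self.__balance = balance

    def balance(self):
        return self.__balance


acct = Account(100)

# Accidental access from outside fails, because no mangling happens here.
try:
    acct.__balance
    direct_access_worked = True
except AttributeError:
    direct_access_worked = False

# Deliberate access is still possible through the mangled name.
mangled = acct._Account__balance
print(direct_access_worked, mangled)
```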
This is a textbook introduction, not a real program. The purpose of
the example is to show a complete OOP hierarchy in a small example,
with a good selection of the method styles most needed in a real
program. The similarity between the show() methods in different
classes would not be so tempting to reduce to one global function if I
had made a larger example, with more radically different outputs from
each show function. I thought that just changing one string in each
function would be enough to say "This function is different."

OOP is a tool. Its abstraction makes it tempting to create arbitrary
structures as examples, but doing that loses any sense of the reason
why you would use that tool. That's why I see people with CS degrees
who can throw around objects and hierarchies at will but who can't
structure a simple program effectively.
You are not the only one who had this reaction. See my reply to Donn
Cave above. I guess I need to throw in a little more "meat", so that
experienced programmers don't get distracted by the possibility of
making the whole program simpler by taking advantage of its
regularities. This is the same problem I've seen in many texts on
OOP. You really can't see the advantages of OOP in a short example if
you look at it with the attitude -- I can do that much more easily
without classes. It's when you get to really big complex hierarchies
that the advantages of OOP become clear.

Learning to program is about 5% how to do something, and 95% when and
why you should do it. You seem to be focusing almost exclusively on
how, which I suspect is why we're all so upset :) you get that way
when you have to fix the code which eventually results.

I thought this part was pretty clear. The show() method at each level
calls the show() method one level up, then adds its own stuff at the
end. Feline.show() calls Mammal.show(), which prints lots of stuff
characteristic of mammals, all in a format unique to the Mammal class.
Mammal.show() in turn calls Animal.show(). At each level we have some
unique display of characteristics. The purpose is to have a call at a
particular level show all characteristics of the animal from that
level up.

It is clear, just not a good idea.
Both responses I have on this are basically experts saying -- you can
solve this particular problem more easily by restructuring it. I
should have been more clear. Imagine that each of these classes and
methods is fully expanded with lots of parameters to make each one
unique. Don't even think about re-structuring, unless you are trying
to tell me that the whole idea of having these structures in any
program is wrong. That would surprise me, since this "animals"
example is a common illustration of OOP.

OK: "The whole idea of having these structures in any program is
wrong."

Firstly, the program uses a class hierarchy as a data structure. That
isn't what class hierarchies are designed for, and not how they should
be used IMO. But it's what any bright student will pick up from the
example.

Secondly, it breaks the entire concept of OOP. Objects are designed
to be individual entities with side-effects restricted to their scope,
in order to modularize programs.

In this example, you use side effects from one class to influence the
output of another; a bovine will end up influencing the output of a
cat, for example. And the effect is completely implicit. As a
result, someone who is introducing a mouse class can break another
part of your system without the faintest idea that they are affecting
it. It's a fragile structure leading to the almost inevitable
creation of the most intractable type of bug.

The fact that Python makes it hard to do is *good*.

What I'm looking for is not clever re-structuring, but just a
straightforward translation, and some comments along the way -- oh
yes, that is a little awkward having to use a staticmethod here. Wow,
you mean staticmethods aren't fundamentally necessary, just a bandaid
to make up for Python's deficiencies? That was my reaction when I
first saw Prothon.

Static methods are more like a band-aid to make up for the
deficiencies of OOP. Python isn't a pure OO language, and doesn't
suffer the need for them badly.
 
David MacQuigg

Let me give an example:

def getLength(s): return s.length

class Foo:
    length = 5

Foo.getLength = getLength

foo = Foo()
print length(foo), foo.length()

The last two function calls don't work. There is no such function
named 'length' and foo.length is an integer, not a function.

Let me try to re-write this example as I think you intended:

def getLength(self): return self.length # a global function

class Foo:
    length = 5

Foo.getLength = getLength

foo = Foo()

foolen = foo.getLength # a bound method
FooLen = Foo.getLength # an unbound method

print foolen() #=> prints '5'
print FooLen(foo) #=> prints '5'

The example in the proposed syntax would be the same, except there
would be no 'self' in the function definition, and there would be no
magic first argument 'foo' in the last call. Also, if you are calling
a function that has an instance variable ( .length ) and no instance
has been set by a prior binding, you would need to set __self__
manually.

__self__ = foo; print FooLen()

This is rarely needed. Normally a call to an unbound function would
be in a context where __self__ is already set. From the Animals_2.py
example: cat1.talk() calls Cat.talk() which calls Mammal.talk().
__self__ is set to cat1 on the first call, and it is not changed by
the call to the unbound function Mammal.talk().
A method in a class in Python is just like a global function; for a
global function to operate on an object, it must take it as an
argument. The prototype syntax would appear to break the above
example.

Global functions have no instance variables, so there is no need for a
special first argument. A Python method requires a special first
argument (even if it is not used). In the proposed syntax, global
functions and class methods would have the same calling sequence ( no
special first argument ). If the method has instance variables, it
will use the global __self__ object, which was set to whatever
instance called the method.

Again, in Animals.py: Cat.show() has no instance variables, so it just
ignores __self__. Cat.talk() has instance variables .name and .sound
-- so these are interpreted as __self__.name and __self__.sound

If a method doesn't operate on the data from an object then as a rule
it should be global. There are exceptions, but they generally don't
occur in Python so much as in a 'true oo' language like Java.

The placement of a function at the module level or in a class should
be determined by the nature of the function, not any syntax problems.
If the function has characteristics unique to a class, it ought to be
included with that class. The Mammal.show() function, for example,
provides a display of characteristics unique to mammals, so we put it
in class Mammal. We could have written a general-purpose Inventory()
function to recursively walk an arbitrary class hierarchy and print
the number of instances of each class. That general function would be
best placed at the global level, outside of any one class.
I'm assuming they already know the general structures of programming in
Python, and that you can then just show them how to package data and
methods into a class with a clear example, by rewriting a program
you've shown them before. After that it's mainly a question of
explaining why you should do it, which is probably rather more
important than how.

*Why* is not an issue with my students. They will have plenty of
examples in our circuit design program to look at. If anyone ever
asks "Why OOP", I'll just have them look at the Qt Toolkit, which will
be the basis of our user interface. Enormous complexity, packaged in
a very easy-to-use set of objects.

I need to show them *how* to understand the OOP structures in that
program, and do it in the most efficient way. The chapter on
Prototypes at http://ece.arizona.edu/~edatools/Python/ is my best
effort so far. I've got the basic explanation down to 10 pages. I
expect this to expand to about 30 with examples and exercises. That
will be about four hours of additional work, including lecture,
reading, and practice.
I've never met anyone who had difficulty in understanding anything
about the syntax of OO other than the class/object distinction. It's
fundamentally very simple once you have a basis in all the other
programming techniques.

I guess you haven't met anyone learning C++. :>)
Unless you're talking about the entire programming course, 70 pages is
waaay too much - your students just will not read them, regardless of
how brilliant they are.

Learning Python, 2nd ed. by Mark Lutz and David Ascher is generally
considered the best introductory text on Python. 96 pages on OOP.

[snip a few paragraphs where we agree !! :>) ]
OOP is a tool. Its abstraction makes it tempting to create arbitrary
structures as examples, but doing that loses any sense of the reason
why you would use that tool. That's why I see people with CS degrees
who can throw around objects and hierarchies at will but who can't
structure a simple program effectively.


Learning to program is about 5% how to do something, and 95% when and
why you should do it. You seem to be focusing almost exclusively on
how, which I suspect is why we're all so upset :) you get that way
when you have to fix the code which eventually results.

The OOP presentations I've seen that focus as much as 50% on *why*
generally leave me bored and frustrated. I feel like screaming --
Stop talking about car parts and show me some nice code examples. If
it's useful, I'm motivated. Good style is a separate issue, also best
taught with good examples (and some bad for contrast).
It is clear, just not a good idea.


OK: "The whole idea of having these structures in any program is
wrong."

Firstly, the program uses a class hierarchy as a data structure. That
isn't what class hierarchies are designed for, and not how they should
be used IMO. But it's what any bright student will pick up from the
example.

The classes contain both data and functions. The data is specific to
each class. I even show an example of where the two-class first
example forced us to put some data at an inappropriate level, but with
a four class hierarchy, we can put each data item right where it
belongs.
Secondly, it breaks the entire concept of OOP. Objects are designed
to be individual entities with side-effects restricted to their scope,
in order to modularize programs.

In this example, you use side effects from one class to influence the
output of another; a bovine will end up influencing the output of a
cat, for example. And the effect is completely implicit. As a
result, someone who is introducing a mouse class can break another
part of your system without the faintest idea that they are affecting
it. It's a fragile structure leading to the almost inevitable
creation of the most intractable type of bug.

The fact that Python makes it hard to do is *good*.

Nothing in the Bovine class can affect anything in a Cat. Feline and
Bovine are independent branches below Mammal. Adding a Mouse class
anywhere other than in the chain Cat - Feline - Mammal - Animal cannot
affect Cat. Could you give a specific example?

I'm not sure what you mean by "side effects" here. The show()
function at each level is completely independent of the show()
function at another level. Chaining them together results in a
sequence of calls, and a sequence of outputs that is exactly what we
want. The nice thing about separating the total "show" functionality
into parts specific to each class is that when we add a class in the
middle, as I did with Feline, inserted between Mammal and Cat, it is
real easy to change the Cat class to accommodate the insertion.

Python has a 'super' function to facilitate this kind of chaining.
Michele Simionato's 'prototype.py' module makes 'super' even easier to
use. Instead of having Cat.show() call Mammal.show() I can now just
say super.show() and it will automatically call the show() function
from whatever class is the current parent. Then when I add a Feline
class between Mammal and Cat, I don't even need to change the
internals of Cat.
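
A rough Python 3 sketch of this chaining, using the built-in super() rather than the 'prototype.py' spelling. The class names follow the Animals example, but the returned strings are placeholders and not the real output of that program:

```python
class Animal:
    def show(self):
        return ["Animal"]            # placeholder for the Animal display


class Mammal(Animal):
    def show(self):
        # Parent's part first, then this level's own part.
        return super().show() + ["Mammal"]


class Feline(Mammal):
    def show(self):
        return super().show() + ["Feline"]


class Cat(Feline):
    def show(self):
        # Cat names no parent class here, so inserting Feline between
        # Mammal and Cat only required changing the 'class Cat(...)' line.
        return super().show() + ["Cat"]


lines = Cat().show()
print("\n".join(lines))
```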
Static methods are more like a band-aid to make up for the
deficiencies of OOP. Python isn't a pure OO language, and doesn't
suffer the need for them badly.

In one syntax we need special "static methods" to handle calls where a
specific instance is not available, or not appropriate. In another
syntax we can do the same thing with one universal function form.

-- Dave
 
David MacQuigg

I beg to differ. It might seem natural to someone who's
been exposed to C++ or Java, but I don't think it's
a priori a natural thing at all.

I'm not fond of C++ or Java. Python is my first actual use of OOP.
In Python, a function defined inside a class is a method.
If you want a function to be a method, you put it inside
a class. If you don't want it to be a method, you don't
put it inside a class. It's as simple as that.

I really don't care if we call something a method or a function. What
I care about is that a program is well-structured and easy to
understand. I seem to be getting a lot of heat for putting the show()
functions in multiple classes in my Animals_2.py example. The
alternatives proposed seem to make the structure less clear.

Instead of having each part of the total show() functionality in its
appropriate class [1] I'm told that I need to put all of this at the
global level. One suggestion was to find some regularity in all the
show functions and write a global function to generate all outputs
based on that regularity. This won't work, because in general, there
is no such regularity.

Another possibility I have thought about is having a global function
that selects among four blocks, depending on the type of the instance
that called the show() function. If it's a Cat, we need all four
output blocks. If it's a Mammal, we need just Mammal and Animal. In
other words, replicate the inheritance behavior of classes with some
kind of if ... elif ... elif ... elif ... structure. This will be a
pain to write, compared to the elegance of just having each show()
function call its parent.
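
For comparison, a sketch of what that global function might look like in Python 3. The block names are placeholders for the real per-class output, and the empty classes stand in for the Animals hierarchy:

```python
class Animal: pass
class Mammal(Animal): pass
class Feline(Mammal): pass
class Cat(Feline): pass


def show(obj):
    """Global show(): re-creates inheritance by testing the type by hand."""
    kind = type(obj)
    if kind is Cat:
        blocks = ["Animal", "Mammal", "Feline", "Cat"]
    elif kind is Feline:
        blocks = ["Animal", "Mammal", "Feline"]
    elif kind is Mammal:
        blocks = ["Animal", "Mammal"]
    elif kind is Animal:
        blocks = ["Animal"]
    else:
        # Every new class means another branch has to be added here.
        raise TypeError("show() does not know about %s" % kind.__name__)
    return blocks


cat_blocks = show(Cat())
mammal_blocks = show(Mammal())
print(cat_blocks, mammal_blocks)
```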

The real problem with these alternatives, however, is that it puts
stuff that belongs in a particular class, outside of that class. This
will make the code harder to understand, and harder to maintain. When
I need to add a Feline class between Mammal and Cat, I must remember
to edit the global show() function, adding stuff unique to Feline, and
changing the logic of the selections to include the new block. With a
large hierarchy of classes this is going to be a mess, certainly more
mess than just adding a few 'staticmethod' lines to the existing
classes.

Maybe I'm missing an option here. I don't claim to be an expert in
Python.

-- Dave

[Note 1] A function belongs in a class if the variables it works with
are unique to that class. If it works with variables from several
classes, it belongs in the class of a common ancestor. For optimum
modularity and encapsulation, we should put each function as low as
possible in the hierarchy of classes, without excessive duplication.
The tradeoff is you want to avoid crowding all your functions at the
top, and avoid unnecessary replication of the same functionality in
multiple classes at the bottom.

Example: If I had a function that needed only Cat variables, except
for one variable from Parrot, I would not put that function in the
Animal class. The better tradeoff is to leave it a Cat function, and
reach out with a fully-qualified reference to grab that one Parrot
variable. ( I might also put a ### flag on the line in Cat where that
outside variable is needed, so I don't later forget about this
external dependency.)
 
David MacQuigg

Greg Ewing wrote:
In Python, staticmethods are not fundamentally necessary,
full stop. We keep trying to tell you that, but it seems
you have your fingers in your ears.

I will stop this subthread right here unless you change your tone.
This kind of argument won't convince me you are right, or anyone else
with any intelligence.

-- Dave
 
James Moughan

David MacQuigg said:
The last two function calls don't work. There is no such function
named 'length' and foo.length is an integer, not a function.

Ooops, let me re-write it how I meant it :)


def getLength(s): return s.length

class Foo:
    length = 5

Foo.getLength = getLength

foo = Foo()
print getLength(foo), Foo.getLength(foo), foo.getLength()

Obviously I meant getLength, not length, on the last line.
The example in the proposed syntax would be the same, except there
would be no 'self' in the function definition, and there would be no
magic first argument 'foo' in the last call.

foo isn't a 'magic argument'; I'm calling a function with a parameter,
which happens to be an object. This is normal. The example stresses
that methods are functions which just happen to be attached to a
class.
Also, if you are calling
a function that has an instance variable ( .length ) and no instance
has been set by a prior binding, you would need to set __self__
manually.
__self__ = foo; print FooLen()

???!!!???

This is what I was talking about in my first post, global variables
which change depending on where you are in the code... as I understand
what you're saying, __self__ will have to be set, then reset when a
method is called from within a method and then exits. And __self__
could presumably be changed halfway through a method, too. I'm sorry,
I don't see this as being more explicit or simpler.
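
To make the concern concrete, here is a rough Python 3 emulation of how I read the proposal. It is not the proposed implementation: current_self, call_bound and Box are names made up for the sketch.

```python
current_self = None   # hypothetical stand-in for the proposed global __self__


def call_bound(instance, func):
    """Emulate a bound call: set the global, run func, restore the old value."""
    global current_self
    saved = current_self
    current_self = instance
    try:
        return func()
    finally:
        current_self = saved    # without this, the caller sees the wrong object


class Box:  # hypothetical example class
    def __init__(self, length):
        self.length = length


def get_length():
    # No explicit parameter: the function reads whatever the global holds.
    return current_self.length


def outer_then_inner(inner_box):
    before = get_length()
    nested = call_bound(inner_box, get_length)   # nested call rebinds the global
    after = get_length()                         # correct only because of the restore
    return before, nested, after


a, b = Box(1), Box(2)
result = call_bound(a, lambda: outer_then_inner(b))
print(result)
```

The save and restore around every call is exactly the hidden state change I am worried about.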
This is rarely needed. Normally a call to an unbound function would
be in a context where __self__ is already set. From the Animals_2.py
example: cat1.talk() calls Cat.talk() which calls Mammal.talk().
__self__ is set to cat1 on the first call, and it is not changed by
the call to the unbound function Mammal.talk().



Global functions have no instance variables, so there is no need for a
special first argument. A Python method requires a special first
argument (even if it is not used).

But the first argument isn't terribly 'special'; it tells the method
what it's working on, just like any other argument. It's only
'special' characteristic is that there's some syntactic sugar to
convert foo.getLength() into Foo.getLength(foo).
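
A short Python 3 check of that equivalence, reusing the Foo example from earlier in the thread. The variable names are only for illustration:

```python
class Foo:
    length = 5

    def getLength(self):
        return self.length


foo = Foo()
via_instance = foo.getLength()     # instance inserted as the first argument
via_class = Foo.getLength(foo)     # the same call with the argument written out
bound = foo.getLength              # a bound method remembers its instance
print(via_instance, via_class, bound.__self__ is foo)
```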
The placement of a function at the module level or in a class should
be determined by the nature of the function, not any syntax problems.
If the function has characteristics unique to a class, it ought to be
included with that class. The Mammal.show() function, for example,
provides a display of characteristics unique to mammals, so we put it
in class Mammal. We could have written a general-purpose Inventory()
function to recursively walk an arbitrary class hierarchy and print
the number of instances of each class. That general function would be
best placed at the global level, outside of any one class.

Mammal.show() shows characteristics to do with Mammals, *but not
specifically Mammal*. There really is a difference between a class
and its subclasses.

The general-purpose inventory solution would be a better solution. It
doesn't require repetition, it's hard (impossible?) to break and it's
generic, allowing it to be used beyond this single class hierarchy.

If the inventory function would be best placed outside a class, why do
you think it's a good idea to put something with exactly the same
functionality inside your classes?
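
A possible shape for such a general-purpose function, as a Python 3 sketch. Passing the objects in explicitly is my assumption, and the empty classes stand in for the real hierarchy:

```python
class Animal: pass
class Mammal(Animal): pass
class Cat(Mammal): pass
class Mouse(Mammal): pass


def inventory(root, objects):
    """Recursively walk the class tree under root and count instances.

    Returns a dict mapping each class name to the number of objects
    that are instances of that class or of one of its subclasses.
    """
    counts = {root.__name__: sum(1 for o in objects if isinstance(o, root))}
    for sub in root.__subclasses__():
        counts.update(inventory(sub, objects))
    return counts


zoo = [Cat(), Cat(), Mouse()]
counts = inventory(Animal, zoo)
print(counts)
```

No class has to cooperate for the counts to be right, and the function knows nothing about animals in particular.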
*Why* is not an issue with my students. They will have plenty of
examples in our circuit design program to look at. If anyone ever
asks "Why OOP", I'll just have them look at the Qt Toolkit, which will
be the basis of our user interface. Enormous complexity, packaged in
a very easy-to-use set of objects.

Well, I don't know your course so I guess I shouldn't comment on what
you're doing. :)
I need to show them *how* to understand the OOP structures in that
program, and do it in the most efficient way. The chapter on
Prototypes at http://ece.arizona.edu/~edatools/Python/ is my best
effort so far. I've got the basic explanation down to 10 pages. I
expect this to expand to about 30 with examples and exercises. That
will be about four hours of additional work, including lecture,
reading, and practice.


I guess you haven't met anyone learning C++. :>)

Lol, not who hadn't taken up Java beforehand, no. C++ does a
wonderful job of obscuring a simple language behind crazy syntax.
Learning Python, 2nd ed. by Mark Lutz and David Ascher is generally
considered the best introductory text on Python. 96 pages on OOP.

Books are always kind of strange, because a book must have a certain
number of pages and cover a certain range of content at a certain
technical level. For the level and range of the ORA Learning books,
that is going to mean a bit of padding for a simple language like
Python. If I see Learning Python in a bookshop then I'll take a look,
though.

Regardless, I stand by what I said before - students generally will
not read 70 pages on a single topic, especially when it's a relatively
minor part of the course.
[snip a few paragraphs where we agree !! :>) ]
Learning to program is about 5% how to do something, and 95% when and
why you should do it. You seem to be focusing almost exclusively on
how, which I suspect is why we're all so upset :) you get that way
when you have to fix the code which eventually results.

The OOP presentations I've seen that focus as much as 50% on *why*
generally leave me bored and frustrated. I feel like screaming --
Stop talking about car parts and show me some nice code examples. If
it's useful, I'm motivated. Good style is a separate issue, also best
taught with good examples (and some bad for contrast).

I'm not talking about car parts. I'm talking about explaining
modularity, complexity, side-effects, classes as data structures etc.

(It's hilarious to see what happens when people get taught by car-part
style metaphors; they take them completely literally. I've seen
someone writing the classic vending machine example write a 'Can'
class, subclass it to get 'CokeCan', 'PepsiCan'... and then create ten
of each to represent the machines' stock. That was after three years
of university, too...)
The classes contain both data and functions. The data is specific to
each class. I even show an example of where the two-class first
example forced us to put some data at an inappropriate level, but with
a four class hierarchy, we can put each data item right where it
belongs.

The data is not specific to the class. It's specific to the class and
its subclasses. Subclasses should be dependent on the superclass,
and generally not the other way around.
Nothing in the Bovine class can affect anything in a Cat. Feline and
Bovine are independent branches below Mammal. Adding a Mouse class
anywhere other than in the chain Cat - Feline - Mammal - Animal cannot
affect Cat. Could you give a specific example?

Say someone adds a mouse class but doesn't call the constructor for
Mammal. The data produced by mammal and therefore cat is now
incorrect, as instances of mouse are not included in your count. In a
real example, anything might be hanging on that variable - so e.g.
someone adds some mouse instances and the program crashes with an
array index out of bounds (or whatever the Pythonic equivalent is :) ),
or maybe we just get bad user output. This type of behaviour is
damn-near impossible to debug in a complex program, because you didn't
change anything which could have caused it. It's caused by what you
didn't do.
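
A stripped-down Python 3 sketch of the failure. The count attribute is a hypothetical stand-in for the inventory data in the Animals example, and the classes are cut down to the minimum:

```python
class Mammal:
    count = 0   # hypothetical class-level tally of mammals created

    def __init__(self):
        Mammal.count += 1


class Cat(Mammal):
    def __init__(self):
        super().__init__()      # keeps the Mammal tally correct

    def show(self):
        return "Cat; mammals so far: %d" % Mammal.count


class Mouse(Mammal):
    def __init__(self):
        pass                    # forgot to call Mammal.__init__


cat = Cat()
mice = [Mouse(), Mouse()]
report = cat.show()
print(report)                   # three mammals exist, but the tally says one
```

Nothing in Cat changed, yet its output is now wrong.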
I'm not sure what you mean by "side effects" here. The show()
function at each level is completely independent of the show()
function at another level.

But the inventory data isn't independent. It's affected by classes
somewhere else in the hierarchy. Worse, it's done implicitly.
Chaining them together results in a
sequence of calls, and a sequence of outputs that is exactly what we
want. The nice thing about separating the total "show" functionality
into parts specific to each class is that when we add a class in the
middle, as I did with Feline, inserted between Mammal and Cat, it is
real easy to change the Cat class to accommodate the insertion.

Python has a 'super' function to facilitate this kind of chaining.
Michele Simionato's 'prototype.py' module makes 'super' even easier to
use. Instead of having Cat.show() call Mammal.show() I can now just
say super.show() and it will automatically call the show() function
from whatever class is the current parent. Then when I add a Feline
class between Mammal and Cat, I don't even need to change the
internals of Cat.
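
A rough sketch of that chaining, using the built-in super() as spelled in Python 3 rather than the prototype.py module mentioned above (whose interface is not shown here). The show() methods are instance methods that return a list of labels, which is an assumption made to keep the example short; they are not the static printing functions of Animals.py:

```python
class Animal:
    def show(self):
        return ["Animal"]


class Mammal(Animal):
    def show(self):
        # super() finds whatever class comes next in the hierarchy
        return super().show() + ["Mammal"]


class Feline(Mammal):  # the class inserted later
    def show(self):
        return super().show() + ["Feline"]


class Cat(Feline):  # was Cat(Mammal); the body below is unchanged
    def show(self):
        return super().show() + ["Cat"]


print(Cat().show())  # ['Animal', 'Mammal', 'Feline', 'Cat']
```

Only the base class named in the class statement of Cat changes when Feline is inserted; the body of Cat.show() stays as it was.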

That's fine - providing you're not using a class hierarchy to store
data. It's not the act of calling a method in a super-class which is
a bad idea, it's the way you are making *the numbers outputted* from
cat dependent on actions taken *or not taken* in another class
*completely outside cat's scope*.
 
D

David MacQuigg

???!!!???

This is what I was talking about in my first post, global variables
which change depending on where you are in the code... as I understand
what you're saying, __self__ will have to be set, then reset when a
method is called from within a method and then exits. And __self__
could presumably be changed halfway through a method, too. I'm sorry,
I don't see this as being more explicit or simpler.

The setting of __self__ happens automatically, just like the setting
of the first argument in a call from an instance. The user doesn't
have to worry about it. In fact, I can't think of a circumstance
where the user would need to explicitly set __self__. Maybe some
diagnostic code, in which case having available a system variable like
__self__ is a plus. You can, without any loss of functionality in a
normal program, never mention __self__ in an introductory course. The
user doesn't need to know what it is called.

My preference is to give it a name and highlight it with double
underscores. To me that makes the discussion more concrete and
explicit, and builds on concepts already understood. Don't forget,
the students already understand global variables at this point in the
course. The "magic" of setting a particular global variable to an
instance is about the same as the magic of inserting that instance as
a first argument in a function call. The problem in either syntax is
not the magic of setting 'self' or '__self__'.

But the first argument isn't terribly 'special'; it tells the method
what it's working on, just like any other argument. Its only
'special' characteristic is that there's some syntactic sugar to
convert foo.getLength() into Foo.getLength(foo).
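
A small sketch of that equivalence. Foo and getLength are the names used above; the items attribute is made up for the example:

```python
class Foo:
    def __init__(self, items):
        self.items = list(items)

    def getLength(self):
        return len(self.items)


foo = Foo("abc")

# The bound call and the explicit call through the class do the same thing.
print(foo.getLength())     # 3
print(Foo.getLength(foo))  # 3
```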

The specialness of the first argument isn't much, I agree, but it is
enough to make the calling sequence different from a normal function
or a static method. It is these differences that the new syntax gets
rid of, thereby enabling the unification of all methods and functions,
and simplifying the presentation of OOP. Methods in the new syntax
are identical to functions (which the students already understand),
except for the presence of instance variables.

Instance variables are the one fundamental difference between
functions and methods, and one that we wish to focus our entire
attention on in the presentation. Any new and unnecessary syntactic
clutter is a distraction, particularly if the new syntax is used in
some cases (normal methods) but not others (static methods).
Mammal.show() shows characteristics to do with Mammals, *but not
specifically Mammal*. There really is a difference between a class
and its subclasses.

The Mammal.show() function *is* specific to Mammal. I think what you
are saying is that calling Mammal.show() results in a display of
characteristics of both Mammal and its ancestor Animal. That is a
requirement of the problem we are solving, not a result of bad
programming. We want to see *all* the characteristics of Mammal,
including those it inherited from Animal.

Leave out the call to Animal.show() if you don't want to also see the
ancestor's data.
The general-purpose inventory solution would be a better solution. It
doesn't require repetition, it's hard (impossible?) to break and it's
generic, allowing it to be used beyond this single class hierarchy.

If the inventory function would be best placed outside a class, why do
you think it's a good idea to put something with exactly the same
functionality inside your classes?

The proposed Inventory() function is a general function that *would*
be appropriate outside a class. The existing class-specific functions
like Mammal.show() are unique to each class. I tried to make that
clear in a short example by giving each data item a different text
label. I've now added some unique data to the example just so we can
get past this stumbling block. A real program would have a multi-line
display for each class, and there would be *no way* you could come up
with some general function to produce that display for any class.

Books are always kind of strange, because a book must have a certain
number of pages and cover a certain range of content at a certain
technical level. For the level and range of the ORA Learning books,
that is going to mean a bit of padding for a simple language like
Python. If I see Learning Python in a bookshop then I'll take a look,
though.

Regardless, I stand by what I said before - students generally will
not read 70 pages on a single topic, especially when it's a relatively
minor part of the course.

Learning Python, 2nd ed. would be appropriate for a one-semester
course. My problem is that I have only a fraction of a semester in a
circuit-design course. So I don't cover OOP at all. I would include
OOP if I could do it with four more hours. Currently Python is a
little over the top. I don't think it is a problem with Lutz's book.
He covers what he needs to, and at an appropriate pace.
I'm not talking about car parts. I'm talking about explaining
modularity, complexity, side-effects, classes as data structures etc.

These are concepts that design engineers understand very well. I
wouldn't spend any time teaching them about modularity, but I would
point out how different program structures facilitate modular design,
and how syntax can sometimes restrict your ability to modularize as
you see fit. Case in point: The need for static methods to put the
show() functions where we want them.
(It's hilarious to see what happens when people get taught by car-part
style metaphors; they take them completely literally. I've seen
someone writing the classic vending machine example write a 'Can'
class, subclass it to get 'CokeCan', 'PepsiCan'... and then create ten
of each to represent the machines' stock. That was after three years
of university, too...)

Oh ... don't get me started on academia. :>)
The data is not specific to the class. It's specific to the class and
its subclasses. Subclasses should be dependent on the superclass,
and generally not the other way around.

What data are we talking about? numMammals is specific to Mammal.
genus is specific to Feline, but *inherited* by instances of a
subclass like Cat.
Say someone adds a mouse class but doesn't call the constructor for
Mammal. The data produced by mammal and therefore cat is now
incorrect, as instances of mouse are not included in your count. In a
real example, anything might be hanging on that variable - so e.g.
someone adds some mouse instances and the program crashes with an
array index out of bounds (or whatever the Pythonic equivalent is :) )
, or maybe we just get bad user output. This type of behaviour is
damn-near impossible to debug in a complex program, because you didn't
change anything which could have caused it. It's caused by what you
didn't do.

These are normal programming errors that can occur in any program, no
matter how well structured. I don't see how the specific structure of
Animals.py encourages these errors.
But the inventory data isn't independent. It's affected by classes
somewhere else in the hierarchy. Worse, it's done implicitly.

The "inventory data" actually consists of independent pieces of data
from each class. ( numCats is a piece of inventory data from the Cat
class.) I'm sorry I just can't follow this.
That's fine - providing you're not using a class hierarchy to store
data. It's not the act of calling a method in a super-class which is
a bad idea, it's the way you are making *the numbers outputted* from
cat dependent on actions taken *or not taken* in another class
*completely outside cat's scope*.

Seems like this is the way it has to be if you want to increment the
counts for Cat and all its ancestors whenever you create a new
instance of Cat. Again, I'm not understanding the problem you are
seeing. You seem to be saying there should be only methods, not data,
stored in each class.

To try and get to the bottom of this, I re-wrote the Animals.py
example, following what I think are your recommendations on moving the
static methods to module-level functions. I did not move the data out
of the classes, because that makes no sense to me at all.

Take a look at http://ece.arizona.edu/~edatools/Python/Exercises/ and
let me know if Animals_2b.py is what you had in mind. If not, can you
edit it to show me what you mean?

-- Dave
 
G

Greg Ewing

David said:
I will stop this subthread right here unless you change your tone.

Sorry, I take that back.

I stand by the point I was trying to make, though. You
don't need to use static methods at all in a language,
like Python, which doesn't force you to make all functions
a method of something.
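
A minimal sketch of that point, assuming a counter class like the Mammal of Animals.py (the bodies here are invented, and the @staticmethod decorator is the modern spelling of show = staticmethod(show)). The module-level function does the same job as the static method:

```python
class Mammal:
    numMammals = 0

    def __init__(self):
        Mammal.numMammals += 1

    @staticmethod
    def show():
        return "Mammals: %d" % Mammal.numMammals


def show_mammals():
    # Same behaviour as Mammal.show(), but an ordinary module-level function.
    return "Mammals: %d" % Mammal.numMammals


Mammal()
print(Mammal.show())   # Mammals: 1
print(show_mammals())  # Mammals: 1
```

The only thing the static method adds is the Mammal. prefix at the call site.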
 
D

David MacQuigg

Sorry, I take that back.

I stand by the point I was trying to make, though. You
don't need to use static methods at all in a language,
like Python, which doesn't force you to make all functions
a method of something.

Apology accepted. You are welcome to participate in the discussion of
the best way to structure this introductory Animals.py example. See
the discussion with James Moughan in another part of this thread.

I decided to write this as an exercise for the class. See
http://ece.arizona.edu/~edatools/Python/Exercises

Your suggestions will be appreciated.

-- Dave
 
J

James Moughan

David MacQuigg said:
The setting of __self__ happens automatically, just like the setting
of the first argument in a call from an instance. The user doesn't
have to worry about it.

Explicit is better than implicit, especially in getting people to
understand things. :)
In fact, I can't think of a circumstance
where the user would need to explicitly set __self__. Maybe some
diagnostic code, in which case having available a system variable like
__self__ is a plus. You can, without any loss of functionality in a
normal program, never mention __self__ in an introductory course. The
user doesn't need to know what it is called.

I can think of a time where I'd both want to do it and couldn't;

map(str.strip, list_of_str)

See what I mean about it breaking the functional side of things?
Without some definition about what's a static method the interpreter
can't resolve things like this. In practice, map and a whole bunch of
functions would have to be modified to explicitly set __self__.
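
For reference, a short sketch of why this works with the explicit first argument (the sample strings are made up). str.strip taken from the class is a function whose first parameter is the instance, so map can pass each element straight in:

```python
list_of_str = ["  spam ", "eggs\n", "\tham"]

# map hands each element to str.strip as its explicit first argument
stripped = list(map(str.strip, list_of_str))
print(stripped)  # ['spam', 'eggs', 'ham']
```

With an implicit __self__ there would be no argument slot for map to fill.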

An implicit self is what leads to the whole
mem_fun(&std::vector<int>::push_back) (or whatever the exact syntax
is, it's too unusable to bother learning) in c++.

My preference is to give it a name and highlight it with double
underscores. To me that makes the discussion more concrete and
explicit, and builds on concepts already understood. Don't forget,
the students already understand global variables at this point in the
course. The "magic" of setting a particular global variable to an
instance is about the same as the magic of inserting that instance as
a first argument in a function call. The problem in either syntax is
not the magic of setting 'self' or '__self__'.



The specialness of the first argument isn't much, I agree, but it is
enough to make the calling sequence different from a normal function
or a static method. It is these differences that the new syntax gets
rid of, thereby enabling the unification of all methods and functions,
and simplifying the presentation of OOP. Methods in the new syntax
are identical to functions (which the students already understand),
except for the presence of instance variables.

Instance variables are the one fundamental difference between
functions and methods, and one that we wish to focus our entire
attention on in the presentation.

Except that that distinction doesn't exist in Python, since calling an
instance variable is an explicit call to a member of an instance. If
you are trying to focus your presentation on something which doesn't
exist in Python then things are naturally going to be awkward. I
would suggest that it's not a problem with the language, though. :)
Any new and unnecessary syntactic
clutter is a distraction, particularly if the new syntax is used in
some cases (normal methods) but not others (static methods).

If you really want to get something like this accepted then the
closest thing which *might* have a chance would be to redefine the
calling sequence of some_class.method() so that it doesn't seek a self
argument so long as ``method'' was defined without one. I don't think
this would break anyone's code (I may well be wrong of course) since
the sequence isn't currently valid in execution (though IIRC it will
compile to bytecode).

This makes the distinction between static and normal methods about as
simple as it can be; methods are just class methods which operate on
an instance. You can show how it works in a few lines of an example
(though, currently, it should only take a few more.)
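
As a hedged illustration: Python 3 behaves roughly this way, since a function looked up on the class is a plain function and no instance is required. The sketch below assumes Python 3; the Mammal class and its counter value are invented. Calling through an instance still passes the instance, so it fails for a function defined without a parameter:

```python
class Mammal:
    numMammals = 3

    def show():  # defined without a self parameter
        return "Mammals: %d" % Mammal.numMammals


print(Mammal.show())  # a plain function call through the class

try:
    Mammal().show()  # the instance is still passed as first argument
    instance_call_failed = False
except TypeError:
    instance_call_failed = True

print(instance_call_failed)  # True
```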
The Mammal.show() function *is* specific to Mammal. I think what you
are saying is that calling Mammal.show() results in a display of
characteristics of both Mammal and its ancestor Animal.

No, it's not. Let me try to be totally clear here; The numMammals
data member contains data not just about Mammal, but also about
instances of its subclasses. This is the problem. The fact that
it's accessed through the show method is really just a detail, though
the presence of show in other subclasses compounds the problem.
That is a
requirement of the problem we are solving, not a result of bad
programming. We want to see *all* the characteristics of Mammal,
including those it inherited from Animal.

You are not solving a problem; that's the problem. :) If there were a
real programming task then it would be more trivial to show why your
object model is broken.
Leave out the call to Animal.show() if you don't want to also see the
ancestor's data.


The proposed Inventory() function is a general function that *would*
be appropriate outside a class. The existing class-specific functions
like Mammal.show() are unique to each class. I tried to make that
clear in a short example by giving each data item a different text
label. I've now added some unique data to the example just so we can
get past this stumbling block. A real program would have a multi-line
display for each class, and there would be *no way* you could come up
with some general function to produce that display for any class.


Learning Python, 2nd ed. would be appropriate for a one-semester
course. My problem is that I have only a fraction of a semester in a
circuit-design course. So I don't cover OOP at all. I would include
OOP if I could do it with four more hours. Currently Python is a
little over the top. I don't think it is a problem with Lutz's book.
He covers what he needs to, and at an appropriate pace.

If you can't take it below 70 pages and you only have 4 hours... maybe
it's not such a great idea to try this? I can't see your students
benefiting from what you're proposing to do, if you have so little
time.
These are concepts that design engineers understand very well. I
wouldn't spend any time teaching them about modularity, but I would
point out how different program structures facilitate modular design,
and how syntax can sometimes restrict your ability to modularize as
you see fit. Case in point: The need for static methods to put the
show() functions where we want them.


What data are we talking about? numMammals is specific to Mammal.
genus is specific to Feline, but *inherited* by instances of a
subclass like Cat.

The numAnimals etc... data, which is stored in Animals but gets
arbitrarily altered by the actions of subclasses of Animal, and
therefore is not specific to animal; it doesn't represent the state of
the Animal class or of Animal objects, but of a whole bunch of
subclasses of Animal.
These are normal programming errors that can occur in any program, no
matter how well structured. I don't see how the specific structure of
Animals.py encourages these errors.

Imagine if your structure had been implemented as one of the basic
structures of, say, Java. That is, some static data in the Object
class stores state for all the subclasses of Object. Now, someone
coming along and innocently creating a class can break Object -
meaning that may break anything with a dependency on Object, which is
the entire system. So I write a nice GUI widget and bang! by some
bizarre twist it breaks my program somewhere else because of an error
in, say, the StringBuffer class. This is analogous to what you are
implementing here.

While errors are always going to happen, OOP calls on some conventions
to minimize them. The most absolutely vital of these is that it's
clear what can break what. Generally I should never be able to break
a subsystem by breaking its wrapper; definitely I should never be
able to break a superclass by breaking its subclass; and I
*certainly* shouldn't be able to break a part of the system by
changing something unconnected to it. The whole of OOP derives, more
or less directly, from these principles. Expressions like 'A is a
part/type of B' derive from this philosophy, not the other way around.

Your program breaks with this concept. It allows an event in Cat to
affect data in Mammal and in Animal, which also has knock-on effects
for every other subclass of these. Therefore it is bad object
oriented programming.

It takes us back to the days before even structured programming, when
no-one ever had any idea what the effects of altering or adding a
piece of code would be.

It is therefore not a good teaching example. :)

The "inventory data" actually consists of independent pieces of data
from each class. ( numCats is a piece of inventory data from the Cat
class.) I'm sorry I just can't follow this.

numMammals OTOH is not just a piece of data from one class - it's a
piece of data stored in one class, but which stores data about events
in many different classes, all of which are outside its scope.
Seems like this is the way it has to be if you want to increment the
counts for Cat and all its ancestors whenever you create a new
instance of Cat. Again, I'm not understanding the problem you are
seeing. You seem to be saying there should be only methods, not data,
stored in each class.

That's the way it has to be, if you want to write it like that.
However there is nothing to say that a given problem must use a
certain class structure. If you come up with a solution like this
then it's near-guaranteed that there was something badly wrong with
the way you modelled the domain. Either the program shouldn't need to
know the number of instances which ever existed of subclasses of
mammal or else your class structure is wrong.

And, as a general rule, you should think carefully before using classes
to store data; that's typically what objects are for. I used static
data in programs quite a lot before I realised that it too-often bit
me later on.
To try and get to the bottom of this, I re-wrote the Animals.py
example, following what I think are your recommendations on moving the
static methods to module-level functions. I did not move the data out
of the classes, because that makes no sense to me at all.

*Sigh* No, I must say that doesn't help much. :-\



As I said, there is something wrong with the whole idea behind it; the
design needs refactoring, not individual lines of code.

Having said that, I'll try to redact the issues as best I can, on the
basis that it may illustrate what I mean.

OK: start with the basics. We need iterative counting data about the
individual elements of the hierarchy.

The first thing is that we need to factor out the print statements.
Your back-end data manipulation modules should never have UI elements
in them. So, whatever form the data manipulation comes in, it should
be abstract.

Secondly, we want to keep the data stored in each class local to that
class. So, Mammal can store the number of Mammals, if that turns out
to be a good solution, but not the number of its subclasses. OTOH we
could remove the data from the classes altogether.

Thirdly, it would probably be nice if we had the ability to implement
the whole thing in multiple independent systems. Currently the design
only allows one of "whatever-we're-doing" at a time, which is almost
certainly bad.

After a bit of brainstorming this is what I came up with. It's not a
specific solution to your problem; instead it's a general one. The
following class may be sub-classed and an entire class hierarchy can
be placed inside it. It will then automatically generate the code to
keep track of and count the elements of the class hierarchy,
returning the data you want at a method call.

This is done with a standard OO tool, the Decorator pattern, but
ramped up with the awesome power of the Python class system. :)

class Collective:
    class base: pass

    def startup(self, coll, root):
        # wrapper class to count creations of classes
        self.root = root
        class wrapper:
            def __init__(self, name, c):
                self.mycount = 0
                self.c = c
                self.name = name
            def __call__(self, *arg):
                # build the instance once, count it, and return it
                tmp = self.c(*arg)
                self.mycount += 1
                return tmp
        self.wrapper = wrapper
        # replace every class derived from root with a wrapper
        # and build a list of the wrappers
        self.wrap_list = []
        for name, elem in coll.__dict__.items():
            try:
                if issubclass(elem, self.root):
                    tmp = wrapper(name, elem)
                    self.__dict__[name] = tmp
                    self.wrap_list.append(tmp)
            except TypeError:
                pass    # elem is not a class; skip it

    # when subclassing, override this
    # call startup with the class name
    # and the root of the class hierarchy
    def __init__(self):
        self.startup(Collective, self.base)

    # here's the stuff to do the counting
    # this could be much faster with marginally more work
    # exercise for the reader... ;)

    def get_counts(self, klass):
        # klass and its wrapped superclasses, ordered from root to leaf
        chain = self.super_classes(klass) + [klass]
        chain.sort(key=lambda w: len(self.super_classes(w)))
        return [(self.get_sub_count(w), w.name) for w in chain]

    def get_sub_count(self, klass):
        count = klass.mycount
        for sub in self.sub_classes(klass):
            count += sub.mycount
        return count

    def super_classes(self, klass):
        return [x for x in self.wrap_list
                if issubclass(klass.c, x.c) and x.c is not klass.c]

    def sub_classes(self, klass):
        return [x for x in self.wrap_list
                if issubclass(x.c, klass.c) and x.c is not klass.c]

So we can now do:

class animal_farm(Collective):
    class Animal: pass
    class Mammal(Animal): pass
    class Bovine(Mammal): pass
    class Feline(Mammal): pass
    class Cat(Feline): pass
    def __init__(self):
        self.startup(animal_farm, self.Animal)


a_farm = animal_farm()
cat = a_farm.Cat()
mammal = a_farm.Mammal()
print(a_farm.get_counts(a_farm.Feline))
[(2, 'Animal'), (2, 'Mammal'), (1, 'Feline')]


The above code is 51 lines with about 10 lines of comments. For a
project of any size, this is a heck of an investment; I believe it
would take a fairly determined idiot to break the system, and *most
importantly*, they would be able to trace back the cause from the
effect fairly easily.

Admittedly the solution is on the complicated side, though perhaps
someone with more experience than me could simplify things.
Unfortunately, a certain amount of complexity is just a reflection of
the fact that your demands strain the OO paradigm right to its limit.
You could possibly implement the same thing in Java with a Factory
pattern, and perhaps the reflection API.

(Of course I'm none too sure I could do that after many years of
hacking Java vs a few weeks of Python!)
 
A

Antoon Pardon

On 2004-05-10 said:
Explicit is better than implicit, especially in getting people to
understand things. :)

I'm a bit sick of this argument. There is a lot of implicitness
going on in python. If obj belongs to cls then obj.method()
is syntactic sugar for cls.method(obj). That looks like
a big implicit way to handle things in python.

Likewise all those magical methods to emulate numerical
types or containers or sequences or all ways that allow
the python programmer to do things implicitly through
the use of operators instead of through explicitly calling
the desired method.

If implicit is such a negative thing to be avoided whenever
possible, python would look a lot different and I probably
wouldn't be using it.
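
A small sketch of the operator case, with a made-up Basket class. The operator and built-in forms are implicit spellings of the method calls written out on the last line:

```python
class Basket:
    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        return Basket(self.items + other.items)

    def __len__(self):
        return len(self.items)


a = Basket(["egg"])
b = Basket(["ham", "spam"])

print(len(a + b))                                # 3, implicit form
print(Basket.__len__(Basket.__add__(a, b)))      # 3, explicit form
```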
 
J

James Moughan

Antoon Pardon said:
I'm a bit sick of this argument. There is a lot of implicitness
going on in python. if obj belongs to cls then obj.method()
is syntactic sugar for cls.method(obj).

Well, it is... and it isn't. ``method'' could be bound to cls, cls's
super-class, or to obj itself.
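
A short sketch of those three cases (Base, Cls and the returned strings are invented for the example). The same expression obj.method() resolves to a different function depending on where the name is found:

```python
class Base:
    def method(self):
        return "Base.method"


class Cls(Base):
    pass


obj = Cls()
results = [obj.method()]  # found on the superclass

Cls.method = lambda self: "Cls.method"
results.append(obj.method())  # now found on cls itself

obj.method = lambda: "obj.method"  # plain attribute on the instance
results.append(obj.method())  # shadows both; no instance is passed in

print(results)  # ['Base.method', 'Cls.method', 'obj.method']
```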
That looks like
a big implicit way to handle things in python.

You could look at it like that. I'd say it's a way of making code
generic, and hiding the workings of a subsystem; lots of good things
in Python wouldn't be possible without this type of abstraction.
Likewise all those magical methods to emulate numerical
types or containers or sequences or all ways that allow
the python programmer to do things implicitly through
the use of operators instead of through explicitly calling
the desired method.

If implicit is such a negative thing to be avoided whenever
possible, python would look a lot different and I probably
wouldn't be using it.

Sure; you have to have some understood conventions. In effect that's
all any higher-level programming language than assembler is. IMO, in
the case in point, the explicit self makes OO easier to understand;
YMMV, of course, as it does with Dave. :)
 
D

David MacQuigg

You are not solving a problem; that's the problem. :) If there were a
real programming task then it would be more trivial to show why your
object model is broken.

I could give you an example from IC Design, but for the course I
teach, I chose to use a similar hierarchy based on something everyone
would understand - a taxonomy of animals. Nothing in this example is
something you wouldn't find in a real program to model an integrated
circuit. Instead of animal names like Cat, we would have the names of
cells in the hierarchy, names like bgref25a. Instead of a variable to
count the number of animals at each level, we might have several
variables to track the total current on each of several supply lines.
Like the counts in the Animals.py hierarchy, we need the total current
to each cell, including all of its subcells.

I'm sure there are other examples from other specialties. In
accounting, I can imagine a hierarchy of accounts, with a total for
each account including all of its subaccounts. Don't just assume that
the problem isn't real because you haven't encountered it in your
work.

If you can't take it below 70 pages and you only have 4 hours... maybe
it's not such a great idea to try this? I can't see your students
benefiting from what you're proposing to do, if you have so little
time.

I think I could do it in 30 pages and 4 hours total ( lecture, lab,
and homework ), but not if I need to cover the topics that both Mark
Lutz and I consider important to basic OOP in the current version of
Python. The 30 pages assumes the unification of methods and functions
that I have proposed.

The numAnimals etc... data, which is stored in Animals but gets
arbitrarily altered by the actions of subclasses of Animal, and
therefore is not specific to animal; it doesn't represent the state of
the Animal class or of Animal objects, but of a whole bunch of
subclasses of Animal.

The total current to an IC is the sum of the currents to all of its
subcircuits. That current is a single number, for example, 35
microamps. It has a name "Iss". Iss is a characteristic of the IC
which appears in data sheets, etc. It is a variable representing the
state of the entire IC. It does not represent the state of any
subcircuit in the IC, even though it gets "altered" whenever one of
those subcircuit currents changes.
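
The arithmetic can be sketched as follows. This is only an illustration under stated assumptions: the Cell class, the cell names other than bgref25a, and the per-cell currents are all invented, and the sketch sums over a tree of cell instances rather than using class-level variables as Animals.py does:

```python
class Cell:
    """A circuit cell with its own supply current and optional subcells."""

    def __init__(self, name, own_uA=0, subcells=()):
        self.name = name
        self.own_uA = own_uA
        self.subcells = list(subcells)

    def total_uA(self):
        # Current drawn by this cell plus everything beneath it.
        return self.own_uA + sum(sub.total_uA() for sub in self.subcells)


core = Cell("core", own_uA=5)
bgref = Cell("bgref25a", own_uA=20, subcells=[core])
bias = Cell("bias", own_uA=10)
chip = Cell("chip", subcells=[bgref, bias])

print(chip.total_uA())  # 35

core.own_uA += 2  # two more microamps in the bandgap core
print(chip.total_uA())  # 37
```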

Looks like this whole argument comes down to what we mean by the word
"specific". Let's drop it and focus on the more interesting topics in
this thread.
Imagine if your structure had been implemented as one of the basic
structures of, say, Java. That is, some static data in the Object
class stores state for all the subclasses of Object. Now, someone
coming along and innocently creating a class can break Object -
meaning that may break anything with a dependency on Object, which is
the entire system. So I write a nice GUI widget and bang! by some
bizarre twist it breaks my program somewhere else because of an error
in, say, the StringBuffer class. This is analogous to what you are
implementing here.

I'll need an example to see how these general worries can affect the
Animals_2 hierarchy. What I see is quite robust. I added a Feline
class between Mammal and Cat, and I had to change only two lines in
the Cat class. ( And I could avoid even that if I had used a "super"
call instead of a direct call to the Mammal functions.)
While errors are always going to happen, OOP calls on some conventions
to minimize them. The most absolutely vital of these is that it's
clear what can break what. Generally I should never be able to break
a subsystem by breaking its wrapper; definitely I should never be
able to break a superclass by breaking its subclass; and I
*certainly* shouldn't be able to break a part of the system by
changing something unconnected to it. The whole of OOP derives, more
or less directly, from these principles. Expressions like 'A is a
part/type of B' derive from this philosophy, not the other way around.

Sounds good.
Your program breaks with this concept. It allows an event in Cat to
affect data in Mammal and in Animal, which also has knock-on effects
for every other subclass of these. Therefore it is bad object
oriented programming.

We are modeling the real world here. When you add a lion to a zoo,
you add one to the count of all animals. When you add 2 microamps to
the core currents in a bandgap voltage reference, you add that same 2
microamps to the total supply current.

I'm no expert in OOP, but what I have seen so far is not nearly as clear
in structure as the original Animals_2 example.
It takes us back to the days before even structured programming, when
no-one ever had any idea what the effects of altering or adding a
piece of code would be.

It is therefore not a good teaching example. :)

I'll need to see something better before I abandon the current example.
The problem may be our expectations of OOP. I see classes as modeling
the real world, including variables that are altered by changes in
subclasses. You seem to have some computer science notion of what a
class should be. I'm not saying it's wrong, but unless it helps me
solve my real-world problems, in a better way than what I am doing
now, I won't use it.

I'm reminded of the criticism Linus Torvalds got when he first
published Linux. The academic community thought it was the worst,
most fundamentally flawed design they had ever seen. It did not fit
some expectation they had that a "microkernel" architecture was the
proper way to design an OS. Luckily, Mr. Torvalds was not dependent
on their approval, and had the confidence to move ahead.

numMammals OTOH is not just a piece of data from one class - it's a
piece of data stored in one class, but which stores data about events
in many different classes, all of which are outside its scope.

Exactly as we see in objects in the real world.

That's the way it has to be, if you want to write it like that.
However there is nothing to say that a given problem must use a
certain class structure. If you come up with a solution like this
then it's near-guaranteed that there was something badly wrong with
the way you modelled the domain. Either the program shouldn't need to
know the number of instances which ever existed of subclasses of
mammal or else your class structure is wrong.

Trust me, the need is real. We just need to find the optimum example
to show how Python solves the problem.

In my work as a software product engineer, I've learned to deal with
two very common criticisms. 1) The user doesn't need to do that. 2)
The user is an idiot for not understanding our wonderful methodology.
These are generally irrefutable arguments that can only be trumped by
a customer with a big checkbook. I generally don't engage in these
arguments, but on one occasion, I couldn't resist. I was trying to
show an expert how a complicated feature could be done much more
easily with simpler functions we already had in our program.

His argument was basically -- every expert in this company disagrees
with you, and you're an idiot for not understanding how our new
feature works. I replied that I was the one who wrote the User Guide
on that feature. He started to say something, but it was only a
fragment of a word, and it kind of fell on the table and died. There
were a few seconds of silence, while he tried to figure out if he could
call me a liar. I just looked right at him without blinking.

Forget what you have learned in books. Think of a real zoo. Think
how you would write the simplest possible program to do what Animals_2
does -- keep track of all the different classes of animals, and
display the characteristics of any animal or class, including
characteristics that are shared by all animals in a larger grouping.

And, as a general rule, you should think carefully before using classes
to store data; that's typically what objects are for. I used static
data in programs quite a lot before I realised that it too-often bit
me later on.

Classes *are* objects. I think you mean instances. I make a
distinction between class variables and instance variables, depending
on whether the variable is different from one instance to another.
Every instance has a different cat.name, but all cats share the genus
"feline". In fact, they share that genus with all other members of
the Feline class. That is why I moved it from Cat to Feline as soon
as our example was big enough to include a Feline class.
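
A minimal sketch of that distinction, assuming a cut-down hierarchy. The class names follow the thread, but the bodies are illustrative and are not taken from Animals_2:

```python
class Feline(object):
    # Class variable: stored once on the class and shared by every
    # instance of Feline and of its subclasses.
    genus = "feline"

class Cat(Feline):
    def __init__(self, name):
        # Instance variable: stored separately on each object.
        self.name = name

garfield = Cat("Garfield")
fritz = Cat("Fritz")

print(garfield.name, fritz.name)    # each cat has its own name
print(garfield.genus, fritz.genus)  # both lookups find 'genus' on Feline
```

Neither cat carries its own copy of genus; attribute lookup falls through from the instance to Cat and then to Feline.
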
*Sigh* No, I must say that doesn't help much. :-\

As I said, there is something wrong with the whole idea behind it; the
design needs refactoring, not individual lines of code.

Having said that, I'll try to redact the issues as best I can, on the
basis that it may illustrate what I mean.

OK: start with the basics. We need iterative counting data about the
individual elements of the hierarchy.

The first thing is that we need to factor out the print statements.
Your back-end data manipulation modules should never have UI elements
in them. So, whatever form the data manipulation comes in, it should
be abstract.

You are adding requirements to what I already have. OK if it doesn't
slow the introductory presentation too much.

Secondly, we want to keep the data stored in each class local to that
class. So, Mammal can store the number of Mammals, if that turns out
to be a good solution, but not the number of its subclasses. OTOH we
could remove the data from the classes altogether.

Think of a real zoo. If you ask the zookeeper how many animals he
has, will he tell you only the number that are animals, but are not
also lions or tigers or any other species? That number would be zero.

I really do want numMammals to display the total number of all
mammals, whether or not they are a member of some other class in
addition to Mammal.

If I were to guess at your objection to this, I would assume you are
worried that the different counters will get "out-of-sync", if for
example, someone directly changes one of these variables, rather than
calling the appropriate functions to make a synchronized change.

My answer to that is to make the counter variables private. I've
added a leading underscore to those names. numMammals is now
_numMammals.
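
A sketch of what that might look like, assuming each class bumps its own counter and then lets its parent do the same. The layout is a guess for illustration, not the Animals_2 code itself; only _numMammals is a name from the thread, and the other counter names are made up to match it:

```python
class Animal(object):
    _numAnimals = 0          # leading underscore: private by convention only

    def __init__(self):
        Animal._numAnimals += 1

class Mammal(Animal):
    _numMammals = 0

    def __init__(self):
        Animal.__init__(self)        # count this object as an Animal too
        Mammal._numMammals += 1

class Feline(Mammal):
    _numFelines = 0

    def __init__(self):
        Mammal.__init__(self)
        Feline._numFelines += 1

class Cat(Feline):
    _numCats = 0

    def __init__(self):
        Feline.__init__(self)
        Cat._numCats += 1

cat = Cat()
mammal = Mammal()
print(Animal._numAnimals, Mammal._numMammals, Feline._numFelines, Cat._numCats)
```

Python does not enforce the underscore; it only signals that outside code should go through the classes' own functions rather than touch the counters directly.
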
Thirdly, it would probably be nice if we had the ability to implement
the whole thing in multiple independent systems. Currently the design
only allows one of "whatever-we're-doing" at a time, which is almost
certainly bad.

???

After a bit of brainstorming this is what I came up with. It's not a
specific solution to your problem; instead it's a general one. The
following class may be sub-classed and an entire class hierarchy can
be placed inside it. It will then automatically generate the code to
keep track of and count the elements of the class hierarchy,
returning the data you want at a method call.

This is done with a standard OO tool, the Decorator pattern, but
ramped up with the awesome power of the Python class system. :)

My non-CIS students are not familiar with the Decorator pattern. I
fear that will make this example incomprehensible to them.
class Collective:
    class base: pass

    def startup(self, coll, root):
        #wrapper class to count creations of classes
        self.root = root
        class wrapper:
            def __init__(self, name, c):
                self.mycount = 0
                self.c = c
                self.name = name
            def __call__(self, *arg):
                #create the instance once, then count it
                tmp = self.c(*arg)
                self.mycount += 1
                return tmp
        self.wrapper = wrapper
        #replace every class derived from root with a wrapper
        #plus build a table of the wrappers
        self.wrap_list = []
        for name, elem in coll.__dict__.items():
            try:
                if issubclass(elem, self.root):
                    tmp = wrapper(name, elem)
                    self.__dict__[name] = tmp
                    self.wrap_list.append(tmp)
            except TypeError: pass    #elem is not a class

    #when subclassing, override this
    #call startup with the class name
    #and the root of the class hierarchy
    def __init__(self):
        self.startup(Collective, self.base)

    #here's the stuff to do the counting
    #this could be much faster with marginally more work
    #exercise for the reader... ;)

    def get_counts(self, klass):
        #klass and its superclasses, ordered from the root down
        chain = self.super_classes(klass) + [klass]
        chain.sort(key=lambda x: len(self.super_classes(x)))
        return [(self.get_sub_count(x), x.name) for x in chain]

    def get_sub_count(self, klass):
        count = klass.mycount
        for sub in self.sub_classes(klass):
            count += sub.mycount
        return count
    def super_classes(self, klass):
        return [x for x in self.wrap_list if issubclass(klass.c, x.c)
                and not x.c is klass.c]
    def sub_classes(self, klass):
        return [x for x in self.wrap_list if issubclass(x.c, klass.c)
                and not x.c is klass.c]

So we can now do:

class animal_farm(Collective):
    class Animal: pass
    class Mammal(Animal): pass
    class Bovine(Mammal): pass
    class Feline(Mammal): pass
    class Cat(Feline): pass
    def __init__(self):
        self.startup(animal_farm, self.Animal)


a_farm = animal_farm()
cat = a_farm.Cat()
feline = a_farm.Mammal()
print(a_farm.get_counts(a_farm.Feline))
[(2, 'Animal'), (2, 'Mammal'), (1, 'Feline')]


The above code is 51 lines with about 10 lines of comments. For a
project of any size, this is a heck of an investment; I believe it
would take a fairly determined idiot to break the system, and *most
importantly*, they would be able to trace back the cause from the
effect fairly easily.

This is an impressive bit of coding, but I can assure you, as an
introduction to OOP, it will blow away any non-CIS student. It may
also be difficult to modify, for example, if we want to do what
Animals_2 does, and provide a custom display of characteristics for
each class.

One possibility is to make this an Animals_3 example. Animals_1 was a
simple two-class structure. It served to introduce instance
variables, and some basic concepts like inheritance. When we moved to
Animals_2, we pointed out the limitations of Animals_1, like not
having enough classes to put variables like 'genus' where they really
belong.

Maybe we should go one more step, and make this a third example. We
can point out the limitations of Animals_2 in the introduction to
Animals_3. I can see the benefit of moving the print statements to
the top level. This is needed if we ever want to make the classes in
Animals_2 work in some kind of framework with other classes. The
show() functions in Animals_2 could be modified to return a list of
strings instead of printing directly to the console.
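
A rough sketch of that change, assuming each show() extends the list built by its parent class. The attributes (home, blood, name) are placeholders, not the ones used in Animals_2:

```python
class Animal(object):
    home = "Earth"                     # placeholder attribute

    def show(self):
        # Return the lines instead of printing them.
        return ["Animal: home = %s" % self.home]

class Mammal(Animal):
    blood = "warm"                     # placeholder attribute

    def show(self):
        lines = Animal.show(self)
        lines.append("Mammal: blood = %s" % self.blood)
        return lines

class Cat(Mammal):
    def __init__(self, name):
        self.name = name

    def show(self):
        lines = Mammal.show(self)
        lines.append("Cat: name = %s" % self.name)
        return lines

# Only the top level decides how the lines are displayed.
for line in Cat("Garfield").show():
    print(line)
```

A framework that wants the text in a window or a log file can call show() and do what it likes with the list, without touching the classes.
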

I've posted your program as Solution 3 to the exercise at
http://ece.arizona.edu/~edatools/Python/Exercises/ Could you give us
a brief description of the advantages and disadvantages compared to
the original? I'm not able to do that, because I'm having difficulty
restating what you have said above in terms that students will
understand. I cannot, for example, explain why your solution is more
robust.

Admittedly the solution is on the complicated side, though perhaps
someone with more experience than me could simplify things.
Unfortunately, a certain amount of complexity is just a reflection of
the fact that your demands strain the OO paradigm right to its limit.
You could possibly implement the same thing in Java with a Factory
pattern, and perhaps the reflection API.

Your vast experience may be blinding you to the problems non-CIS
students will have with these more complex solutions. I may be
pushing a paradigm to some limit, but these are real-world problems
that should be easily solved with a good OOP language.

-- Dave
 
D

David MacQuigg

Sure; you have to have some understood conventions. In effect that's
all any higher-level programming language than assembler is. IMO, in
the case in point, the explicit self makes OO easier to understand;
YMMV, of course, as it does with Dave. :)

I really hate to get involved in this "explicit self" debate, as I
know it has been debated ad nauseam for many years, but I wish people
would not attribute to me a point-of-view that I do not have. For me
the "explicit self" is *not* the issue. I have a slight preference
for .var over self.var or $var, but this is a matter of personal
preference, not an implicit vs explicit question.

My definition of explicit is that you can tell the meaning of a
statement without reference to the surrounding code. By that
definition all of the above forms are explicit. They are all instance
variables, and there is no other interpretation. The choice between
them is a *minor issue*, and I trust GvR to make these choices.

The real issue is whether we can unify all forms of functions and
methods. This requires we do something different with 'self'. From a
unification standpoint, an equally acceptable solution is that we add
'self' to *all* functions and methods, whether they need it or not.

The problem with rules like "explicit is better than implicit" is that
some people take them too literally. Sometimes it seems like quoting
a rule is a substitute for thinking. We need to balance all these
rules, and add a good measure of common sense to reach the true goal.

My goal is overall simplicity for the syntax needed to solve the kind
of problems that a non-CIS technical professional might encounter.
That includes static methods, but not metaclasses. That could change
if someone convinced me that static methods are unnecessary or that
metaclasses are essential to solve a real-world problem.

Meanwhile I'm happy to just watch the wizards play with metaclasses,
and I don't care if their brains explode. :>) I'm also happy to use
what they develop with metaclasses. Case in point - Michele
Simionato's prototype module. This is easy to use, even if I don't
understand how it works. To me, it's like a C module. I don't feel I
need to understand the internals.

I feel entirely differently about excess complexity in syntax my
students and clients are likely to write or need to understand.

-- Dave
 
G

Greg Ewing

David said:
I could give you an example from IC Design, but for the course I
teach, I chose to use a similar hierarchy based on something everyone
would understand - a taxonomy of animals. Nothing in this example is
something you wouldn't find in a real program to model an integrated
circuit.

Perhaps it would be better to give these people examples
based on actual parts of an integrated circuit, then?
If they can see an immediate, realistic use for what
you're teaching, they're likely to grasp and remember
it more easily. Cats and mammals have the potential to
be just as boring and artificial-seeming as parts and
suppliers.
 
G

Greg Ewing

David said:
The real issue is whether we can unify all forms of functions and
methods. This requires we do something different with 'self'. From a
unification standpoint, an equally acceptable solution is that we add
'self' to *all* functions and methods, whether they need it or not.

I still don't see the benefit of unifying them to this degree.
Even if you added 'self' to every function, there would *still*
be two kinds of function, those that use their 'self' and those
that don't. The difference in usage is still there, even if it's
not directly reflected in the syntax. I can't see the point in
trying to pretend otherwise.
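
The distinction can be sketched as follows. The Cat class and its method names are made up for illustration:

```python
class Cat(object):
    def __init__(self, name):
        self.name = name

    def talk(self):
        # Uses its instance: the result depends on which cat is asked.
        return "%s says meow" % self.name

    def describe(self):
        # Takes 'self' but never looks at it.
        return "Cats are felines"

    @staticmethod
    def describe_static():
        # Same behaviour with no 'self' parameter at all.
        return "Cats are felines"

garfield = Cat("Garfield")
print(garfield.talk())
print(garfield.describe())
print(Cat.describe_static())
```

Giving describe() a 'self' parameter makes its signature look like talk(), but it still behaves like describe_static(): the answer is the same for every cat.
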
 
