OO design

chris

I've been scripting with python for a while now. Basically writing a few
functions and running in the ipython shell. That's been very useful. But the
more I do this the more I see that I'm doing more or less the same thing
over and over again. So it feels like I need to get into class programming
with all its attendant benefits. However my biggest problem is a conceptual
one. I just can't get my head around defining suitable classes, how they
acquire data and communicate with each other. I'm hoping some of you python
lamas out there might be able to share some of your wisdom on the subject.

What I basically do is a lot of the following:

1. get arbitrary numerical data (typically large data sets in columnar
format or even via COM from other packages. I generally have to deal with
one or more sets of X,Y data)
2. manipulate the data (scaling, least squares fitting, means, peaks,
add/subtract one XY set from another etc)
3. plot data (original set, results of manipulation, scatterplot, histograms
etc - I use matplotlib)
4. export data (print, csv, shelve)

I have no problem writing bits of functional code to do any of the above.
But for the life of me I can't see how I can hook them all together in an
OO-based framework that I can build and extend (with more data formats,
manipulations, GUI etc).

When I think about what I should do I end up with a class XY that has a
method for everything I want to do, e.g.:

    class XY:
        def read_file(self): ...
        def scale_data(self): ...
        def plot_data(self): ...
        def shelve_data(self): ...

But somehow that doesn't feel right, especially when I expect the number of
methods will grow and grow, which would make the class very unwieldy.

Even if that was a legitimate option, I don't understand conceptually how I
would, for example, plot two different XY objects on the same graph or add
them together point by point. How do two different XY objects communicate,
and how do you deal with the thing that they must have in common (the plot
screen, for example)?

Clearly I'm having some conceptualisation problems here. Hope someone can
shed some light on the subject.

bwaha.
 
Chris Smith

chris> I have no problem writing bits of functional code to do any
chris> of the above. But for the life of me I can't see how I can
chris> hook them all together in an OO-based framework that I can
chris> build and extend (with more data formats, manipulations,
chris> GUI etc).

Chris,
I echo your sentiment.
My little pet project has recently grown to include a bit of an object
hierarchy. I've always felt that the Java-esque byzantine pedigree
for every last class is a trifle overdone.
What I used to trigger factorization was whether or not I needed to
branch based on data type:

    if input_data.type == "this":
        do_this()
    else:
        do_that()

meant I should factor my code to:

    class input_data_base:
        pass

    class input_data_this(input_data_base):
        pass

    class input_data_that(input_data_base):
        pass


With this, my project has a modest number of sensible inheritance
hierarchies (two) and the code that wouldn't benefit from such retains
its procedural character.
OO is a great organizer, but every paradigm runs afoul of Sturgeon's
Law if over-driven.
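A minimal sketch of that factoring, assuming two hypothetical input formats (comma-separated and whitespace-separated text). The class and function names are illustrative only and are not taken from the project described above. Each subclass carries its own parsing, so the caller no longer branches on a type tag.

```python
class InputDataBase:
    """Common interface: every input format must provide parse()."""

    def __init__(self, text):
        self.text = text

    def parse(self):
        raise NotImplementedError


class CommaInput(InputDataBase):
    # Hypothetical format: one "x,y" pair per line.
    def parse(self):
        return [tuple(float(v) for v in line.split(","))
                for line in self.text.splitlines() if line.strip()]


class WhitespaceInput(InputDataBase):
    # Hypothetical format: one "x y" pair per line.
    def parse(self):
        return [tuple(float(v) for v in line.split())
                for line in self.text.splitlines() if line.strip()]


def load(source):
    # No if/else on a type tag: the object decides how to parse itself.
    return source.parse()
```

Adding a third format then means adding a third subclass; load() stays as it is.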
HTH,
Chris
 
Dave Cook

chris> I've been scripting with python for a while now. Basically writing a few
chris> functions and running in the ipython shell. That's been very useful. But the
chris> more I do this the more I see that I'm doing more or less the same thing
chris> over and over again.

When that happens, it's probably a good sign that you need to create a
module for that functionality.

As for OO in Python, IMO it's best just to dive in and not worry about being
"methodologically correct" at first. Unfortunately, most books on OO use
static languages for their examples, which usually obscures concepts that
are extremely simple in Python. Books using Smalltalk are not too bad,
though, for example _Smalltalk, Objects, and Design_ by Chamond Liu.

Book chapters on Python OO basics:

http://diveintopython.org/object_oriented_framework/index.html
http://www.ibiblio.org/g2swap/byteofpython/read/oops.html
http://www.pasteur.fr/formation/infobio/python/ch18.html

fraca7's series on design patterns, where the moral of the story is often
"You don't need to do that in Python.":

http://fraca7.free.fr/blog/index.php?Python

Dave Cook
 
Caleb Hattingh

Chris

chris> 1. get arbitrary numerical data (typically large data sets in columnar
chris> format or even via COM from other packages. I generally have to deal with
chris> one or more sets of X,Y data)
chris> 2. manipulate the data (scaling, least squares fitting, means, peaks,
chris> add/subtract one XY set from another etc)
chris> 3. plot data (original set, results of manipulation, scatterplot,
chris> histograms etc - I use matplotlib)

Matplotlib is really coming on. I still use gnuplot out of familiarity
(and features, to be sure) but one of these days I'm going to spend a bit
of time on Matplotlib.

chris> 4. export data (print, csv, shelve)

I do very much the same kind of work with python. I write mostly in
Delphi at work, but for processing stuff like this, I always use python
when the dataset is not too big and the processing of the data is not too
expensive. Despite the awesome Delphi IDE, python with a text editor is
*still* more productive (for *me*) in jobs like this.

chris> I have no problem writing bits of functional code to do any of the above.
chris> But for the life of me I can't see how I can hook them all together in an
chris> OO-based framework that I can build and extend (with more data formats,
chris> manipulations, GUI etc).

To be honest, I am probably a poor source of advice here because I think I
tend to overuse the class paradigm when often a more sensible approach
would be a top-down strategy. I just tend to think in terms of objects,
for better or worse. At least python's class declarations are not
expensive in terms of setting up, so I tell myself it's ok.

Let's look at what you suggested:

chris>     class XY:
chris>         def read_file(self): ...
chris>         def scale_data(self): ...
chris>         def plot_data(self): ...
chris>         def shelve_data(self): ...

This is exactly the kind of thing I do as well, but maybe separate the
dataset from the processing (and put the classes into separate files, if
you prefer - I find the "one class per file" idea easier to manage in my
editor, Vim):

    class XYpoint(object):
        def __init__(self, x=0, y=0):
            self.x = x
            self.y = y

    class Dataset(object):
        def __init__(self):
            # Will be a list of XYpoint objects - probably filled in with Gather?
            self.data = []

        def Gather(self, source):
            pass  # You fill this in, using source as you prefer

        def Scale(self, factorX, factorY):
            # Filled out with example implementation below
            for point in self.data:
                point.x = point.x * factorX
                point.y = point.y * factorY

        def Plot(self):
            pass  # Do what you gotta do

        def Shelve(self):
            pass  # Do what you gotta do

    class MultipleDatasets(object):
        def __init__(self):
            # Will be a list of Dataset objects, which you must populate
            self.datasets = []

        def PlotAll(self):
            # How to plot all your datasets, for example
            for dataset in self.datasets:
                dataset.Plot()

[FWIW - This is how I write all my python programs - very very naively and
simply. I just cannot remember all the fancy things enough to use when I
need to get something done. This is the kind of simple syntax that lured
me to python in the first place, and I have a disconcerting feeling about
all the "advanced" features for "master" programmers creeping into the
language lately, or at least getting discussed - I don't think I am smart
enough to absorb it all - CS isn't my area]

chris> But somehow that doesn't feel right, especially when I expect the number of
chris> methods will grow and grow, which would make the class very unwieldy.

I would be interested to know if you think what I wrote "feels" right to
you or not - It certainly feels "right" to me, but then that is hardly
surprising. In any case, what I presented almost exactly fits how I
"think" about the problem, and that is what I want.

?

regards
Caleb
 
Robert Kern

chris> When I think about what I should do I end up with a class XY that has a
chris> method for everything I want to do, e.g.:
chris>
chris>     class XY:
chris>         def read_file(self): ...
chris>         def scale_data(self): ...
chris>         def plot_data(self): ...
chris>         def shelve_data(self): ...
chris>
chris> But somehow that doesn't feel right, especially when I expect the number of
chris> methods will grow and grow, which would make the class very unwieldy.

I think that a key thing to remember, especially as you start learning
OO design, is to not succumb to analysis paralysis. If all you need
right now are those four methods, then implement those four methods. You
don't know what the other 16 methods are going to be, so don't bother
with them yet. When you do end up with the groaning, 20-method behemoth
(and you will), then you will have a better idea of what capabilities
you need and who needs to talk to whom.

At this point, you refactor. Do not fear the refactoring. Embrace it.
There's much to be said for just getting it right the first time, but
let's face it: it never happens because what's "right" is rarely known
at the beginning of a project. Eventually, as you accumulate experience
and find designs that work and ones that don't, you'll begin to have a
better idea of what's "right" ahead of time and your initial designs are
going to get better and better. But for now, don't let the fear of bad
design prevent you from writing code. We all do bad designs; we just fix
them later.

In short slogans: Just Do It. Make It Work, Then Make It Right. Refactor
Mercilessly. Do the Simplest Thing That Could Possibly Work.

--
Robert Kern

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 
corey.coughlin

I also have a little trouble with creating megaclasses. Usually I just
try to think about what things are a little bit, and how I'm going to
be using them. I think somebody else suggested a top down approach,
and that makes a certain amount of sense.

But at this point, you're probably getting tired of all the vague
advice. It sounds like you don't have a very complicated set of
concepts to deal with, so you probably won't wind up with too many
classes, but having a single giant one is probably inconvenient.
Here's how I would do things (although keep in mind, I only have a
vague understanding of what you really need to do..)

First off, you have your (x,y) datasets. It sounds like you perform
most of your operations on those, so they're probably a good candidate
for a class.

You may also want a class for the (x,y) points themselves, especially
if you have any operations that collapse a point set to a single value
or values (like your means and peaks and stuff.)

So the methods on your (x,y) dataset probably want to be restricted to
your mathematical operations. Although you'll also definitely want to
look at the representation that gets passed onto your other classes.

Thinking beyond those basic classes, you probably want to now think
about your input and output operations. A general way of doing OOP for
things like this is to try to abstract your input and output operations
into objects based on what kinds of inputs or outputs they are. In
this case it sounds like you have files, which are both input and
output, and plots, which are just output. Both seem to take either
single x,y sets or a number of sets. So each I/O object would appear
to be some sort of object container for your x,y sets.

Take the plot object. It will probably be fairly simple, just
something you pass sets of x,y lists to and it plots them. You may
want some more capabilities, like naming the sets, or giving the axes
units, or whatever, but you can probably add that stuff later.

Now for the file object. Luckily, python already includes a file
object, so you already have something to base yours on. For the files,
you have two basic operations, the read and the write. For the read,
you'll need to pass in a filename, and have a way to return the
datasets (like iterating over a data file produces a dataset, hm...).
For output, you'll again need a filename, and a set of datasets to
write. For both operations, you may also want to provide formatting
options, unless you can figure it out from the filename or something.
So the ultimate hierarchy might look something like this:

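    class point:
        def __init__(self, x, y):
            self.x = x
            self.y = y

    class dataset:
        def __init__(self):
            self.data = []    # list of points
            self.units = ''   # ?? maybe
        def sum(self): ...
        def mean(self): ...
        # and so on

    class datafile:
        def __init__(self, filename, filetype):
            self.filename = filename
            self.filetype = filetype
            self.datasets = []   # list of data sets?
        def read(self): ...
        def write(self): ...
        # and so on

    class plot:
        def __init__(self):
            self.datasets = {}   # dictionary of data sets, each named?
            self.xaxis = ''
            self.yaxis = ''
        def makeplot(self): ...
        # and so on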

Looking at it like that, it might make some sense to come up with
something like a dataset collection class that you can pass to plot or
datafile, but if you can use something like a list (if the data is
fairly arbitrary) or a dictionary (if the datasets have names or some
other way of signifying what they're about), that will probably be
enough. Anyway, that's just a rough sketch, hopefully it will give
you some ideas if it doesn't seem like the best solution.
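To make the dataset part of that sketch a bit more concrete, here is one possible filling-in. It is only an illustration: the names Point, Dataset and mean_y are made up for this example, and the point-by-point sum assumes both sets were sampled at the same x values.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class Dataset:
    def __init__(self, points):
        self.data = list(points)

    def mean_y(self):
        # Collapse the set to a single value.
        return sum(p.y for p in self.data) / len(self.data)

    def __add__(self, other):
        # Point-by-point sum; assumes both sets share the same x values.
        if len(self.data) != len(other.data):
            raise ValueError("datasets differ in length")
        return Dataset(Point(a.x, a.y + b.y)
                       for a, b in zip(self.data, other.data))
```

With something like this, "adding two XY sets" is just `total = first + second`, and neither original set is modified.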
 
Terry Hancock

chris> So it feels like I need to get into class programming
chris> with all its attendant benefits. However my biggest problem is a conceptual
chris> one. I just can't get my head around defining suitable classes, how they
chris> acquire data and communicate with each other. I'm hoping some of you python
chris> lamas out there might be able to share some of your wisdom on the subject.

Doubtful. Llamas speak Perl.

It's some kind of rodent that does Python. "Womp rats", probably. And Pythons,
of course.

chris> What I basically do is a lot of the following:
chris>
chris> 1. get arbitrary numerical data (typically large data sets in columnar
chris> format or even via COM from other packages. I generally have to deal with
chris> one or more sets of X,Y data)
chris> 2. manipulate the data (scaling, least squares fitting, means, peaks,
chris> add/subtract one XY set from another etc)
chris> 3. plot data (original set, results of manipulation, scatterplot, histograms
chris> etc - I use matplotlib)
chris> 4. export data (print, csv, shelve)

Well, obviously you have some options, but let's think about everything
in that description that *could* be an object:

* Data set (1)
* Individual row of a data set (1)
* Transformation of a data set (2)
* Plot data (3)
* Plot (3) (probably same as "Plot data")

Chances are that "Individual row of a data set" is a dumb enough object
to be represented by a built-in type (maybe a tuple). "Data set" is obviously
a collection, possibly ordered, so it might subclass "list" (or just be a list).

So far, though, you could just use a "list of tuples". Serialization/deserialization
of the data is one reason for subclassing list, though. Or you could have
an object representing the file or serialized version, which has a method for
returning the data set as a list.
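As a rough sketch of the "subclass list for serialization" option, assuming the data arrives as two-column CSV with no header row (the class name DataSet and its method names are invented for this example):

```python
import csv
import io


class DataSet(list):
    """A list of (x, y) tuples that knows how to (de)serialize itself."""

    @classmethod
    def from_csv(cls, stream):
        # Assumes two numeric columns and no header row.
        return cls((float(x), float(y)) for x, y in csv.reader(stream))

    def to_csv(self, stream):
        csv.writer(stream).writerows(self)
```

Everything else (indexing, len(), iteration, slicing) is inherited from list, so the object can still be handed to any code that expects a plain list of tuples.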

"Transformation" is almost certainly a class you want to define. This would
obviously hold the data required to define the transformation and at least
one method for applying the transformation to a data set. You might even
define math operators on it so you can use operator notation like:

    my_transform = Transformation(... initialization values defining the transform ...)
    xformed_data_set = my_transform * original_data_set

But of course, you could just use methods:

    xformed_data_set = my_transform.apply(original_data_set)
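For illustration only, here is what such a class might look like if the transformation is assumed to be a simple linear rescaling (scale factors and offsets for x and y). The parameter names are hypothetical; a real Transformation could hold anything that defines the transform.

```python
class Transformation:
    """Linear rescaling: x -> x*sx + dx, y -> y*sy + dy."""

    def __init__(self, sx=1.0, sy=1.0, dx=0.0, dy=0.0):
        self.sx, self.sy, self.dx, self.dy = sx, sy, dx, dy

    def apply(self, data_set):
        # Return a new list; the original data set is left untouched.
        return [(x * self.sx + self.dx, y * self.sy + self.dy)
                for x, y in data_set]

    # Lets "my_transform * data_set" mean the same as apply().
    __mul__ = apply
```

Because the transform is an object, it can be stored, reused on several data sets, or listed alongside others in a processing pipeline.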

"Plot" is another good defined-class. Initialize it with things like the plotting
window, scaling, etc and what data set it applies to. It's nice because it can
be 1:1 with the actual display widget in your GUI (if you have one). Candidates
for attributes would include the lower and upper X and Y limits, log or linear
scale, etc. Candidates for methods would be displaying the plot, printing
the plot, converting to a string description, etc.
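A rough sketch of such a Plot class follows. It assumes the display is any object with matplotlib-style plot(), set_xlim() and set_ylim() methods (a matplotlib Axes has these); the class itself and its add/draw methods are invented for this example.

```python
class Plot:
    """Owns the view settings; data sets are handed to it, not vice versa."""

    def __init__(self, xlim=None, ylim=None):
        self.xlim = xlim          # (lower, upper) or None for automatic
        self.ylim = ylim
        self.data_sets = {}       # name -> list of (x, y) tuples

    def add(self, name, data_set):
        self.data_sets[name] = data_set

    def draw(self, ax):
        # `ax` is assumed to behave like a matplotlib Axes.
        for name, data_set in self.data_sets.items():
            xs = [x for x, _ in data_set]
            ys = [y for _, y in data_set]
            ax.plot(xs, ys, label=name)
        if self.xlim is not None:
            ax.set_xlim(*self.xlim)
        if self.ylim is not None:
            ax.set_ylim(*self.ylim)
```

Two data sets end up on the same graph simply because both were added to the same Plot; the data sets never need to know about each other or about the screen.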

chris> Clearly I'm having some conceptualisation problems here. Hope someone can
chris> shed some light on the subject.

I think you are trying to use an object as a module. Better to use a module
for that. ;-)

Meanwhile, meditate on what "object" or "thing" means to you, and how
it might map to programming concepts. What is each "thing" that you can
imagine swapping out in a more sophisticated implementation with lots
of different variations (e.g. data formats, GUIs, etc.)? Clearly modularizing
along the lines of interchangeable elements is also a good approach.

HTH,
Terry
 
flupke

Robert> In short slogans: Just Do It. Make It Work, Then Make It Right. Refactor
Robert> Mercilessly. Do the Simplest Thing That Could Possibly Work.

+1 QOTW

Very good advice IMO.
I would like to add that for the simpler classes, thinking of how you
want to use data can be a great starting point.
I recently programmed an interface to a Firebird database and asked myself,
how do I want to be able to use the software?
I thought of this:

    table = fb.Table("datafile.fdb","customers")
    vals = {}
    vals["name"]="customer1"
    vals["city"]="mytown"
    table.insert(vals)

It looked like a great way to access and use it and it hides all the SQL
details. Well, that's how I started and I had to refactor along the way
too :)
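The fb module itself is not shown here, so the following is only a guess at the shape of such a wrapper. It swaps in the standard-library sqlite3 module in place of Firebird so that it runs anywhere, and it takes an open connection where the usage above takes a filename; both choices are assumptions made for this sketch.

```python
import sqlite3


class Table:
    """Hides the SQL behind table.insert({...}), as in the usage above."""

    def __init__(self, connection, name):
        self.connection = connection
        self.name = name   # assumed to be a trusted identifier

    def insert(self, vals):
        columns = ", ".join(vals)
        marks = ", ".join("?" for _ in vals)
        sql = f"INSERT INTO {self.name} ({columns}) VALUES ({marks})"
        # Values go through placeholders; only names are interpolated.
        self.connection.execute(sql, list(vals.values()))
        self.connection.commit()
```

The point is the order of work: the calling code was written first, and the class was then shaped to make that calling code possible.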

Regards,
Benedict
 
chris

Extremely grateful for all the responses. I've pasted them all into a
document and can now read all your valuable ideas together. Even at a first
reading they have already helped clarify my thinking.

Also minor clarifications:

chris> I'm hoping some of you python
chris> lamas out there might be able to share some of your wisdom on the subject.

lama = guru = teacher (not a furry animal, although my dog has certainly
taught me a few tricks ... like when to take her for a walk, when to play
ball, and when it's time for a tummy rub.)

be well and happy always
 
Florian Diesch

chris> I've been scripting with python for a while now. Basically writing a few
chris> functions and running in the ipython shell. That's been very useful. But the
chris> more I do this the more I see that I'm doing more or less the same thing
chris> over and over again. So it feels like I need to get into class programming
chris> with all its attendant benefits. However my biggest problem is a conceptual
chris> one. I just can't get my head around defining suitable classes, how they
chris> acquire data and communicate with each other. I'm hoping some of you python
chris> lamas out there might be able to share some of your wisdom on the subject.

Just some thoughts about it:

chris> What I basically do is a lot of the following:
chris>
chris> 1. get arbitrary numerical data (typically large data sets in columnar
chris> format or even via COM from other packages. I generally have to deal with
chris> one or more sets of X,Y data)

You may create a class for each data format that reads that data and
creates a set from it.

chris> 2. manipulate the data (scaling, least squares fitting, means, peaks,
chris> add/subtract one XY set from another etc)

These methods may either manipulate your set's data or create a new set
with the data.
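The two options can be sketched side by side. This is only an illustration, with a made-up class XYSet holding (x, y) tuples and scaling of the y values as the example manipulation:

```python
class XYSet:
    def __init__(self, points):
        self.points = list(points)   # list of (x, y) tuples

    def scale(self, factor):
        # In place: this object's own data changes.
        self.points = [(x, y * factor) for x, y in self.points]

    def scaled(self, factor):
        # Non-mutating: returns a new set, leaving this one as it was.
        return XYSet((x, y * factor) for x, y in self.points)
```

Returning a new set is usually the safer choice when the original data should still be available for plotting next to the result.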

chris> 3. plot data (original set, results of manipulation, scatterplot, histograms
chris> etc - I use matplotlib)

I have never used matplotlib. Maybe it's useful to have one or more classes
covering the functions you need.

chris> 4. export data (print, csv, shelve)

Again, have a class for each output format.

chris> I have no problem writing bits of functional code to do any of the above.
chris> But for the life of me I can't see how I can hook them all together in an
chris> OO-based framework that I can build and extend (with more data formats,
chris> manipulations, GUI etc).
chris>
chris> When I think about what I should do I end up with a class XY that has a
chris> method for everything I want to do, e.g.:
chris>
chris>     class XY:
chris>         def read_file(self): ...
chris>         def scale_data(self): ...
chris>         def plot_data(self): ...
chris>         def shelve_data(self): ...
chris>
chris> But somehow that doesn't feel right, especially when I expect the number of
chris> methods will grow and grow, which would make the class very unwieldy.
chris>
chris> Even if that was a legitimate option, I don't understand conceptually how I
chris> would, for example, plot two different XY objects on the same graph or add
chris> them together point by point. How do two different XY objects communicate

Have a look at the IntervalSet module someone announced here some time
ago to get some ideas.

chris> and how do you deal with the thing that they must have in common (the plot
chris> screen, for example)?

Create classes for these things. Then you may either pass the XY to a method
of the Thing or the Thing to a method of the XY.


Florian
 
Terry Hancock

chris> Also minor clarifications:
chris>
chris> lama = guru = teacher (not a furry animal, although my dog has certainly
chris> taught me a few tricks ... like when to take her for a walk, when to play
chris> ball, and when it's time for a tummy rub.)

Ah, of course. My mistake. ;-)
 
