Class or Dictionary?

Martin De Kauwe

Hi,

I have a series of parameter values (>100) which I need to pass throughout my
code. In C I would use a structure, for example. However, in
Python it is not clear to me whether it would be better to use a dictionary
or build a class object. Personally I think accessing the values is
neater (visually) with an object rather than a dictionary, e.g.

x = params['price_of_cats'] * params['price_of_elephants']

vs.

x = params.price_of_cats * params.price_of_elephants

So currently I am building a series of class objects to hold different
parameters and then passing these through my code, e.g.

class EmptyObject:
    pass

self.animal_prices = EmptyObject()
# set values directly, or read a file and populate the object
self.animal_prices.price_of_cats = 12


I would be keen to hear any reasons why this is a bad approach (if it
is, I haven't managed to work this out)? Or perhaps there is a better
one?

thanks

Martin
 
Andrea Crotti

Visually neater is not really a good parameter to judge.
And in a class you can also overload __getattr__ to get the same
syntax as a dictionary.
But (>100) parameters seems really a lot, are you sure you can't split
in many classes instead?

Classes have also the advantage that you can do many things behind the
scenes, while with plain dictionaries you can't do much...
Where do you take those parameters from and what are they used for?
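
A minimal sketch of the "behind the scenes" idea, written for Python 3. The class name `Params` and the sample values are hypothetical, not from the thread. It keeps the values in a plain dictionary and exposes both access syntaxes: `__getattr__` supplies `params.name`, while `__getitem__` supplies `params['name']`.

```python
class Params(object):
    """Hypothetical wrapper: stores values in a dict, allows both syntaxes."""

    def __init__(self, values):
        # Copy so later changes to the caller's dict do not leak in.
        self._values = dict(values)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self.__dict__["_values"][name]
        except KeyError:
            raise AttributeError(name)

    def __getitem__(self, key):
        # Dictionary-style access to the same values.
        return self._values[key]


params = Params({"price_of_cats": 12.0, "price_of_elephants": 3000.0})
x = params.price_of_cats * params["price_of_elephants"]
```

Because the wrapper is a class, validation or unit conversion could later be added in one place without touching the calling code.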
 
Martin De Kauwe

Andrea said:
Visually neater is not really a good parameter to judge.

Yes and no. It makes the code easier to read and less picky to type out,
in my opinion! But regarding the "no", that is why I posted the
question.

Andrea said:
And in a class you can also overload __getattr__ to get the same
syntax as a dictionary.
But (>100) parameters seems really a lot, are you sure you can't split
in many classes instead?

I have a number of them; some are smaller, for example switch/control flags,
but the rest can be quite large. I can split them but I don't see the
advantage particularly. Currently, using them (e.g.
params.rate_of_decomp) clearly distinguishes in the code that this was a
model parameter read in from a file. I could create more categories
but it just means more names to remember and change in the code.

Andrea said:
Classes have also the advantage that you can do many things behind the
scenes, while with plain dictionaries you can't do much...
Where do you take those parameters from and what are they used for?

I have a big model which calculates a process, but this is made up of
a number of smaller models. I have coded each of these as separate
classes which expect to receive an object, e.g. params, or two objects,
e.g. params and switches.

thanks
 
Bill Felton

From an old-time Smalltalker / object guy, take this for whatever it's worth.
The *primary* reason for going with a class over a dictionary is if there is specific behavior that goes along with these attributes.
If there isn't, if this is "just" an 'object store', there's no reason not to use a dictionary.
After all, it is not too far off the mark to say a class is just a dictionary with its own behavior set...
As another poster pointed out, 100 (or more) attributes is an oddity, I would call it a 'code smell'. Whereas a dictionary with 100 entries is no big deal at all.
But for me, the big deciding factor comes down to whether or not there is specific behavior associated with this "bundle" of attributes. If yes, class, if no, nothing wrong with dictionary.

cheers,
Bill
 
Dan Stromberg

I'd use a class rather than a dictionary - because with a class,
pylint (and perhaps PyChecker and pyflakes?) should be able to detect
typos upfront. With a dictionary, typos remain runtime timebombs.
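
A small runtime illustration of the same concern, written for Python 3. This is not pylint itself, and the class name `AnimalPrices` is hypothetical. With `__slots__` a class declares its attribute names up front, so a misspelt assignment fails immediately, whereas the same misspelling on a dictionary quietly adds a new key.

```python
class AnimalPrices(object):
    # __slots__ fixes the set of allowed attribute names.
    __slots__ = ("price_of_cats", "price_of_elephants")

    def __init__(self, price_of_cats, price_of_elephants):
        self.price_of_cats = price_of_cats
        self.price_of_elephants = price_of_elephants


prices = AnimalPrices(12.0, 3000.0)
try:
    prices.price_of_cat = 15.0  # typo: missing "s"
    typo_caught = False
except AttributeError:
    typo_caught = True

as_dict = {"price_of_cats": 12.0, "price_of_elephants": 3000.0}
as_dict["price_of_cat"] = 15.0  # same typo: silently adds a third key
```

A misspelt read such as `as_dict['price_of_cat']` would still raise `KeyError`, but only when that line actually runs.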
 
Terry Reedy

or Module;-?

Dicts are, of course, objects. That said, if all keys are identifier
strings, then attribute access is nicer. A module is essentially a dict
with all identifier keys, accessed as attributes.

mod.a is mod.__dict__['a']

Modules are sometimes regarded as singleton data-only classes.
Just put 'import params' in every file that needs to read or write them.
The stdlib has at least a few modules that consist only of constants.
One example is tkinter.constants.
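
A minimal sketch of the module-as-namespace idea, written for Python 3. To stay self-contained it builds a module object in memory with `types.ModuleType` instead of importing a real `params.py` file; the parameter names are taken from elsewhere in the thread and the values are made up.

```python
import types

# Stand-in for "import params"; a real project would have a params.py file.
params = types.ModuleType("params")
params.rate_of_decomp = 0.25

# Attribute access and the module's __dict__ refer to the same object.
same_object = params.rate_of_decomp is params.__dict__["rate_of_decomp"]

# Writing through the dict is visible as an attribute too.
params.__dict__["sla"] = 4.4
```

Because every importer shares the same module object, a value changed in one file is seen by all the others.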
 
Ethan Furman

Andrea said:
Visually neater is not really a good parameter to judge.

I strongly disagree. Code readability is one of the most important issues.

~Ethan~
 
Andrea Crotti

On 11 Feb 2011, at 19:47, Ethan Furman wrote:
I strongly disagree. Code readability is one of the most important issues.

Perfectly agree with that, but

obj.name = x
obj.surname = y

obj['name'] = x
obj['surname'] = y

are both quite readable in my opinion.
Other things are more important to evaluate in this case.

I normally always wrap dictionaries because sooner or later I'll want to add some other
features, but I don't know if that's the case here.
 
Jean-Michel Pichavant

Using classes is the best choice.

However, if there would be too many classes to define, so that
you are forced to use your EmptyObject trick of adding attributes to the
instance dynamically, I'd say that dictionaries are a more common pattern.

Ideally, you would have something like:

class PriceHolder(object):
    @classmethod
    def fromFile(cls, filename):
        # example of abstract method
        pass

class Animals(PriceHolder):
    def __init__(self):
        self.cat = None
        self.elephant = None

class Fruits(PriceHolder):
    def __init__(self):
        self.banana = None
        self.apple = None


Then you would have to write 100 PriceHolder subclasses...

JM
 
Martin De Kauwe

Hi,

Yes, I read a .INI file using ConfigParser and group similar sections (in
my opinion) into one object, which I can then pass to different
classes, e.g.

import ConfigParser  # Python 2 name; the module is configparser in Python 3
import numpy as np


class Dict2Obj:
    """ Turn a dictionary into an object.

    The only purpose of this is that I think it is neater to reference
    values x.soiltemp rather than x['soiltemp'] ;P
    """
    def __init__(self, dict):
        for k, v in dict.items():
            setattr(self, k, v)


class Parser(object):
    """ Load up all the initialisation data.

    Return the met data as an array, perhaps we should really also
    return a header as well? Return the ini file as various objects.

    """
    def __init__(self, ini_fname, met_fname):
        self.ini_fname = ini_fname
        self.met_fname = met_fname

    def load_files(self):
        # read the driving data in, this may get more complicated?
        met_data = np.loadtxt(self.met_fname, comments='#')

        # read the configuration file into different dicts, break up
        # into model_params, control and state
        config = ConfigParser.RawConfigParser()
        config.read(self.ini_fname)

        # params
        model_params = {}
        sections = ['water', 'nitra', 'soilp', 'carba', 'litter', 'envir',
                    'prodn']
        for section in sections:
            for option in config.options(section):
                model_params[option] = config.getfloat(section, option)

        # turn dict into an object
        model_params = Dict2Obj(model_params)

        # control
        control = {}
        for option in config.options('control'):
            control[option] = config.getint('control', option)

        # turn dict into an object
        control = Dict2Obj(control)

        initial_state = {}
        sections = ['cinit', 'ninit', 'vinit']
        for section in sections:
            for option in config.options(section):
                initial_state[option] = config.getfloat(section, option)

        # turn dict into objects
        initial_state = Dict2Obj(initial_state)

        return (initial_state, control, model_params, met_data)

So I would then pass these objects through my code, for example a
given class would just expect to inherit params perhaps.

class calc_weight(object):

    def __init__(self, params):
        self.params = params

There are also "states" that evolve through the code, which various
methods of a given class change. For example, let's call it the elephant's
weight. The model will do various things to predict changes in our
elephant's weight, so there will be a class to calculate his/her food
intake etc. Using this example, the other thing I wanted to do was pass
"states" around like self.elephant_weight. However, it occurred to me
that if I just used self.elephant_weight, for example, it would make it
harder to call individual components separately, so I reasoned it was
better to stick to the same method and use an object. Hence I started with
my EmptyObject thing. I hope that helps make it clearer?

class EmptyObject:
    pass


# generate some objects to store pools and fluxes...
self.pools = EmptyObject()

# add stuff to it
self.pools.elephant_weight = 12.0

thanks for all of the suggestions it is very insightful
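
For readers on Python 3.3 or later, the standard library's `types.SimpleNamespace` covers both the Dict2Obj and the EmptyObject patterns above. A minimal sketch follows; the dictionary contents are made-up stand-ins for values read from the .INI file.

```python
from types import SimpleNamespace

# Hypothetical values standing in for what ConfigParser would return.
ini_values = {"soiltemp": 15.0, "sla": 4.4}

# Equivalent of Dict2Obj(ini_values): keys become attributes.
params = SimpleNamespace(**ini_values)

# Equivalent of EmptyObject(): start empty, add attributes later.
pools = SimpleNamespace()
pools.elephant_weight = 12.0

# vars() gives back the underlying dictionary when one is needed.
back_to_dict = vars(params)
```

Unlike a bare empty class, a SimpleNamespace also prints its contents in its repr, which helps when debugging.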
 
Steven D'Aprano

Dan said:
I'd use a class rather than a dictionary - because with a class, pylint
(and perhaps PyChecker and pyflakes?) should be able to detect typos
upfront.

*Some* typos. Certainly not all.

The more complex the code -- and with 100 or so parameters, this sounds
pretty damn complex -- there is a non-negligible risk of mistakenly using
the wrong name. Unless pylint now has a "do what I mean, not what I say"
mode, it can't save you from typos like this:

params.employerID = 23
params.employeeID = 42
# much later
if params.employeeID == 23:
    # Oops, I meant employ*er*ID
    ...


Dan said:
With a dictionary, typos remain runtime timebombs.

Are your unit tests broken? You should fix that and not rely on just
pylint to detect bugs. Unit tests will help protect you against many more
bugs than just typos.

Besides, and I quote...


"I find it amusing when novice programmers believe their main job is
preventing programs from crashing. ... More experienced programmers
realize that correct code is great, code that crashes could use
improvement, but incorrect code that doesn't crash is a horrible
nightmare." -- Chris Smith
 
Martin De Kauwe

Sorry, I should have added a little more of an example to help with clarity.
After reading the .INI file I then initialise the objects I
described, e.g.

def initialise_simulation(self):
    """Set the initial conditions.

    Using values from the .ini file, set the C and N pools
    and other misc states.

    """
    for attr, value in self.initial_state.__dict__.iteritems():
        #print attr, value
        setattr(self.pools, attr, value)

    # maybe need M2_AS_HA here?
    self.pools.lai = self.params.sla * self.params.cfrac_dry_mass
    self.fluxes.nuptake = 0.0
 
Martin De Kauwe

The 100+ parameters just means "everything" can be adjusted outside of
the code; invariably most of it isn't. I am setting this up with some
Bayesian calibration in mind. A lot of those parameters will be
switches as well. I think perhaps I will write a defaults module and
just read a .INI file with the values the user wants changing.
Originally I was just going to read everything in, but maybe this is
better from a usage point of view.
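
The "defaults plus user overrides" idea could look roughly like this in Python 3. This is a sketch under assumptions: the `[params]` section name, the `DEFAULTS` dictionary, the `load_params` helper and all values are hypothetical. Only `configparser` and its `read_string`, `has_section`, `options` and `getfloat` calls come from the standard library.

```python
import configparser

# Hypothetical defaults; in practice these might live in a defaults module.
DEFAULTS = {"rate_of_decomp": 0.25, "sla": 4.4}

# Hypothetical user file that changes just one value.
USER_INI = """
[params]
sla = 5.0
"""


def load_params(ini_text, defaults=DEFAULTS):
    """Return the defaults, overridden by whatever the user supplied."""
    config = configparser.ConfigParser()
    config.read_string(ini_text)
    params = dict(defaults)  # copy so the defaults stay untouched
    if config.has_section("params"):
        for option in config.options("params"):
            if option not in defaults:
                # Reject names the model does not know about.
                raise KeyError("unknown parameter: %s" % option)
            params[option] = config.getfloat("params", option)
    return params


params = load_params(USER_INI)
```

Rejecting unknown option names means a misspelt parameter in the user's file fails loudly rather than being silently ignored.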

As I said, I am happy to consider a dictionary, although some of the
calculations are quite complex and I *think* it is easier to read this
way, rather than with a dictionary. That is purely an opinion. I don't
have a computer science background and so I am asking really, is what
I am doing very bad, and if so why? What do other people do? I have
tried to search a lot on this subject but I find the examples not very
specific (or I am reading the wrong places).

thanks.
 
Steven D'Aprano

Martin said:
The 100+ parameters just means "everything" can be adjusted outside of
the code; invariably most of it isn't.

If people aren't going to change all these parameters, why are you
spending the time and energy writing code for something that won't happen?

There are *real costs* to increasing the amount of customization people
can do with your code. The more complex your code, the more bugs it will
contain. You should consider this almost a law of nature.

The more power you give people to change parameters, the more often they
will hit unexpected interactions between parameters that you never
imagined. You should expect bug reports that will take forever to track
down, because (for example) the bug only occurs when param #37 is larger
than param #45, but only if param #62 is less than param #5 and both are
less than 100 and param #83 is a multiple of 17...

The more complex your code, the more tests you will need to be sure the
code does what you expect. The number of combinations that need testing
*explodes* exponentially. You simply cannot test every combination of
parameters. Untested code should be considered buggy until proven
otherwise.

Think very carefully before allowing over-customization. Take into
account the costs as well as the benefits. Then, if you still decide to
allow all that customization, at least you know that the pain will be
worth it.

Martin said:
As I said I am happy to consider a dictionary, although some of the
calculations are quite complex and I *think* it is easier to read this
way, rather than with a dictionary.

I believe that most of these calculations should be written as functions
which take arguments. This will allow better testing and customization.
Instead of writing complex calculations inline, you should make them
functions or methods. There's little difference in complexity between:

my_calculation(arg1, arg2, arg3, params.x, params.y, params.z)

and

my_calculation(arg1, arg2, arg3, params['x'], params['y'], params['z'])

(although, as you have realised, the second is a little more visually
noisy -- but not that much).

Better still, have the function use sensible default values:

def my_calculation(arg1, arg2, arg3, x=None, y=None, z=None):
    if x is None:
        x = params.x  # or params['x'], who cares?
    if y is None:
        y = params['y']  # or params.y
    ...


This gives you the best of both worlds: you can supply parameters at
runtime, for testing or customization, but if you don't, the function
will use sensible defaults.
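
A runnable version of that pattern, cut down to two parameters. The `Params` class, its values and the arithmetic inside `my_calculation` are placeholders chosen only so the sketch can be executed.

```python
class Params(object):
    """Hypothetical container for the model's default values."""
    x = 1.0
    y = 2.0


params = Params()


def my_calculation(arg1, x=None, y=None):
    # Fall back to the shared defaults only when the caller gave nothing.
    if x is None:
        x = params.x
    if y is None:
        y = params.y
    return arg1 * x + y


default_result = my_calculation(10.0)          # uses params.x and params.y
override_result = my_calculation(10.0, x=3.0)  # a test can inject its own x
```

A unit test can then exercise the calculation with explicit values and never needs to build the full parameter object.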

Martin said:
That is purely an opinion, I don't
have a computer science background and so I am asking really, is what I
am doing very bad, and if so why? What do other people do? I have tried
to search a lot on this subject but I find the example not very
specific, (or I am reading the wrong places).

Whether you use obj.x or obj['x'] is the least important part of the
problem. That's merely syntax.
 
Dan Stromberg

Steven said:
*Some* typos. Certainly not all.

The salient bit is not that all typos are (or are not) caught, but
that significantly more typos are caught before they cause a blowup in
production.

Steven said:
The more complex the code -- and with 100 or so parameters, this sounds
pretty damn complex -- there is a non-negligible risk of mistakenly using
the wrong name. Unless pylint now has a "do what I mean, not what I say"
mode, it can't save you from typos like this:

params.employerID = 23
params.employeeID = 42
# much later
if params.employeeID == 23:
   # Oops, I meant employ*er*ID
   ...

No, and yes, and no.

1) No: Pylint can frequently catch variables that are used before they
are set, or set but never used.
2) Yes: If you write the wrong variable, and it is read somewhere
afterward, you've still got a typo that's essentially a logic bug in
how it manifests.
3) No: You should not use variable names that differ by only a single
character, or are too easy to "thinko" for each other. E.g. it's better
to use params.companyID, params.supervisorID and params.workerID;
these are clearly distinct.

Steven said:
Are your unit tests broken? You should fix that and not rely on just
pylint to detect bugs. Unit tests will help protect you against many more
bugs than just typos.

Unit tests do not replace static analysis; they are complementary techniques.

I started out in the manifestly-typed world. Then I moved into mostly
python and bash, without any sort of static analysis, and loved it.
More recently, I've been using python with static analysis, and I've
been wondering why I lived without for so long. That doesn't imply a
rejection of duck typing - it merely means I don't want my code
embarrassing me unnecessarily (though I confess, I find Haskell and
the ML family very interesting for their static, _inferred_ typing).

It's a rare program of much size that can be tested so thoroughly that
static analysis is 100% obviated - especially in the reporting of
error conditions. An accidental traceback when you're trying to
output an important detail related to one of a few thousand error
conditions is embarrassing, and frequently could be prevented with
pylint.

Also, pylint tends to run much faster than the large battery of
automated tests I prefer to set up. IOW, it can quickly catch many
kinds of trivial blunders before incurring the development-time
expense of running the full test suite.

Steven said:
Besides, and I quote...

"I find it amusing when novice programmers believe their main job is
preventing programs from crashing. ... More experienced programmers
realize that correct code is great, code that crashes could use
improvement, but incorrect code that doesn't crash is a horrible
nightmare." -- Chris Smith

LOL. Irrespective of some crashes being beneficial, the fact remains
that easily avoided, nonessential crashes should be avoided. Pylint
enables this to be done for an important class of errors. To
eliminate them quickly and painlessly isn't precisely an attribute of
a novice.

Did you have some sort of bad experience with pylint? Do you resent
the 20 minutes it takes to set it up?
 
Steven D'Aprano

Dan said:
Did you have some sort of bad experience with pylint? Do you resent the
20 minutes it takes to set it up?

If you read my post more carefully and less defensively, you'll see that
nothing I said was *opposed* to the use of pylint, merely that pylint is
not a substitute for a good, thorough test suite.
 
Martin De Kauwe

Steven,

You make some good points and I don't disagree with you. The code is
just a transfer from some older C++ code which had all sorts of
Windows menus etc., which aren't required and would make it impossible
to run (again and again). It is not an especially complicated model,
and no doubt there are bugs (though it has been used widely for 20
years).

Regarding customisation, in this case it is not a big deal. The way
the code was (arguably poorly) written everything was declared
globally. So in fact by using the .INI file to hold all of them I am
probably saving myself a lot of time and I get away from globals. Most
of those parameters are things I won't change, so I take your point
but I see no issue in having the option frankly (and it is done so
there is no effort now!).

Ordinarily I might do what you said and have method/function calls with
all the params like that, but I felt it was making the code look very
untidy (i.e. >5 vars with each call and needing to return >5 vars). I
then decided it would be better to pass a (single) dictionary or class
object (not sure of correct word here) and I think by referencing it,
e.g. params.light_ext_coeff it is as clear as passing all the values
individually (i.e. you can track where the values came from). I am
happy with this at any rate. Moreover it is what I am used to doing in
C, where I would pass a single structure containing the necessary
values. I also think this is quite standard? I am sure to interface
with scientific libraries (such as GNU GSL) you pass structures in
this way, I was just adopting this method.

The point of this posting was just to ask those that know, whether it
was a bad idea to use the class object in the way I had or was that
OK? And if I should have just used a dictionary, why?

Thanks.
 
Andrea Crotti

On 12 Feb 2011, at 00:45, Martin De Kauwe wrote:
Hi,

yes I read a .INI file using ConfigParser, just similar sections (in
my opinion) to make one object which i can then pass to different
classes. E.G.

Ok then I suggest configobj, less buggy and much more powerful than ConfigParser:
http://www.voidspace.org.uk/python/configobj.html

(and included from python 2.7).
In this way you can also simply just carry around that dictionary, and it will be correctly
typed if you validate the input.
 
John Nagle

Don't use a class as a dictionary. It causes various forms
of grief. A dictionary will accept any string as a key, and
has no reserved values. That's not true of class attributes.
There are already many names in a class's namespace, including
any functions of the class. Attribute syntax is restricted -
there are some characters you can't use. Unicode attributes
don't work right prior to Python 3. If the names are coming
in from outside the program, there's a potential security
hole if someone can inject names beginning with "__" and
mess with the internal data structures of the class.
And, of course, you can't use a name that's a reserved
word in Python.

(This last is forever causing grief in programs that
parse HTML and try to use Python class attributes to
represent HTML attributes. "class" is a common HTML
attribute but a reserved word in Python. So such parsers
have to have a special case for reserved words. Ugly.)

In Javascript, classes are simply dictionaries, but Python
is not Javascript. If you want a dictionary, use a "dict".

John Nagle
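
The restrictions listed above can be checked mechanically. A minimal Python 3 sketch follows; the helper `is_safe_attribute_name` and the sample names are hypothetical, while `str.isidentifier` and `keyword.iskeyword` are standard library calls. It filters incoming names before they are ever used as attributes, and every one of these strings would be a perfectly valid dictionary key.

```python
import keyword


def is_safe_attribute_name(name):
    """True if name can be used as a plain, non-private attribute."""
    return (name.isidentifier()              # no "-", spaces, leading digits
            and not keyword.iskeyword(name)  # rules out "class", "for", ...
            and not name.startswith("_"))    # rules out "__dict__" and friends


names = ["soiltemp", "class", "price-of-cats", "__class__"]
safe = [n for n in names if is_safe_attribute_name(n)]

# All four strings are fine as dictionary keys.
as_dict = dict.fromkeys(names, 0.0)
```

Only `soiltemp` passes the filter here, while the dictionary accepts all four names without complaint.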
 
Martin De Kauwe

OK this was what I was after, thanks. I shall rewrite as dictionaries
then given what you have said.
 
