using "private" parameters as static storage?


Joe Strout

One thing I miss as I move from REALbasic to Python is the ability to
have static storage within a method -- i.e. storage that is persistent
between calls, but not visible outside the method. I frequently use
this for such things as caching, or for keeping track of how many
objects a factory function has created, and so on.

Today it occurred to me to use a mutable object as the default value
of a parameter. A simple example:

def spam(_count=[0]):
    _count[0] += 1
    return "spam " * _count[0]

>>> spam()
'spam '
>>> spam()
'spam spam '

This appears to work fine, but it feels a little unclean, having stuff
in the method signature that is only meant for internal use. Naming
the parameter with an underscore "_count" makes me feel a little
better about it. But then, adding something to the module namespace
just for use by one function seems unclean too.
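
For readers wondering why this works at all: a default value is evaluated once, when the `def` statement runs, and the resulting object is stored on the function. The sketch below (Python 3 syntax, unlike the Python 2 snippets in this thread) shows where the list lives and also the downside, that any caller can pass a list of their own. The variable names `first`, `second`, `stored` and `override` are only illustrative.

```python
def spam(_count=[0]):
    # The list is created once, at definition time, and reused on every call.
    _count[0] += 1
    return "spam " * _count[0]


first = spam()
second = spam()

# The persistent list is stored on the function object itself.
stored = spam.__defaults__[0]

# A caller can bypass the stored counter by supplying a list explicitly;
# the stored counter is left untouched by this call.
override = spam([9])
```

So the storage is persistent but not really hidden: it is visible both through the signature and through `spam.__defaults__`.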

What are your opinions on this idiom? Is there another solution
people generally prefer?

Ooh, for a change I had another thought BEFORE hitting Send rather
than after. Here's another trick:

def spam2():
    if not hasattr(spam2,'count'):spam2.count=0
    spam2.count += 1
    return "spam2 " * spam2.count

This doesn't expose any uncleanliness outside the function at all.
The drawback is that the name of the function has to appear several
times within itself, so if I rename the function, I have to remember
to change those references too. But then, if I renamed a function,
I'd have to change all the callers anyway. So maybe this is better.
What do y'all think?

Best,
- Joe
 

Matimus

Joe Strout said:
> One thing I miss as I move from REALbasic to Python is the ability to
> have static storage within a method -- i.e. storage that is persistent
> between calls, but not visible outside the method. I frequently use
> this for such things as caching, or for keeping track of how many
> objects a factory function has created, and so on.
>
> Today it occurred to me to use a mutable object as the default value
> of a parameter. A simple example:
>
> def spam(_count=[0]):
>     _count[0] += 1
>     return "spam " * _count[0]
>
> >>> spam()
> 'spam '
> >>> spam()
> 'spam spam '

Don't do this. It is confusing, and there are definitely (many) better
facilities in Python for handling saved state.

> Ooh, for a change I had another thought BEFORE hitting Send rather
> than after. Here's another trick:
>
> def spam2():
>     if not hasattr(spam2,'count'):spam2.count=0
>     spam2.count += 1
>     return "spam2 " * spam2.count


This is definitely preferred over the first. However, the preferred
method is just to use a class. Preserving state is what classes are
for.

>>> class Spam(object):  # class name assumed
...     def __init__(self):
...         self._count = 0
...     def spam(self):
...         self._count += 1
...         return " ".join("spam" for _ in xrange(self._count))
...
spam spam spam

It also gives you the ability to have two completely separate
instances of the same state machine.
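
A minimal sketch of that point, in Python 3 syntax. The class name `Spammer` and the variable names are placeholders, not taken from the post above. Each instance carries its own counter, so advancing one does not affect the other.

```python
class Spammer:
    """Counter state lives on the instance rather than on a function."""

    def __init__(self):
        self._count = 0

    def spam(self):
        self._count += 1
        return " ".join("spam" for _ in range(self._count))


a = Spammer()
b = Spammer()
a.spam()
a_second = a.spam()   # second call on a
b_first = b.spam()    # first call on b, which has its own counter
```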

You can use it like a function if you need to, for convenience or
backwards compatibility:
spam spam spam

Or:
>>> class Spam(object):  # class name assumed
...     def __init__(self):
...         self._count = 0
...
...     def spam(self):
...         self._count += 1
...         return " ".join("spam" for _ in xrange(self._count))
...
...     __call__ = spam
...
spam spam spam


Matt
 

J. Cliff Dyer


Matimus said:
> This is definitely preferred over the first. However the preferred
> method is just to use a class. Preserving state is what classes are
> for.

Preserving state is what *objects* are for. Even the builtins have
state to be preserved (list.__len__, func.func_code, for example).

Classes are for creating custom objects.
> ...     def __init__(self):
> ...         self._count = 0
> ...     def spam(self):
> ...         self._count += 1
> ...         return " ".join("spam" for _ in xrange(self._count))
> ...

Oh of course. This is a much cleaner way to return the response than
the one I used. (FYI, I used: `return ("spam " * self._count).rstrip()`
and I didn't like the rstrip even when I was doing it. Dunno why I
didn't think of it.)

> ...     def __init__(self):
> ...         self._count = 0
> ...
> ...     def spam(self):
> ...         self._count += 1
> ...         return " ".join("spam" for _ in xrange(self._count))
> ...
> ...     __call__ = spam
> ...

Interesting. I hadn't thought of making __call__ a synonym for an
existing method. I think I like that, but I'm not quite sure. There's
something that nags at me about having two ways to do the same thing,
but I like giving the method a more descriptive name than __call__.

Cheers,
Cliff
 

rurpy

Matimus said:
> This is definitely preferred over the first. However the preferred
> method is just to use a class. Preserving state is what classes are
> for.

J. Cliff Dyer said:
> Preserving state is what *objects* are for.

Not exclusively; generators also preserve state.

def _spam():
    count = 1
    while 1:
        yield "spam " * count
        count += 1

# Create one generator and bind its next method, so each spam() call
# returns the next value.
spam = _spam().next
 

Aaron Brady

Joe Strout said:
> This doesn't expose any uncleanliness outside the function at all.
> The drawback is that the name of the function has to appear several
> times within itself, so if I rename the function, I have to remember
> to change those references too. But then, if I renamed a function,
> I'd have to change all the callers anyway. So maybe this is better.
> What do y'all think?

Worse yet, if you define a duplicate object at the same scope with the
same name later, it breaks all your references within the function to
itself.

One way around it, which I like the idea of but I'll be honest, I've
never used, is giving a function a 'self' parameter. You could make
it a dictionary or a blank container object, or just the function
itself.

@self_param
def spam( self ):
    self._count[0] += 1  #<--- how to initialize?
    return "spam " * self._count[0]

Only problem is, how do you initialize _count?

Perhaps 'self_param' can take some initializers, and just initialize
them off of **kwargs in the construction.

@self_param( _count= [0] )  # [0], not []: the body indexes element 0
def spam( self ):
    self._count[0] += 1
    return "spam " * self._count[0]

Looks really pretty (imo), but untested.
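
One possible way to write such a decorator, as a sketch only. `self_param` is the hypothetical name from the post above; the implementation here is an assumption (Python 3 syntax), not something the post supplies. It passes the wrapper function itself as `self` and copies the keyword initializers onto it as attributes.

```python
import functools


def self_param(**initial):
    """Hypothetical decorator: pass the wrapper itself as the first argument."""
    def deco(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # 'wrapper' is a closure variable, so rebinding the module-level
            # name later does not affect this reference.
            return func(wrapper, *args, **kwargs)
        # Copy the initializers onto the wrapper as attributes.
        for name, value in initial.items():
            setattr(wrapper, name, value)
        return wrapper
    return deco


@self_param(_count=[0])
def spam(self):
    self._count[0] += 1
    return "spam " * self._count[0]


f = spam
spam = "spam and eggs"   # rebind the original name
result = (f(), f())      # the function still finds its own state
```

Because the state is reached through the closure rather than through the global name, renaming or rebinding `spam` does not break it.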
 

Paul McGuire


Initialization does not have to be in the body of the method.

>>> def spam():
...     spam._count[0] += 1  #<--- how to initialize? see below
...     return "spam " * spam._count[0]
...
>>> spam._count = [2]  # just initialize it, and not necessarily to 0
>>> spam()
'spam spam spam '
>>> spam()
'spam spam spam spam '
>>> spam()
'spam spam spam spam spam '

-- Paul
 

Arnaud Delobelle

Aaron Brady said:
> One way around it, which I like the idea of but I'll be honest, I've
> never used, is giving a function a 'self' parameter. You could make
> it a dictionary or a blank container object, or just the function
> itself.
>
> @self_param
> def spam( self ):
>     self._count[0] += 1  #<--- how to initialize?
>     return "spam " * self._count[0]
>
> Only problem is, how do you initialize _count?
>
> Perhaps 'self_param' can take some initializers, and just initialize
> them off of **kwargs in the construction.
>
> @self_param( _count= [0] )
> def spam( self ):
>     self._count[0] += 1
>     return "spam " * self._count[0]
>
> Looks really pretty (imo), but untested.

Rummaging through my ~/python/junk/ I found almost exactly the same thing:

class NS(object):
    def __init__(self, dict):
        self.__dict__.update(dict)

def static(**vars):
    ns = NS(vars)
    def deco(f):
        return lambda *args, **kwargs: f(ns, *args, **kwargs)
    return deco

@static(ncalls=0, history=[])
def foo(ns, x):
    ns.ncalls += 1
    ns.history.append(x)
    print "Number of calls: %s\nHistory:%s" % (ns.ncalls, ns.history)

>>> foo(3)
Number of calls: 1
History:[3]
>>> foo(5)
Number of calls: 2
History:[3, 5]
>>> foo('spam')
Number of calls: 3
History:[3, 5, 'spam']
 

Paul Boddie

Joe Strout said:
> One thing I miss as I move from REALbasic to Python is the ability to
> have static storage within a method -- i.e. storage that is persistent
> between calls, but not visible outside the method. I frequently use
> this for such things as caching, or for keeping track of how many
> objects a factory function has created, and so on.

Why not use a module global? It isn't hidden, but it is quite a clean
approach. Modifying your example...

spam_count = 0

def spam():
    global spam_count
    spam_count += 1
    return "spam " * spam_count

[...]
> This doesn't expose any uncleanliness outside the function at all.

I wouldn't be too worried about that. Namespaces can become somewhat
crowded with all these extra names, but there can be benefits in
exposing such names, too. If you find this too annoying, it could
be advisable to collect these entries and put them in another
namespace, maybe a class or a module.

Paul
 

Aaron Brady


That is actually susceptible to a subtle kind of bug:

>>> def spam():
...     spam._count[0] += 1
...     return "spam " * spam._count[0]
...
>>> spam._count = [2]
>>> spam()
'spam spam spam '
>>> f = spam
>>> f()
'spam spam spam spam '
>>> spam = 'spam and eggs'
>>> f()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in spam
AttributeError: 'str' object has no attribute '_count'

It would be worse if you assigned 'spam' to another function! Of
course one option is 'just don't do that', which is alright. Adding
the self parameter is just a second option, if you need the function
to change names. Though the decorator is a nice place for the
initialization.
 

Steve Holden

Ben said:
> This is precisely what classes are for: allowing functionality and
> state to exist in a single object.
>
> Bind the state to a name with a single leading underscore (‘_foo’),
> which is the convention for “not part of the public interface”.
> Don't look for hard access restrictions, though, because they don't
> really exist in Python.

Neither do typed variables, but that's not going to stop Joe ;-)

regards
Steve
 

Steven D'Aprano

Joe Strout said:
> One thing I miss as I move from REALbasic to Python is the ability to
> have static storage within a method -- i.e. storage that is persistent
> between calls, but not visible outside the method. I frequently use
> this for such things as caching, or for keeping track of how many
> objects a factory function has created, and so on.
>
> Today it occurred to me to use a mutable object as the default value of
> a parameter. A simple example:
>
> def spam(_count=[0]):
>     _count[0] += 1
>     return "spam " * _count[0]

This is a common trick, often used for things like caching. One major
advantage is that you are exposing the cache as an *optional* part of the
interface, which makes testing easier. For example, instead of a test
that looks something like this:


cache = get_access_to_secret_cache() # somehow
modify(cache)
result = function(arg)
restore(cache)
assert something_about(result)

you can simply do this:

result = function(arg, _cache=mycache)
assert something_about(result)


Periodically people complain that Python's mutable default argument
behaviour is a problem, and ask for it to be removed. I agree that it is
a Gotcha that trips up newbies, but it is far too useful to give up, and
simple caching is one such reason.
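
A small sketch of the caching use and the testing benefit described above, in Python 3 syntax. The function `square` and the cache contents are made up for illustration; only the `_cache` default-argument pattern comes from the discussion.

```python
def square(n, _cache={}):
    # _cache defaults to one dict shared by every call that omits it.
    if n not in _cache:
        _cache[n] = n * n      # stand-in for an expensive computation
    return _cache[n]


# Normal use fills the shared cache.
normal = square(4)

# A test can inject a throwaway cache without touching the shared one.
fake_cache = {4: "sentinel"}
injected = square(4, _cache=fake_cache)
still_normal = square(4)
```

The injected cache is used only for that one call, so there is nothing to save and restore afterwards.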

> Ooh, for a change I had another thought BEFORE hitting Send rather than
> after. Here's another trick:
>
> def spam2():
>     if not hasattr(spam2,'count'):spam2.count=0
>     spam2.count += 1
>     return "spam2 " * spam2.count
>
> This doesn't expose any uncleanliness outside the function at all. The
> drawback is that the name of the function has to appear several times
> within itself, so if I rename the function, I have to remember to change
> those references too. But then, if I renamed a function, I'd have to
> change all the callers anyway. So maybe this is better. What do y'all
> think?

I've used this myself, but to me it feels more icky than the semi-private
argument trick above.
 

Steven D'Aprano

Paul Boddie said:
> Why not use a module global? It isn't hidden, but it is quite a clean
> approach. Modifying your example...

For some definition of "clean".

http://archive.eiffel.com/doc/manuals/technology/bmarticles/joop/globals.html

http://weblogs.asp.net/wallen/archive/2003/05/08/6750.aspx

Python globals aren't quite as bad, because they are merely global to a
module and not global to your entire application. Nevertheless, your
example is one of the *worst* usages for globals. See below.

> spam_count = 0
>
> def spam():
>     global spam_count
>     spam_count += 1
>     return "spam " * spam_count
>
> [...]
> This doesn't expose any uncleanliness outside the function at all.

Nonsense. Any other function in the code can write to spam_count and
cause all sorts of havoc. Any function that needs to temporarily modify
spam_count needs to be careful to wrap it with a save and a restore:

n = spam_count
spam_count = 47
s = spam()
spam_count = n


This is precisely one of the anti-patterns that global variables
encourage, and one of the reasons why globals are rightly considered
harmful. Don't Do This.
 

Bruno Desthuilliers

Joe Strout wrote:
> One thing I miss as I move from REALbasic to Python is the ability to
> have static storage within a method

s/method/function/

> -- i.e. storage that is persistent
> between calls, but not visible outside the method. I frequently use
> this for such things as caching, or for keeping track of how many
> objects a factory function has created, and so on.
>
> Today it occurred to me to use a mutable object as the default value of
> a parameter. A simple example:
>
> def spam(_count=[0]):
>     _count[0] += 1
>     return "spam " * _count[0]
>
> >>> spam()
> 'spam '
> >>> spam()
> 'spam spam '
>
> This appears to work fine, but it feels a little unclean, having stuff
> in the method signature that is only meant for internal use.

It's indeed a hack. But it's a very common one, and

> Naming the
> parameter with an underscore "_count"

is also a pretty idiomatic way to warn that it's implementation-only.

> makes me feel a little better
> about it.
> But then, adding something to the module namespace just for
> use by one function seems unclean too.
> What are your opinions on this idiom? Is there another solution people
> generally prefer?

If it's really in a *method*, you can always use a class (or instance,
depending on concrete use case) attribute. Else, if you really insist on
cleanliness, you can define your own callable:

class Spammer(object):
    def __init__(self):
        self._count = 0
    def __call__(self):
        self._count += 1
        return "spam " * self._count

spam = Spammer()

But this might be a little overkill for most concrete use cases.

NB : you'll also have to implement __get__ if you want Spammer instances
to be used as methods.
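
A sketch of what that `__get__` might look like, in Python 3 syntax. The `Breakfast` class and the extra `obj` parameter on `__call__` are assumptions added for the example; `types.MethodType` is the standard way to bind a callable to an instance.

```python
import types


class Spammer:
    def __init__(self):
        self._count = 0

    def __call__(self, obj=None):
        # obj receives the owning instance when called as a bound method.
        self._count += 1
        return "spam " * self._count

    def __get__(self, instance, owner=None):
        # Accessed on the class: return the callable itself.
        if instance is None:
            return self
        # Accessed on an instance: bind it, as a plain function would be.
        return types.MethodType(self, instance)


class Breakfast:
    spam = Spammer()   # used like a method; the counter is shared class-wide


b = Breakfast()
first = b.spam()
second = b.spam()
```

Note that the counter here belongs to the single `Spammer` object, so every `Breakfast` instance shares it.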

> Ooh, for a change I had another thought BEFORE hitting Send rather than
> after. Here's another trick:
>
> def spam2():
>     if not hasattr(spam2,'count'):spam2.count=0
>     spam2.count += 1
>     return "spam2 " * spam2.count
>
> This doesn't expose any uncleanliness outside the function at all. The
> drawback is that the name of the function has to appear several times
> within itself, so if I rename the function, I have to remember to change
> those references too.

There's another drawback:

old_spam2 = spam2

def spam2():
    print "yadda yadda", old_spam2()


Remember that Python functions are ordinary objects...
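
To make the drawback concrete, here is a runnable Python 3 version of the same scenario (using `return` instead of `print` so the result can be inspected). The variable names `out`, `on_new` and `on_old` are only illustrative.

```python
def spam2():
    if not hasattr(spam2, "count"):
        spam2.count = 0
    spam2.count += 1
    return "spam2 " * spam2.count


old_spam2 = spam2


def spam2():
    return "yadda yadda " + old_spam2()


out = spam2()

# The counter was attached to the *new* function, because the name
# 'spam2' inside the old body is looked up at call time.
on_new = hasattr(spam2, "count")
on_old = hasattr(old_spam2, "count")
```

Nothing crashes in this case, but the state ends up on an object that the original function never intended to touch.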
 

Bruno Desthuilliers

Matimus wrote:
> This is definitely preferred over the first.

I beg to disagree. This solution stores "count" on *whatever* the name
"spam2" resolves to at runtime.

(snip)
 

Joe Strout

Aaron Brady said:
> One way around it, which I like the idea of but I'll be honest, I've
> never used, is giving a function a 'self' parameter. You could make
> it a dictionary or a blank container object, or just the function
> itself.
> ...

Arnaud Delobelle said:
> Rummaging through my ~/python/junk/ I found almost exactly the same thing:
>
> class NS(object):
>     def __init__(self, dict):
>         self.__dict__.update(dict)
>
> def static(**vars):
>     ns = NS(vars)
>     def deco(f):
>         return lambda *args, **kwargs: f(ns, *args, **kwargs)
>     return deco
>
> @static(ncalls=0, history=[])
> def foo(ns, x):
>     ns.ncalls += 1
>     ns.history.append(x)
>     print "Number of calls: %s\nHistory:%s" % (ns.ncalls, ns.history)

Thanks, Arnaud (and Aaron), that's really clever. I was thinking this
morning that something like this might be possible: a decorator that
adds the static storage with some standard name. I really like how
you've set this so that the static data is initialized right in the
decorator; that makes the intent very clear and hard to screw up.

My only regret with this one is the need to add "ns" to the parameter
list. That makes it look like part of the signature, when really (in
intent) it is not. If we have to add something to the parameter list,
we may as well do this:

def foo(x, _ns=NS(dict(ncalls=0, history=[]))):  # NS takes a dict
    ...

and skip the decorator. Then at least it's clear that this parameter
isn't really intended for callers. On the other hand, I guess the
decorator actually changes the signature as seen by the calling code,
which is a good thing. Rearranging the parameter order in the
decorator a bit, we could make this so that the intent is clear to the
reader, as well as enforced by the interpreter.

Neat stuff, thank you!

Best,
- Joe
 

Joe Strout

Steven D'Aprano said:
>> def spam(_count=[0]):
>>     _count[0] += 1
>>     return "spam " * _count[0]
>
> This is a common trick, often used for things like caching. One major
> advantage is that you are exposing the cache as an *optional* part of
> the interface, which makes testing easier. For example, instead of a
> test that looks something like this:
>
> cache = get_access_to_secret_cache() # somehow
> modify(cache)
> result = function(arg)
> restore(cache)
> assert something_about(result)
>
> you can simply do this:
>
> result = function(arg, _cache=mycache)
> assert something_about(result)

That's a very good point. I'd been working under the assumption that
any outside mucking with the cache was a Bad Thing, but for testing,
it can be darned helpful. And this is a very safe form of mucking; it
doesn't actually affect the "real" cache at all, but just substitutes
a temporary one.

Thanks also for pointing out that this is a common trick -- after all
the attacks on the very idea last night, I was wondering if I were off
alone in the woods again.

> I've used this myself, but to me it feels more icky than the
> semi-private argument trick above.

Thanks for the feedback.

Best,
- Joe
 

Arnaud Delobelle

Joe Strout said:
> My only regret with this one is the need to add "ns" to the parameter
> list. That makes it look like part of the signature, when really (in
> intent) it is not. If we have to add something to the parameter list,
> we may as well do this:
>
> def foo(x, _ns=NS(dict(ncalls=0, history=[]))):
>     ...
>
> and skip the decorator. Then at least it's clear that this parameter
> isn't really intended for callers. On the other hand, I guess the
> decorator actually changes the signature as seen by the calling code,
> which is a good thing. Rearranging the parameter order in the
> decorator a bit, we could make this so that the intent is clear to the
> reader, as well as enforced by the interpreter.

Check the wraps() function in the functools module. I think I wrote
this decorator before it came into the standard library.
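
For reference, a sketch of how the decorator might look with `functools.wraps` applied, in Python 3 syntax. `types.SimpleNamespace` stands in for the `NS` class, and `foo` returns its values instead of printing them; both are choices made for this example. As far as I know, `wraps` preserves the name and docstring but does not by itself hide `ns` from introspection of the original signature.

```python
import functools
from types import SimpleNamespace


def static(**variables):
    ns = SimpleNamespace(**variables)   # stands in for the NS class
    def deco(f):
        @functools.wraps(f)             # keep f's name and docstring
        def wrapper(*args, **kwargs):
            return f(ns, *args, **kwargs)
        return wrapper
    return deco


@static(ncalls=0, history=[])
def foo(ns, x):
    """Record x and return the call count and history so far."""
    ns.ncalls += 1
    ns.history.append(x)
    return ns.ncalls, list(ns.history)


r1 = foo(3)
r2 = foo(5)
```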
 

Aaron Brady

def static(**vars):
    ns = NS(vars)
    def deco(f):
        return lambda *args, **kwargs: f(ns, *args, **kwargs)
    return deco

@static(ncalls=0, history=[])
def foo(ns, x):
   ns.ncalls += 1
   ns.history.append(x)
   print "Number of calls: %s\nHistory:%s" % (ns.ncalls, ns.history)

One could even add 'ns' as an attribute of 'f', so that the statics
were visible from the outside: 'foo.ns.ncalls'.
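
A short sketch of that variation, in Python 3 syntax. `types.SimpleNamespace` again stands in for `NS`, and the attribute is attached to the wrapper that callers actually see; these details are assumptions, not part of the original decorator.

```python
from types import SimpleNamespace


def static(**variables):
    ns = SimpleNamespace(**variables)
    def deco(f):
        def wrapper(*args, **kwargs):
            return f(ns, *args, **kwargs)
        wrapper.ns = ns     # expose the statics for inspection
        return wrapper
    return deco


@static(ncalls=0)
def foo(ns, x):
    ns.ncalls += 1
    return x


foo("a")
foo("b")
calls = foo.ns.ncalls   # read the counter from outside the function
```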
 
