self modifying code

Robin Becker

When young I was warned repeatedly by more knowledgeable folk that self
modifying code was dangerous.

Is the following idiom dangerous or unpythonic?

def func(a):
    global func, data
    data = somethingcomplexandcostly()
    def func(a):
        return simple(data,a)
    return func(a)

It could be replaced by

data = somethingcomplexandcostly()
def func(a):
    return simple(data,a)

but this always calculates data.
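
A minimal runnable sketch of the idiom in question, for anyone who wants to watch it work. The bodies of somethingcomplexandcostly() and simple() and the `calls` counter are hypothetical stand-ins invented for this demonstration, and the code is assumed to run at module level so that `global func` rebinds the name the caller uses.

```python
calls = {"costly": 0}  # hypothetical counter, only here to observe the behaviour

def somethingcomplexandcostly():
    # Stand-in for the expensive setup step.
    calls["costly"] += 1
    return {"offset": 42}

def simple(data, a):
    # Stand-in for the cheap per-call work.
    return data["offset"] + a

def func(a):
    # First call only: build the data, then rebind the global name `func`
    # to a lighter function that skips the setup.
    global func, data
    data = somethingcomplexandcostly()
    def func(a):
        return simple(data, a)
    return func(a)

first = func(1)   # runs the setup and replaces func
second = func(2)  # goes straight to the replacement
```

After both calls, `calls["costly"]` shows how often the setup actually ran when the function is always reached through the global name.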
 
nikie

Robin said:
When young I was warned repeatedly by more knowledgeable folk that self
modifying code was dangerous.

Is the following idiom dangerous or unpythonic?

def func(a):
    global func, data
    data = somethingcomplexandcostly()
    def func(a):
        return simple(data,a)
    return func(a)

It took me quite a while to figure out how it works, so, yes, I'd say
it's unpythonic ;-). It's not really dangerous, but it can get nasty if
someone tries to rename the function, or put it into a class.

But that's probably not the kind of "self-modifying code" you've been
warned against anyway. I've only ever seen self-modifying code in
assembly language or in Lisp. The idea there is that you really change
the code (i.e. single opcodes in the function that's currently running),
so you can e.g. write an infinite loop and eventually overwrite the loop
statement to do something else so that the loop ends. I'm not sure
whether you can do the same thing in Python, maybe by changing the
bytecode of a running function.
It could be replaced by

data = somethingcomplexandcostly()
def func(a):
    return simple(data,a)

but this always calculates data.

You could of course initialize data with None and calculate it only on
demand. Or you could use:
http://www.phyast.pitt.edu/~micheles/python/documentation.html#memoize
This has the advantage of encapsulating the memoization logic so it can
be tested (and understood) separately from your code.
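
As a rough illustration of the memoization idea, here is a sketch that uses the standard library's functools.lru_cache (available in later Python versions) rather than the decorator from the link above. The helper name get_data and the bodies of both functions are assumptions made up for the example.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def get_data():
    # Hypothetical stand-in for somethingcomplexandcostly();
    # the cache ensures the body runs only on the first call.
    return tuple(range(10, 15))

def simple(data, a):
    # Hypothetical stand-in for the cheap per-call work.
    return data[a]

def func(a):
    # The caching logic lives entirely in get_data(), not here.
    return simple(get_data(), a)

r1 = func(1)
r2 = func(2)
info = get_data.cache_info()  # records one miss (the first call) and later hits
```

The point of the design is that func() itself stays trivial, and the "compute once" behaviour can be tested on get_data() alone.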
 
Peter Otten

Robin said:
When young I was warned repeatedly by more knowledgeable folk that self
modifying code was dangerous.

Is the following idiom dangerous or unpythonic?

def func(a):
    global func, data
    data = somethingcomplexandcostly()
    def func(a):
        return simple(data,a)
    return func(a)

It could be replaced by

data = somethingcomplexandcostly()
def func(a):
    return simple(data,a)

but this always calculates data.

Consider

data = None
def func(a):
    global data
    if data is None:
        data = costly()
    return simple(data, a)

if you want lazy evaluation. Not only is it easier to understand,
it also works with

from lazymodule import func

at the cost of just one object identity test whereas your func()
implementation will do the heavy-lifting every time func() is called in the
client (unless func() has by chance been invoked as lazymodule.func()
before the import).
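
The effect described here can be reproduced in a single file. In this sketch the name client_func stands in for what `from lazymodule import func` would bind in the client, and init_count plus the bodies of costly() and simple() are hypothetical additions for the demonstration.

```python
init_count = 0  # hypothetical counter for observing how often costly() runs

def costly():
    global init_count
    init_count += 1
    return 42

def simple(data, a):
    return data + a

def func(a):
    # Self-replacing version: rebinds the *module-level* name only.
    global func, data
    data = costly()
    def func(a):
        return simple(data, a)
    return func(a)

# The client grabs a reference before any call has happened, which is
# what a from-import does. This name is never rebound.
client_func = func

results = [client_func(1), client_func(2), client_func(3)]
```

Every call through client_func reaches the original function object, so costly() runs each time, whereas the `data is None` version only depends on the module-level data and is unaffected by how the function was imported.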

Peter
 
Robin Becker

Peter Otten wrote:
.....
.........

at the cost of just one object identity test whereas your func()
implementation will do the heavy-lifting every time func() is called in the
client (unless func() has by chance been invoked as lazymodule.func()
before the import).

In the example code the heavy lifting, costly(), is done only once, as
the function that does it is overwritten. As pointed out, it won't work
as simply in a class. Memoisation could improve the performance of the
normal-case form of the function, i.e.

def func(a):
    return simple(data,a)

but I guess that would depend on whether simple(data,a) is relatively
expensive compared to the cost of the memo lookup.
 
John J. Lee

Robin Becker said:
When young I was warned repeatedly by more knowledgeable folk that self
modifying code was dangerous.

Is the following idiom dangerous or unpythonic?

def func(a):
    global func, data
    data = somethingcomplexandcostly()
    def func(a):
        return simple(data,a)
    return func(a)

1. I don't think most people would call that "self-modifying code". I
won't try defining that term precisely because I know you'll just
pick holes in my definition ;-)

2. The use of global func is just plain weird :)

3. Peter Otten's version is OK, but how about this, using a closure
instead of globals (UNTESTED)

def make_func():
    namespace = object()
    namespace.data = None
    def func(a):
        if namespace.data is None:
            namespace.data = somethingcomplexandcostly()
        return simple(namespace.data, a)
    return func
func = make_func()


John
 
Robin Becker

John said:
1. I don't think most people would call that "self-modifying code". I
won't try defining that term precisely because I know you'll just
pick holes in my definition ;-)


Don't really disagree about the rewriting code, but the function does
re-define itself.

2. The use of global func is just plain weird :)

3. Peter Otten's version is OK, but how about this, using a closure
instead of globals (UNTESTED)

def make_func():
    namespace = object()
    namespace.data = None
    def func(a):
        if namespace.data is None:
            namespace.data = somethingcomplexandcostly()
        return simple(namespace.data, a)
    return func
func = make_func()
........
The inner function is almost precisely what I started with, except I
used the global namespace. However, it keeps the test inside the
function, which costs about 1%.
 
Ben C

When young I was warned repeatedly by more knowledgeable folk that self
modifying code was dangerous.

Is the following idiom dangerous or unpythonic?

def func(a):
    global func, data
    data = somethingcomplexandcostly()
    def func(a):
        return simple(data,a)
    return func(a)

It looks quite clever (a bit too clever ... :)
It could be replaced by

data = somethingcomplexandcostly()
def func(a):
    return simple(data,a)

but this always calculates data.

Why not just:

data = None
def func(a):
    global data

    if not data:
        data = somethingcomplexandcostly()

    return simple(data, a)

Or perhaps it would be nicer to use a "singleton" rather than a global,
something like this:

class Func(object):
    exists = False

    def __init__(self):
        assert not Func.exists
        Func.exists = True

        self.data = None

    def simple(self, a):
        assert self.data is not None
        # ... do something with self.data presumably
        return something

    def __call__(self, a):
        if self.data is None:
            self.data = somethingcomplexandcostly()
        return self.simple(a)

func = Func()

func(a)
 
Steven Bethard

John said:
1. I don't think most people would call that "self-modifying code". I
won't try defining that term precisely because I know you'll just
pick holes in my definition ;-)

2. The use of global func is just plain weird :)

3. Peter Otten's version is OK, but how about this, using a closure
instead of globals (UNTESTED)

def make_func():
    namespace = object()
    namespace.data = None
    def func(a):
        if namespace.data is None:
            namespace.data = somethingcomplexandcostly()
        return simple(namespace.data, a)
    return func
func = make_func()

Unfortunately, this doesn't work because you can't add attributes to plain
object instances:
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
AttributeError: 'object' object has no attribute 'data'

Maybe you want something like:
>>> def make_func():
...     def func(a):
...         if func.data is None:
...             func.data = somethingcomplexandcostly()
...         return simple(func.data, a)
...     func.data = None
...     return func
...
>>> def somethingcomplexandcostly():
...     print 'executing somethingcomplexandcostly'
...     return 42
...
>>> def simple(data, a):
...     return data, a
...
>>> func = make_func()
>>> func(1)
executing somethingcomplexandcostly
(42, 1)
>>> func(2)
(42, 2)

STeVe
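
For readers on Python 3, the same closure idea can also be sketched with the `nonlocal` statement, which did not exist when this thread was written. This is a swapped-in alternative to the function-attribute approach above, not part of it; the `log` list and the stand-in function bodies are assumptions for the demonstration.

```python
log = []  # hypothetical record of when the costly step runs

def somethingcomplexandcostly():
    log.append("executing somethingcomplexandcostly")
    return 42

def simple(data, a):
    return data, a

def make_func():
    data = None  # lives in the closure, not in the module namespace
    def func(a):
        nonlocal data  # Python 3 only: allows rebinding the enclosing variable
        if data is None:
            data = somethingcomplexandcostly()
        return simple(data, a)
    return func

func = make_func()
out1 = func(1)
out2 = func(2)
```

Like the attribute version, this keeps the `is None` test on every call; it only changes where the cached value is stored.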
 
Robin Becker

Ben C wrote:
........
Why not just:

data = None
def func(a):
    global data

    if not data:
        data = somethingcomplexandcostly()

    return simple(data, a)

Well, in the original case the above reduced to something like

data=None
def func(arg):
    global data
    if data is None:
        data = ......
    return ''.join(map(data.__getitem__,arg))

so the actual function is pretty low cost, and the extra cost of the
test is not very significant. But if the actual function had been
cheaper, e.g.

def func(arg):
    global data
    if data is None:
        data = ....
    return data+arg

then the test is a few percent of the total cost; why keep it?

All the other more complex solutions involving namespaces, singletons
etc seem to add even more overhead.
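
One way to put a number on the cost of the test is to time a guarded function against an unguarded one with the timeit module. This is only a sketch: the names guarded, unguarded and ready are hypothetical, the constant 42 stands in for the real data, and the figures will vary with machine and Python version.

```python
import timeit

data = None

def guarded(arg):
    # Keeps the "is None" test on every call.
    global data
    if data is None:
        data = 42
    return data + arg

ready = 42

def unguarded(arg):
    # What the function looks like once the test has been removed.
    return ready + arg

guarded(0)  # make sure the one-off initialisation is not being timed

names = {"guarded": guarded, "unguarded": unguarded}
t_guarded = min(timeit.repeat("guarded(1)", globals=names, number=100000, repeat=3))
t_unguarded = min(timeit.repeat("unguarded(1)", globals=names, number=100000, repeat=3))

print("guarded:   %.3f s per 100000 calls" % t_guarded)
print("unguarded: %.3f s per 100000 calls" % t_unguarded)
```

Taking the minimum of several repeats follows the usual timeit advice, since the slower runs mostly reflect other activity on the machine.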
 
taleinat

First of all, the test can be optimized by adding a boolean flag which
indicates if the data has been initialized or not, and then just
testing a boolean value instead of doing an "is" comparison, at the
cost of an extra global variable. But this is still ugly (actually
uglier IMO).


I think this is a more Pythonic way to do it.
This class implements a function which initializes itself upon the
first call:

class InitializingFunction(object):
    def __init__(self, init):
        def initializer(*args, **kw):
            self.func = init()
            return self(*args, **kw)
        self.func = initializer
    def __call__(self, *args, **kw):
        return self.func(*args, **kw)

Now you can write your function almost exactly like you did before:

def init():
    data = somethingcomplexandcostly()
    def foo(a):
        return simple(data, a)
    return foo
func = InitializingFunction(init)

What have we gained from this? Two major advantages:
* no ugly 'global' statement
* no reliance on the function's name

And now you can easily create such functions forever using this class
to abstract away the ugly implementation ;)


Notice that since function decorators were introduced in Python 2.4, you
can use InitializingFunction as a decorator to achieve the same effect,
this time even without the need for a temporary name for the function:

@InitializingFunction
def func():
    data = somethingcomplexandcostly()
    def foo(a):
        return simple(data, a)
    return foo


And finally I must note that no matter which way you turn this around,
it will still be hard to read!
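
A short usage sketch of the class above, showing what "no reliance on the function's name" buys in practice. The init_calls list, the alias name and the constant 42 are hypothetical additions for the demonstration.

```python
class InitializingFunction(object):
    def __init__(self, init):
        def initializer(*args, **kw):
            # First call: build the real function, store it, then delegate.
            self.func = init()
            return self(*args, **kw)
        self.func = initializer
    def __call__(self, *args, **kw):
        return self.func(*args, **kw)

init_calls = []  # hypothetical record of how often the setup runs

@InitializingFunction
def func():
    init_calls.append(1)
    data = 42  # stand-in for somethingcomplexandcostly()
    def foo(a):
        return data + a
    return foo

alias = func  # a second name bound before the first call

values = [alias(1), func(2), alias(3)]
```

Because the state lives on the wrapper object rather than in a global name, every reference to it shares the single initialisation, whichever name is used to call it first.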
 
Robin Becker

What have we gained from this? Two major advantages:
* no ugly 'global' statement
* no reliance on the function's name

I don't dispute either of the above; however, the actual overhead of
your approach appears to be much higher (see below), probably because it
has two function calls instead of one to get the answer.



And now you can easily create such functions forever using this class
to abstract away the ugly implementation ;)
........ yes indeed

######file dingo.py
class InitializingFunction(object):
    def __init__(self, init):
        def initializer(*args, **kw):
            self.func = init()
            return self(*args, **kw)
        self.func = initializer
    def __call__(self, *args, **kw):
        return self.func(*args, **kw)

def init():
    data = 42
    def foo(arg):
        return arg+data
    return foo
a = InitializingFunction(init)

def b(arg):
    global b
    data = 42
    def b(arg):
        return arg+data
    return b(arg)
######

Testing with timeit
C:\Tmp>\Python\lib\timeit.py -s"from dingo import a;a(0)" a(1)
100000 loops, best of 3: 2.25 usec per loop

C:\Tmp>\Python\lib\timeit.py -s"from dingo import b;b(0)" b(1)
1000000 loops, best of 3: 0.52 usec per loop

so since the simple function is fairly trivial the overhead seems to be
around 4 times that of the weird approach.

The global naming stuff is pretty flaky and relies on the way names are
looked up; in particular it seems as though references to the original
global will be held at least throughout a single statement. If the first
call is "print b(0),b(1)" then b is initialised twice.

This 'masterpiece of obfuscation' ;) gets round that problem, but is
pretty odd to say the least and still relies on knowing the class name.

class Weird(object):
    @staticmethod
    def __call__(arg):
        data = 42
        def func(arg):
            return arg+data
        Weird.__call__ = staticmethod(func)
        return func(arg)
c = Weird()

it is still more expensive than b, but not by much

C:\Tmp>\Python\lib\timeit.py -s"from dingo import c;c(1)" c(1)
1000000 loops, best of 3: 0.709 usec per loop
 
taleinat

Yes, my implementation was less efficient because of the extra function
call.
class Weird(object):
    @staticmethod
    def __call__(arg):
        data = 42
        def func(arg):
            return arg+data
        Weird.__call__ = staticmethod(func)
        return func(arg)
c = Weird()

Ugh... you've used a class just like a function. You can't have two
different objects of this class, since you are overriding a static
method of the class! And you've hard-coded the data into the class
definition. Yes, it works, but I would never, ever trust such code to
someone else to maintain.

And you'll have to manually define such a class for every such
function. That's not very Pythonic.

Here's a reusable function that will define such a class for you, as
well as hide most of the ugliness (in fact, it supports -exactly- the
same interface as my previous implementation):

def InitializingFunction(func):
    class temp:
        @staticmethod
        def __call__(*args, **kw):
            temp.__call__ = staticmethod(func())
            return temp.__call__(*args, **kw)
    return temp()

@InitializingFunction
def func():
    data = somethingcomplexandcostly()
    def foo(a):
        return simple(data, a)
    return foo
 
Robin Becker

Yes, my implementation was less efficient because of the extra function
call.

ugh indeed
Ugh... you've used a class just like a function. You can't have two
different objects of this class, since you are overriding a static
method of the class! And you've hard-coded the data into the class
definition. Yes, it works, but I would never, ever trust such code to
someone else to maintain.

And you'll have to manually define such a class for every such
function. That's not very Pythonic.

no arguments here

Here's a reusable function that will define such a class for you, as
well as hide most of the ugliness (in fact, it supports -exactly- the
same interface as my previous implementation):

def InitializingFunction(func):
    class temp:
        @staticmethod
        def __call__(*args, **kw):
            temp.__call__ = staticmethod(func())
            return temp.__call__(*args, **kw)
    return temp()

@InitializingFunction
def func():
    data = somethingcomplexandcostly()
    def foo(a):
        return simple(data, a)
    return foo

I already tried this kind of factory function, but in fact I think the
original global test version outperforms even the statically coded Weird
class.

ie
data = None
def d(arg):
    global data
    if data is None:
        data = 42
    return arg+data

@InitializingFunction
def e():
    data = 43
    def foo(a):
        return data+a
    return foo

are both better than any except the global function replacement nonsense

C:\Tmp>\Python\lib\timeit.py -s"from dingo import d;d(0)" d(1)
1000000 loops, best of 3: 0.556 usec per loop

C:\Tmp>\Python\lib\timeit.py -s"from dingo import e;e(0)" e(1)
1000000 loops, best of 3: 1.09 usec per loop

but the structured approach is still twice as slow as the simplistic one :(
 
taleinat

Personally, I would almost always pay the x2 efficiency price in order
to use a class. But then I don't know what you're writing.

Of course, if you really need it to be efficient, you can write it as a
C extension, or use Pyrex, etc. and get -much- better results.
 
