puzzled by name binding in local function

  • Thread starter Ulrich Eckhardt

Ulrich Eckhardt

Hello Pythonistas!

Below you will find example code distilled from a set of unit tests,
usable with Python 2 or 3. I'm using a loop over a list of parameters to
generate tests with different permutations of parameters. Instead of
calling util() with values 0-4 as I would expect, each call uses the
same parameter 4. What I found out is that the name 'i' is resolved when
Foo.test_1 is called and not substituted inside the for-loop, which
finds the global 'i' left over from the loop. A simple "del i" after the
loop proved this: the tests then failed with a NameError.

Now, I'm still not sure how best to solve this problem:
* Spelling out all permutations is a no-go.
* Testing the different iterations inside a single test is
inconvenient because I want to know exactly which permutation fails and
which others don't. Further, I want to be able to run just that one
because the tests take time.
* Further, I could generate local test() functions using the current
value of 'i' as default for a parameter, which is then used in the call
to self.util(), but that code is just as non-obviously-to-me correct as
the current code is non-obviously-to-me wrong. I'd prefer something more
stable.


Any other suggestions?

Thank you!

Uli


# example code
from __future__ import print_function
import unittest

class Foo(unittest.TestCase):
    def util(self, param):
        print('util({}, {})'.format(self, param))

for i in range(5):
    def test(self):
        self.util(param=i)
    setattr(Foo, 'test_{}'.format(i), test)

unittest.main()
 

Dave Angel

There is only one instance of i, so it's not clear what you expect.
Since it's not an argument to test(), it has to be found in the closure
to the function. In this case, that's the global namespace. So each
time the function is called, it fetches that global.

To put it another way, you're storing five function objects that all
behave identically, because each one looks up the same global i when
it's called.
If you need to have separate function objects that already know a
value for i, you need to somehow bind the value into the function object.

One way to do it, as you say, is with default parameters. A function's
default parameters are each stored in the object, because they're
defined to be evaluated only once. That's sometimes considered a flaw,
such as when they're volatile, and subsequent calls to the function use
the same value. But in your case, it's a feature, as it provides a
standard place to store values as known at function definition time.
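
A minimal sketch of the default-parameter approach outside of unittest. The names show and functions are made up for illustration; the point is only that 'j=i' is evaluated while the def statement runs, so each function object keeps its own value.

```python
functions = []
for i in range(3):
    # The default 'j=i' is evaluated now, when the def statement executes,
    # and the result is stored on the function object itself.
    def show(j=i):
        return j
    functions.append(show)

# Each function returns the value i had when it was defined.
print([f() for f in functions])

# The stored defaults can be inspected directly.
print([f.__defaults__ for f in functions])
```

Calling one of these functions with an explicit argument overrides the stored default, which is the price of this approach.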

The other way to do it is with functools.partial(). I can't readily
write you sample code, as I haven't messed with it in the case of class
methods, but partial is generally a way to bind one or more values into
the actual object. I also think it's clearer than the default parameter
approach.
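
For plain functions (not methods), a small sketch of what binding a value with functools.partial looks like. The util function and the callbacks list are hypothetical stand-ins, not code from the original tests.

```python
import functools

def util(param):
    return 'util({})'.format(param)

# partial() stores the current value of i inside each partial object,
# so nothing is looked up later when the callback runs.
callbacks = [functools.partial(util, param=i) for i in range(3)]

print([cb() for cb in callbacks])
```

Each partial object also exposes what it bound through its func, args and keywords attributes, which makes the binding easy to inspect.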


Notice that globals may be defined after a function that references
them, which is a way of cross-checking the logic you already discovered.
The names are only looked up when the function is actually called.

This same logic applies to nested functions; the class definition is an
unnecessary complication; of course I understand it's needed for unittest.

The main place where I see this type of problem is in a gui, where
you're defining a callback to be used by a series of widgets, but you
have a value that IS different for each item in the series. You write a
loop much like you did, and discover that the last loop value is the
only one used. The two cures above work, and you can also use lambda
creatively.
 

Terry Reedy

Code examples are Python 3

> Below you will find example code distilled from a set of unit tests,
> usable with Python 2 or 3. I'm using a loop over a list of parameters to
> generate tests with different permutations of parameters. Instead of
> calling util() with values 0-4 as I would expect, each call uses the
> same parameter 4. What I found out is that the name 'i' is resolved when
> Foo.test_1 is called

Names* in Python code are resolved when the code is executed.
Function bodies are executed when the function is called.
Ergo, names in function bodies are resolved when the function is called.
This is sometimes called late binding.

* This may exclude keyword names.

Late binding of global names within functions is why the following can
work instead of raising NameError.
3
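
A minimal sketch of code that behaves this way. The names f and x are made up for illustration: the function is defined before x exists, and the call still succeeds because x is bound by the time the body runs.

```python
def f():
    # Nothing is looked up here at definition time; 'x' is resolved
    # only when this body actually executes.
    return x

x = 3  # bound after the def statement, but before the call

print(f())
```

Calling f() before the assignment to x would raise NameError instead.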

Only the most recent binding of x, at the time of the call matters, as
long as there is one. Does the following really surprise you?
3

What do you expect this to print?
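
A sketch of the kind of code in question, written as the unrolled form of the exec loop that follows, so take it as a reconstruction rather than a quotation:

```python
x = 1
def f1(): print(x)
x = 2
def f2(): print(x)

x = 3
# Both functions look up the global x when called, so both see 3.
print((f1(), f2()))
```

Both calls print 3, and the final print shows (None, None) because neither function returns anything.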

Rolling the repeated code into a loop does not magically change the
behavior of def statements.

for i in range(1, 3):
    exec('''\
x = {0}
def f{0}(): print(x)'''.format(i))

x = 3
print((f1(), f2()))

This gives *exactly* the same output.

So does this:

from textwrap import dedent

for i in range(1, 3):
    exec(dedent('''
        x = {0}
        def f{0}():
            print(x)
        '''.format(i)))

x = 3
print((f1(), f2()))


Python does not do text substitution unless you explicitly ask it to, as
I did above.

Late binding is also why functions (and methods, such as .__init__) can
call functions (methods) whose definitions follow later in the code, so
don't expect this to change ;-).

> and not substituted inside the for-loop,
> Now, I'm still not sure how to best solve this problem:
> * Spell out all permutations is a no-go.
> * Testing the different iterations inside a single test, is
> inconvenient because I want to know which permutation exactly fails and

A good test framework should give specifics as to the failure. The
unittest assertxxx methods do this. In fact, emitting specific messages
is one reason there are so many methods.

The real 'problem' with multiple tests within a test function is that
the first failure ends that group of tests. But this is only a problem
during development when there *are* failures. And it is possible to
write a test function to run all tests and collect multiple error
messages before 'failing' the test.

> which others don't. Further, I want to be able to run just that one
> because the tests take time.

Whether multiple tests are buried within one function or many, running
just one of them will require some editing.

> * Further, I could generate local test() functions using the current
> value of 'i' as default for a parameter, which is then used in the call
> to self.util(), but that code is just as non-obviously-to-me correct as
> the current code is non-obviously-to-me wrong.

LOL. You know the easiest and correct solution, but reject it because it
is not 'obvious' - though it was obvious enough for you to see it.

If one understands that function definitions are executable statements
and that their execution is not magically changed by putting them inside
loops, the problem with your code should be obvious. It creates 5
*identical* function objects. So it should not be surprising that they
behave identically.

> I'd prefer something more stable.

The fact that default arg expressions are evaluated when the function is
defined is quite stable. Ain't gonna change.

> Any other suggestions?

Revise your obvious meter ;-).

> # example code
> from __future__ import print_function
> import unittest
>
> class Foo(unittest.TestCase):
>     def util(self, param):
>         print('util({}, {})'.format(self, param))
>
> for i in range(5):
>     def test(self):
>         self.util(param=i)

Executing this n times produces n identical functions. The easy fix is

    def test(self, j = i): self.util(param = j)
    setattr(Foo, 'test_{}'.format(i), test)

Another fix that should work: adapt my code above and use exec within a
loop within the class statement itself (and delete setattr).

    for i in range(5):
        exec(dedent('''
            def test_{0}(self):
                self.util(param={0})
            '''.format(i)))
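
Put into a complete class, the exec variant might look like the sketch below. A plain class stands in for the TestCase, and util returns its parameter instead of printing it, purely so the result is easy to check; both are simplifications made for this example.

```python
from textwrap import dedent

class Foo(object):
    def util(self, param):
        return param

    # Runs inside the class body, so each generated def lands
    # in the class namespace as a regular method.
    for i in range(5):
        exec(dedent('''
            def test_{0}(self):
                return self.util(param={0})
            '''.format(i)))

foo = Foo()
print([getattr(foo, 'test_{}'.format(n))() for n in range(5)])
```

Here the value is substituted into the source text as a literal, so there is no name left to look up at call time. The loop variable i itself stays behind as a class attribute unless it is deleted.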
 

Ulrich Eckhardt

Dave and Terry,

Thank you both for your explanations! I really appreciate the time you
took.

On 05.02.2013 19:07, Dave Angel wrote:
> If you need to have separate function objects that already know a
> value for i, you need to somehow bind the value into the function object.
>
> One way to do it, as you say, is with default parameters. A function's
> default parameters are each stored in the object, because they're
> defined to be evaluated only once. That's sometimes considered a flaw,
> such as when they're volatile, and subsequent calls to the function use
> the same value. But in your case, it's a feature, as it provides a
> standard place to store values as known at function definition time.

Yes, that was also the first way I found myself. The reason I consider
this non-obvious is that it creates a function with two parameters (one
with a default) while I only want one with a single parameter. This is
to some extent a bioware problem and/or a matter of taste, both for me
and for the other audience that I'm writing the code for.

> The other way to do it is with functools.partial(). I can't readily
> write you sample code, as I haven't messed with it in the case of class
> methods, but partial is generally a way to bind one or more values into
> the actual object. I also think it's clearer than the default parameter
> approach.

Partial would be clearer, since it explicitly binds the parameters:

import functools

class Foo(object):
    def function(self, param):
        print('function({}, {})'.format(self, param))
Foo.test = functools.partial(Foo.function, param=1)

f = Foo()
Foo.test(f) # works
f.test() # fails

I guess that Python sees "Foo.test" and since it is not a (nonstatic)
function, it doesn't create a bound method from this. Quoting the very
last sentence in the documentation: "Also, partial objects defined in
classes behave like static methods and do not transform into bound
methods during instance attribute look-up."

The plain-Python version mentioned in the functools documentation does
the job though, so I'll just use that with a fat comment. Also, after
some digging, I found http://bugs.python.org/issue4331, which describes
this issue. There is a comment from Jack Diederich from 2010-02-23 where
he says that using lambda or a function achieves the same, but I think
that this case shows that this is not the case.
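
For reference, a sketch of that plain-Python approach, trimmed down from the rough equivalent shown in the functools documentation. Because the result is an ordinary function, attribute lookup on an instance turns it into a bound method. The method returns a string here instead of printing, only to keep the example easy to check.

```python
def partial(func, *args, **keywords):
    # Simplified pure-Python stand-in for functools.partial.
    def newfunc(*fargs, **fkeywords):
        newkeywords = keywords.copy()
        newkeywords.update(fkeywords)
        return func(*(args + fargs), **newkeywords)
    return newfunc

class Foo(object):
    def function(self, param):
        return 'function({})'.format(param)

# newfunc is a plain function, so it becomes a bound method on lookup.
Foo.test = partial(Foo.function, param=1)

f = Foo()
print(Foo.test(f))  # works
print(f.test())     # works as well
```

On Python 3.4 and later, functools.partialmethod is intended for this situation and may be the tidier choice.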

I'm also thinking about throwing another aspect in there: Unless you're
using exec(), there is no way to put any variables as constants into the
function, i.e. to enforce early binding instead of the default late
binding. Using default parameters or functools.partial are both just
workarounds with limited applicability. Also, binding the parameters now
instead of later would reduce size and offer a speedup, so it could be a
worthwhile optimization.

> The main place where I see this type of problem is in a gui, where
> you're defining a callback to be used by a series of widgets, but you
> have a value that IS different for each item in the series. You write a
> loop much like you did, and discover that the last loop value is the
> only one used. The two cures above work, and you can also use lambda
> creatively.

Careful, lambda does not work, at least not easily! The problem is that
lambda only creates a local, anonymous function, but any names used
inside this function will only be evaluated when the function is called,
so I'm back at step 1, just with even less obvious code.
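
A small sketch of what goes wrong with a bare lambda, and of the default-argument form that does bind the value. The list names late and bound are made up for illustration.

```python
late = []
bound = []
for i in range(3):
    # 'i' is looked up when the lambda is called, not here.
    late.append(lambda: i)
    # The default 'i=i' is evaluated now and stored in the lambda.
    bound.append(lambda i=i: i)

print([f() for f in late])   # every call sees the final value of i
print([f() for f in bound])  # each call sees its own stored value
```

So lambda on its own changes nothing; it only helps in combination with one of the other binding techniques.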


Greetings!

Uli
 

Ulrich Eckhardt

Eureka!

On 06.02.2013 15:37, Dave Angel wrote:
> def myfunc2(i):
>     def myfunc2b():
>         print ("myfunc2 is using", i)
>     return myfunc2b

Earlier, Dave Angel said:
> There is only one instance of i, so it's not clear what you expect.
> Since it's not an argument to test(), it has to be found in the
> closure to the function. In this case, that's the global namespace.
> So each time the function is called, it fetches that global.

Actually, the important part missing in my understanding was the full
meaning of "closure" and how it works in Python. After failing to
understand how the pure Python version of functools.partial worked, I
started a little search and found e.g. "closures-in-python"[1], which
was a key element to understanding the whole picture.

Summary: The reason the above or the pure Python version work is that
they use the closure created by a function call to bind the values in.
My version used a loop instead, but the loop body does not create a
closure, so the effective closure is the surrounding global namespace.

:)

Uli


[1] http://ynniv.com/blog/2007/08/closures-in-python.html
 
