Puzzling: local variable in recursive function made global?

D

Daniel Oberski

Hello all,


I wrote this function to recurse through a tree structure of Nodes
connected by Branches.

I use a local variable seen_nodes to mark off Nodes already seen by the
function (i.e. add them to a list).

What is puzzling me is that the local variable seen_nodes seems to be
made global by Python. That is, Nodes from function calls 'lower' in the
recursion order are still there after return. Even Nodes added by
completely unrelated function calls on a different graph in the same
program are in seen_nodes. This can only mean the variable is global, but
I absolutely did not expect it nor do I understand why that should be.

The solution was to delete seen Nodes before return, so I do not have a
problem with the program. I am just very surprised by this behaviour!

I was wondering if anybody can explain to me why the interpreter should
behave in this way and where this is documented. The complete program
with usage examples as a doctest is listed below.

Thank you very much in advance,

Daniel


----------------- graph.py -------------------
"""
# The null graph:
.... # --> n3
.... # n1 --> n2 --|
.... # --> n4 --> n5
.... n1 = Node('Start')
1

"""

# Total number of times the recursive function is_terminal has been
called:
num_calls = 0


class Node:

def __init__(self, name):
self.name = name
self.branches = []

def add(self, branch):
"""Adds a branch to this node."""
self.branches.append(branch)

def is_terminal(self, seen_nodes = []):
"""Recurses from this node to see if *all* its branches eventually
end up in a terminal node. Calling this on the source node of
a
graph establishes that the graph is terminal.
Returns 1 if yes, 0 if no. """
global num_calls # For debugging
num_calls += 1

if self in seen_nodes: # deja vu, we are going in circles
return 0
seen_nodes.append(self)
# unfortunately seen_nodes is automatically made global by the
Python
# interpreter :-(. Instead of using black scoping magic I just
delete self
# from seen_nodes later on.

num_terminal = 0
for branch in self.branches:
if branch.next: # Recurse through this branch
num_terminal += branch.next.is_terminal(seen_nodes)
else: # Found the terminal node ('sink')
num_terminal += 1
# necessary because of Python's weird scoping rules (see above):
seen_nodes.remove(self)

if num_terminal == len(self.branches): # all branches are terminal
return 1
else: # at least one branch is not terminal or no branches
return 0


class Branch:
"""A branch is an edge. It is found in a Node and points to what we
can only
hope is another Node."""
next = False

def __init__(self, next = False):
self.next = next

if __name__ == "__main__":
import doctest
doctest.testmod()

print "is_terminal() was called a total of %d times." % num_calls
 
P

Peter Otten

andrew said:
Diez said:
That's not a local variable, that is a default argument. Which is in
fact only created once for each function, yes.

http://effbot.org/pyfaq/why-are-default-values-shared-between-objects.htm

a nice way of handling this was posted here just yesterday, which isn't in
the ffbot page (afaik):

def function(listvar=None):
# None will force use of empty list here:
for x in listvar or []:
# Do soemthing with contents here

or just

def function(listvar=None):
listvar = listvar or []
...

although the "if arg is None..." is pretty standard python that makes it
clear exactly what you are doing.

Plus, it works as expected (read: modifies the argument) if you explicitly
pass an empty list to the function...

Peter
 
D

Daniel Oberski

Hi Andrew,

it's not global, it's mutable. you are passing THE SAME FRICKING OBJECT
to is_terminal and appending values to it.

That, I understand. I already saw in the archives this confuses many
people, e.g. the thread "Odd behavior regarding a list". I think this is
understandable if you did not read the documentation's small print about
mutable objects. However, that was not my problem in this case.

What I was surprised about is that this object persists across calls to
is_terminal(). Diez now showed me why that is: the def statement creates
a persistent object upon the first time it is called.
sorry for the shouting, but someone asks this EVERY DAY AND I CAN'T TAKE
ANY MORE.

No problem, thanks for answering anyway.

all the best,

daniel
 
D

Daniel Oberski

Hi Peter,
Plus, it works as expected (read: modifies the argument) if you
explicitly pass an empty list to the function...

That is not so. The reason is given by Andrew Cooke in this thread.

I would "expect" that when function calls lower in the recursion
hierarchy return, the object is not changed from the point of view of the
higher-order calls. This is expectation is mistaken because it is "the
same frickin object" as Andrew said. One thing that goes wrong for
example is the last unit test given in the program.

This will happen even if an empty list is explicitly passed in the
initial call...

Thanks to everyone for responding, this really helped me understand
better!

Best regards, Daniel
 
M

MRAB

andrew said:
it's not global, it's mutable. you are passing THE SAME FRICKING OBJECT
to is_terminal and appending values to it.

instead, make a new copy:

branch.next.is_terminal(list(seen_nodes))

sorry for the shouting, but someone asks this EVERY DAY AND I CAN'T TAKE
ANY MORE.
[snip]
Perhaps as part of the Google Summer of Code you should write a script
that analyses the posts in this newsgroup and replies automatically to
such posts. :)
 
P

Peter Otten

Daniel said:
Hi Peter,


That is not so. The reason is given by Andrew Cooke in this thread.

I would "expect" that when function calls lower in the recursion
hierarchy return, the object is not changed from the point of view of the
higher-order calls. This is expectation is mistaken because it is "the
same frickin object" as Andrew said. One thing that goes wrong for
example is the last unit test given in the program.

This will happen even if an empty list is explicitly passed in the
initial call...


As someone who knows Python I would expect that a function like

.... if items is None:
.... items = []
.... items.append(item)
.... return items
....

always modifies the list argument:
a = []
append("x", a) ['x']
a # a is modified ['x']
b = ["x"]
append("x", b) ['x', 'x']
b # b is modified
['x', 'x']

I would be surprised by the behaviour of Andrew's variant:
.... items = items or []
.... items.append(item)
.... return items
....
a = []
append("x", a) ['x']
a # a is left alone []
b = ["x"]
append("x", b) ['x', 'x']
b # but b is not
['x', 'x']

This is why I prefer a test for the None sentinel over the boolean test.

Peter
 
T

Terry Reedy

If the order of node-entry into seen_nodes is never used (if particular,
if 'node in seen_nodes' is its only usage), then seen_nodes could be a
set and the 'in' operation would be O(1) instead of O(n).
 
A

Aahz

sorry for the shouting, but someone asks this EVERY DAY AND I CAN'T TAKE
ANY MORE.

Nobody's forcing you to respond. Nobody's forcing you to top-post,
either.
--
Aahz ([email protected]) <*> http://www.pythoncraft.com/

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are, by
definition, not smart enough to debug it." --Brian W. Kernighan
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,982
Messages
2,570,185
Members
46,736
Latest member
AdolphBig6

Latest Threads

Top