Destruction of generator objects

Stefan Bellon

Hi all,

I'm generating a binding from Python to C using SWIG. On the C side I
have iterators over some data structures. On the Python side I
currently use code like the following:

def get_data(obj):
    result = []
    iter = make_iter(obj)
    while more(iter):
        item = next(iter)
        result.append(item)
    destroy(iter)
    return result

Now I'd like to transform it to a generator function like the following
in order to make it more memory and time efficient:

def get_data(obj):
    iter = make_iter(obj)
    while more(iter):
        yield next(iter)
    destroy(iter)

But in the generator case I have a problem: if the generator object is
not iterated until StopIteration is raised but is abandoned earlier,
the C iterator's destroy is never called, and thus the resource is not
freed.

Is there a way around this? Can I add some sort of __del__() to the
generator object so that in case of an early destruction of the
generator object, the external resource is freed as well?

I'm looking forward to hearing your hints!
 
MRAB

def get_data(obj):
    iter = make_iter(obj)
    while more(iter):
        yield next(iter)
    destroy(iter)

[snip]

Is there a way around this? Can I add some sort of __del__() to the
generator object so that in case of an early destruction of the
generator object, the external resource is freed as well?

Simple! :)

def get_data(obj):
    iter = make_iter(obj)
    try:
        while more(iter):
            yield next(iter)
    finally:
        destroy(iter)
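Why this works: since Python 2.5, abandoning a generator or calling its close() method raises GeneratorExit inside it, so the finally clause runs even when iteration stops early. A minimal self-contained sketch (Python 3 syntax, with a plain list standing in for the C iterator and a flag standing in for destroy()):

```python
destroyed = []                      # records whether cleanup ran

def get_data(items):
    it = iter(items)                # stands in for make_iter(obj)
    try:
        for item in it:             # stands in for the more()/next() loop
            yield item
    finally:
        destroyed.append(True)      # stands in for destroy(iter)

g = get_data([1, 2, 3])
first = next(g)                     # consume only the first item...
g.close()                          # ...then stop early: GeneratorExit is
                                   # thrown in, and the finally still runs
print(first, destroyed)            # -> 1 [True]
```

Dropping the last reference to the generator has the same effect, since finalizing it also throws GeneratorExit into the suspended frame.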
 
Graham Dumpleton

[snip]

Is there a way around this? Can I add some sort of __del__() to the
generator object so that in case of an early destruction of the
generator object, the external resource is freed as well?

Perhaps read through:

http://www.python.org/dev/peps/pep-0325/
http://www.python.org/dev/peps/pep-0342/

The latter supersedes the former.

Based on these what the WSGI specification:

http://www.python.org/dev/peps/pep-0333/

did was mandate that the consumer of data from the generator
explicitly call the close() method on the generator no matter whether
all data was consumed or not, or whether an exception occurred.

result = application(environ, start_response)
try:
    for data in result:
        if data:    # don't send headers until body appears
            write(data)
    if not headers_sent:
        write('')   # send headers now if body was empty
finally:
    if hasattr(result, 'close'):
        result.close()

Because the consumer called close(), generators could provide a
close() method to clean up resources even on older versions of Python
that had no automatic means of calling close(). In other words, it
allowed the required cleanup while remaining forward compatible with
newer versions of Python.
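The producer-side half of that contract can be sketched like this (illustrative names, not code from the WSGI spec): the result object exposes an explicit close(), and the consumer's try/finally above guarantees it is called whether or not all data was consumed.

```python
class ClosingIterable:
    """An iterable with an explicit close() hook, in the spirit of PEP 333."""
    def __init__(self, items):
        self._items = iter(items)   # stands in for an external resource
        self.closed = False

    def __iter__(self):
        return self._items

    def close(self):                # the cleanup hook the consumer must call
        self.closed = True

result = ClosingIterable(["a", "b"])
chunks = []
try:
    for data in result:
        chunks.append(data)
finally:
    if hasattr(result, 'close'):    # the consumer-side contract
        result.close()
print(chunks, result.closed)        # -> ['a', 'b'] True
```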

Graham
 
Stefan Bellon

[snip]

Hm, not what I hoped for ...

Isn't it possible to add some __del__ method to the generator object
via some decorator or somehow else in a way that works even with Python
2.4 and can then be nicely written without cluttering up the logic
between consumer and producer?
 
Alex Martelli

Stefan Bellon said:
Hm, not what I hoped for ...

Isn't it possible to add some __del__ method to the generator object
via some decorator or somehow else in a way that works even with Python
2.4 and can then be nicely written without cluttering up the logic
between consumer and producer?

No, you cannot do what you want in Python 2.4. If you can't upgrade to
2.5 or better, whatever the reason may be, you will have to live with
2.4's limitations (there ARE reasons we keep making new releases, after
all...:).


Alex
 
Kay Schluehr

Sorry, I forgot to mention that I am forced to using Python 2.4.

It doesn't matter. You can use try...finally as well in Python 2.4.
It's just not possible to use except and finally clauses in one
statement such as:

try:
    1/0
except ZeroDivisionError:
    print "incident!"
finally:
    print "cleanup"

However, in theory you don't really need finally; you can simulate it
using nested try...except statements:

-----

try:
    try:
        1/0
        try:
            print "cleanup"
        except Exception:
            raise FinallyException( sys.exc_info() )
    except ZeroDivisionError:
        print "incident!"
        try:
            print "cleanup"
        except Exception:
            raise FinallyException( sys.exc_info() )
except Exception:
    exc_cls, exc_value, exc_tb = sys.exc_info()
    if exc_cls == FinallyException:
        fin_exc_cls, fin_exc_value, fin_exc_tb = exc_value[0]
        raise fin_exc_cls, fin_exc_value, fin_exc_tb
    else:
        try:
            print "cleanup"
        except Exception:
            raise FinallyException( sys.exc_info() )
        raise exc_cls, exc_value, exc_tb

-------

Note that this code is generated from the above try...except...finally
statement by Py25Lite ( see [1],[2] ), a tool that provides some
Python 2.5 constructs to programmers working with Python 2.4.

[1] http://www.fiber-space.de/EasyExtend/doc/EE.html
[2] http://www.fiber-space.de/EasyExtend/doc/Py25Lite/Py25Lite.html
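Hand-written, the same simulation is much smaller: run the cleanup once on the normal path (the else of an outer try) and once on the exception path, re-raising afterwards. A sketch in Python 3 syntax:

```python
log = []

try:
    try:
        1 / 0                       # the protected body
    except ZeroDivisionError:
        log.append("incident!")     # the incident is handled here
except BaseException:
    log.append("cleanup")           # exception path: clean up, then re-raise
    raise
else:
    log.append("cleanup")           # normal path (incident was handled)

print(log)                          # -> ['incident!', 'cleanup']
```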
 
Stefan Bellon

It doesn't matter. You can use try...finally as well in Python 2.4.
It's just not possible to use except and finally clauses in one
statement such as:

[snip]

The problem is yield inside try-finally which is not possible in 2.4.
 
Kay Schluehr

[snip]

The problem is yield inside try-finally which is not possible in 2.4.

Oh, yeah... I overlooked this.

Honestly, I'd recommend wrapping the generator in a class: create the
resource on construction (or pass it in) and destroy it by
implementing __del__.

def gen_value(self):
    while True:
        yield self.iter.next()

class GeneratorObj(object):
    def __init__(self, obj, gen):
        self.iter = make_iter(obj)
        self.gen = gen(self)

    def __del__(self):
        destroy(self.iter)

    def next(self):
        return self.gen.next()
 
Stefan Bellon

[snip]

Ok, I think there is an __iter__ missing in order to implement the
iterator protocol, and I don't see why the generator cannot be inside
the class itself.

Anyway, I came up with this code now:

class ListGenerator(object):
    def __init__(self, iter):
        print "gen init"
        self.iter = iter
        self.gen = self._value()

    def __del__(self):
        print "gen del"
        destroy(self.iter)

    def _value(self):
        print "gen value"
        while more(self.iter):
            yield next(self.iter)

    def __iter__(self):
        print "gen iter"
        return self

    def next(self):
        print "gen next"
        return self.gen.next()

When iterating over such a generator, I see the following output:
gen init
gen iter
gen next
gen value
gen next
gen next
gen next
gen next
['Item1', 'Item2', 'Item3', 'Item4']

But I do not see an output of "gen del" which makes me think that the
destructor is not called, thus not releasing the resource. It seems I
have not completely understood how generators work ...
 
Kay Schluehr

[snip]

Ok, I think there is an __iter__ missing in order to implement the
iterator protocol, and I don't see why the generator cannot be inside
the class itself.
Sure.

[...]

But I do not see an output of "gen del" which makes me think that the
destructor is not called, thus not releasing the resource. It seems I
have not completely understood how generators work ...

But why should the destructor be called? Your example does not indicate
that a ListGenerator object is destroyed anywhere, neither explicitly
using del nor implicitly by destroying the scope it lives in.
 
Stefan Bellon

But why should the destructor be called? Your example does not indicate
that a ListGenerator object is destroyed anywhere, neither explicitly
using del nor implicitly by destroying the scope it lives in.

After having constructed the list itself, the generator is exhausted
and not iterated or referenced anymore, so the generator should be
destroyed, shouldn't it?

Ok, let's make the example easier and take out the external iterator
resource and just concentrate on the Python part:

class ListGenerator(object):
    def __init__(self):
        print "gen init"
        self.gen = self._value()

    def __del__(self):
        print "gen del"

    def _value(self):
        print "gen value"
        for i in xrange(4):
            yield i

    def __iter__(self):
        print "gen iter"
        return self

    def next(self):
        print "gen next"
        return self.gen.next()

Now, doing the following:
....
gen iter
gen next
gen value
0
gen next
1
gen next
2
gen next
3
gen next
So why is the destructor not called when the generator is even
explicitly 'del'ed? Does somebody else still hold a reference on it?
But then, even when terminating the interpreter, __del__ is not called.
When I take the yield statement out, __del__ is called again. It looks
as though, as soon as a generator function is involved in the class,
__del__ is no longer called.
 
Marc 'BlackJack' Rintsch

[snip]

So why is the destructor not called when the generator is even
explicitly 'del'ed?

The generator is not ``del``\ed, just the name `a` is removed.
Does somebody else still hold a reference on it?

Yes, the interactive Python shell holds the last non-`None` result in `_`:
...
gen iter
gen next
gen value
0
gen next
1
gen next
2
gen next
3
gen next
gen del
42
But then, even when terminating the interpreter, __del__ is not called.

Because that is not guaranteed by the language reference. That is why
it is a bad idea to depend on `__del__` for important resource management.
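(An aside for modern readers: Python 3.4+ provides weakref.finalize, which is more dependable than __del__; finalizers that are still pending when the interpreter exits are run via atexit. A minimal sketch:)

```python
import weakref

freed = []

class Handle:
    """Stands in for an object owning an external resource."""

h = Handle()
weakref.finalize(h, freed.append, True)  # cleanup callback, no __del__ needed
del h                                    # CPython collects h immediately here
print(freed)                             # -> [True]
```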

Ciao,
Marc 'BlackJack' Rintsch
 
Stefan Bellon

Because that is not guaranteed by the language reference. That is why
it is a bad idea to depend on `__del__` for important resource
management.

Ok, but then we are back to my initial question of whether the destroy
of

def get_data(obj):
    iter = make_iter(obj)
    while more(iter):
        yield next(iter)
    destroy(iter)

can be guaranteed somehow in Python 2.4 while it can be done in Python
2.5 like follows:

def get_data(obj):
    iter = make_iter(obj)
    try:
        while more(iter):
            yield next(iter)
    finally:
        destroy(iter)

And then the answer is: no, it cannot be done in 2.4 (as Alex
Martelli already said ;-)
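For completeness, one 2.4-compatible compromise (a sketch with illustrative names, not from the thread): keep yield out of the picture entirely and drive the iteration from a plain function that takes a callback, so try...finally is legal and the cleanup is guaranteed:

```python
def each(iterable, callback, cleanup):
    """Apply callback to every item and always run cleanup afterwards.
    No yield appears inside the try, so this is valid even on Python 2.4."""
    try:
        for item in iterable:
            callback(item)
    finally:
        cleanup()

freed = []
result = []
each([1, 2, 3], result.append, lambda: freed.append(True))
print(result, freed)                # -> [1, 2, 3] [True]
```

The price is callback style at the call site, but the items are still processed one by one rather than materialized up front.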
 
Kay Schluehr

So why is the destructor not called when the generator is even
explicitly 'del'ed? Does somebody else still hold a reference on it?

You (we) have produced a reference cycle. In that case __del__
doesn't work properly (according to the docs). The cycle is caused
by the following assignment in your code:

self.gen = self._value()
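The cycle is easy to demonstrate: the generator's frame keeps self alive and self keeps the generator alive. A sketch in Python 3 (where, since 3.4 and PEP 442, the cycle collector can run __del__ in such cycles; on Python 2.4 these objects would instead sit uncollected in gc.garbage):

```python
import gc

deleted = []

class Wrapper:
    def __init__(self):
        self.gen = self._value()    # the generator frame references self: a cycle

    def _value(self):
        while True:
            yield None

    def __del__(self):
        deleted.append(True)

gc.disable()                        # make collection timing deterministic
w = Wrapper()
del w                               # refcounting alone cannot free the cycle
print(deleted)                      # -> []
gc.collect()                        # the cycle collector can (and runs __del__)
print(deleted)                      # -> [True]
gc.enable()
```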

So we have to refactor the solution to eliminate the cycle. I spent
some time to create a generic decorator solution and a protocol for
handling / releasing the resources.

1) Create a FinalizerGenerator class that is cycle-free (of course you
can still find a trick to shoot yourself in the foot). Pass a
generator function together with its arguments into the constructor.
An additional callable attribute close is used together with __del__.

import sys

class FinalizerGenerator(object):
    def __init__(self, gen, *args, **kwd):
        print "gen init"
        self._gen = gen(*args, **kwd)  # no cycle: only the generator function
                                       # and its arguments are passed in
        self.close = lambda: sys.stdout.write("gen del")  # default cleanup function

    def __del__(self):
        self.close()

    def __iter__(self):
        print "gen iter"
        return self

    def next(self):
        print "gen next"
        return self._gen.next()

2) Define generators so that they yield the resource to be destroyed
as their first value.

def producer(resource):
    print "gen value"
    yield resource  # yield the resource before starting iteration
    for item in resource:
        yield item

3) Define the destructor for the resource. The resource must be passed
as the first argument. The return value is a callable without arguments
that serves as the close() function within the
FinalizerGenerator.__del__ method.

def close_resource(resource):
    return lambda: sys.stdout.write("close resource: %s" % resource)

4) The finalize_with decorator

def finalize_with(close_resource):
    def closing(func_gen):
        def fn(*args, **kwd):
            # fg serves as a replacement for the original generator function
            fg = FinalizerGenerator(func_gen, *args, **kwd)
            # apply the closing protocol: the first value yielded is the resource
            resource = fg.next()
            fg.close = close_resource(resource)
            return fg
        fn.__name__ = func_gen.__name__
        return fn
    return closing


5) Decorate the generator and check it out:

@finalize_with(close_resource)
def producer(resource):
    print "gen value"
    yield resource
    for item in resource:
        yield item

def test():
    pg = producer([1,2,3])
    pg.next()

Running test() prints:

gen init
gen next                    # request for the resource in finalize_with
gen value
gen next
close resource: [1, 2, 3]   # yep
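For modern readers, the whole protocol above collapses into a few lines: since Python 2.5 the try/finally form from earlier in the thread is legal, and contextlib.closing gives the consumer-driven variant. A sketch:

```python
from contextlib import closing

freed = []

class Resource:
    """Stands in for the C iterator resource."""
    def __init__(self, items):
        self._items = list(items)
    def __iter__(self):
        return iter(self._items)
    def close(self):                # called by closing() on block exit
        freed.append(True)

with closing(Resource([1, 2, 3])) as res:
    consumed = [item for item in res]
print(consumed, freed)              # -> [1, 2, 3] [True]
```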
 
