Confusion with weakref, __del__ and threading

George Sakkis

I'm baffled by a situation that involves:
1) an instance of some class that defines __del__,
2) a thread that is created, started and referenced by that instance,
and
3) a weakref proxy to the instance that is passed to the thread
instead of 'self', to prevent a cyclic reference.

This probably sounds like gibberish, so here's a simplified example:

==========================================

import time
import weakref
import threading

num_main = num_other = 0
main_thread = threading.currentThread()


class Mystery(object):

    def __init__(self):
        proxy = weakref.proxy(self)
        self._thread = threading.Thread(target=target, args=(proxy,))
        self._thread.start()

    def __del__(self):
        global num_main, num_other
        if threading.currentThread() is main_thread:
            num_main += 1
        else:
            num_other += 1

    def sleep(self, t):
        time.sleep(t)


def target(proxy):
    try:
        proxy.sleep(0.01)
    except weakref.ReferenceError:
        pass


if __name__ == '__main__':
    for i in xrange(1000):
        Mystery()
    time.sleep(0.1)
    print '%d __del__ from main thread' % num_main
    print '%d __del__ from other threads' % num_other

==========================================

When I run it, I get around 950 __del__ calls from the main thread and
the rest from non-main threads. I discovered this accidentally when I
noticed some ignored AssertionErrors raised by a __del__ that was
calling "self._thread.join()" on the assumption that the current thread
is never self._thread; as it turns out, that's not always the case.

So what is happening in these ~50 minority cases? Is __del__
invoked through the proxy?

George
 
Rhamphoryncus

The trick here is that calling proxy.sleep(0.01) first gets a strong
reference to the Mystery instance, then holds that strong reference
until it returns.
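
A minimal single-threaded sketch of that point, assuming CPython's
reference counting (an object is finalized as soon as its last strong
reference goes away). The `events` list and the variable names are
only there for illustration, and the sketch is written so that it
should also run on Python 3:

```python
import weakref

events = []  # illustrative log of what happened, in order


class Mystery(object):

    def __del__(self):
        events.append('__del__')

    def sleep(self, t):
        events.append('sleep')


obj = Mystery()
proxy = weakref.proxy(obj)

# Attribute lookup through the proxy is forwarded to the real object,
# so the result is an ordinary bound method with a strong reference.
bound = proxy.sleep
holds_real_object = bound.__self__ is obj

del obj                          # drop the "main" strong reference
events.append('main reference dropped')

bound(0)                         # still works: bound keeps the instance alive
del bound                        # last strong reference gone, __del__ runs here
events.append('bound method dropped')

try:
    proxy.sleep(0)
    proxy_dead = False
except ReferenceError:
    proxy_dead = True
```

Whoever drops that temporary bound method last is the one that ends up
running __del__.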

If the child thread gets the GIL before __init__ returns it will enter
Mystery.sleep, then the main thread will return from Mystery.__init__
and release its strong reference, followed by the child thread
returning from Mystery.sleep, releasing its strong reference, and (as
it just released the last strong reference) calling Mystery.__del__.
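
The first ordering can be forced deterministically with two events
instead of relying on timing. This is only a sketch under the
assumption of CPython's reference counting; the `wait` method, the
`entered` and `release` events, and the `deleted_in` list are
hypothetical names introduced for the illustration:

```python
import threading
import weakref

deleted_in = []  # names of the threads in which __del__ ran


class Mystery(object):

    def __del__(self):
        deleted_in.append(threading.current_thread().name)

    def wait(self, entered, release):
        entered.set()    # tell the main thread we are inside the method
        release.wait()   # keep the strong reference until main lets go


def target(proxy, entered, release):
    try:
        proxy.wait(entered, release)
    except ReferenceError:
        pass


entered = threading.Event()
release = threading.Event()

obj = Mystery()
child = threading.Thread(target=target,
                         args=(weakref.proxy(obj), entered, release),
                         name='child')
child.start()

entered.wait()   # the child is now inside Mystery.wait
del obj          # main drops its strong reference first
release.set()    # the child returns and drops the last reference
child.join()
```

With this ordering the instance is finalized in the child thread, not
in the main thread.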

If the main thread returns from __init__ before the child thread gets
the GIL, it will release the only strong reference to the Mystery
instance, causing it to clear the weakref proxy and call __del__
before the child thread ever gets a chance. If you added counters to
the target function you should see them match the counters of the
__del__ function.

Incidentally, += 1 isn't atomic in Python: it is a separate read, add
and write, so updates made concurrently from several threads can be
missed.
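
One common way to avoid lost updates is to guard the counter with a
lock. The sketch below uses a hypothetical `LockedCounter` class and
plain worker threads; it deliberately does not take the lock inside
__del__, since a finalizer that runs in a thread already holding the
same non-reentrant lock would deadlock:

```python
import threading


class LockedCounter(object):
    """Counter whose increments are serialized by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def increment(self):
        with self._lock:       # read, add and write happen as one unit
            self.value += 1


def worker(counter, n):
    for _ in range(n):
        counter.increment()


counter = LockedCounter()
threads = [threading.Thread(target=worker, args=(counter, 10000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```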
 
George Sakkis

Ah, that was the missing part; I thought that nothing accessed
through a proxy created a strong reference. The good thing is
that it seems you can get a proxy to a bound method and then call it
without creating a strong reference to 'self':

==========================================

import time
import weakref
import threading

num_main = num_other = 0
main_thread = threading.currentThread()


class MysterySolved(object):

    def __init__(self):
        # proxy to the bound method instead of to the instance
        sleep = weakref.proxy(self.sleep)
        self._thread = threading.Thread(target=target, args=(sleep,))
        self._thread.start()

    def __del__(self):
        global num_main, num_other
        if threading.currentThread() is main_thread:
            num_main += 1
        else:
            num_other += 1

    def sleep(self, t):
        time.sleep(t)


def target(sleep):
    try:
        sleep(0.01)
    except weakref.ReferenceError:
        pass


if __name__ == '__main__':
    for i in xrange(1000):
        MysterySolved()
    time.sleep(.1)
    print '%d __del__ from main thread' % num_main
    print '%d __del__ from other threads' % num_other

==========================================
Output:
1000 __del__ from main thread
0 __del__ from other threads


Thanks a lot, I learned something new :)

George
 
Rhamphoryncus

That's not right. Of course a bound method has a strong reference to
self, otherwise you'd never be able to call it. There must be
something else going on here. Try using sys.setcheckinterval(1) to
make threads switch more often.
 
George Sakkis

I tried that and it still works; all objects die in the main thread.
Any other ideas for breaking it?

George
 
Rhamphoryncus

Nothing comes to mind.

However, none of this is guaranteed anyway, so you can't rely on it.
__del__ might be called by any thread at any point (even while you're
holding a lock or updating a data structure).
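
Given that, one option is to move the join() out of __del__ and into
an explicit method that the owner calls at a known point. The sketch
below is only an illustration of that idea; the `Worker` class and its
`close` method are hypothetical names, not something from the posted
code:

```python
import threading


class Worker(object):

    def __init__(self):
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def _run(self):
        pass  # placeholder for the real work

    def close(self):
        """Join the worker thread unless we are running inside it."""
        if threading.current_thread() is self._thread:
            return False       # a thread cannot join itself
        self._thread.join()
        return True


worker = Worker()
joined = worker.close()   # called explicitly, from a known thread
```

Because close() is called deliberately, the caller decides which
thread performs the join instead of leaving that to the finalizer.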
 
George Sakkis

Ok, you scared me enough to cut the link to the thread altogether,
avoiding the cyclic reference and the weakref proxy. The original
intent in keeping a reference to the thread was to be able to join()
it later, before some cleanup takes place. I have since found a less
spooky way to synchronize them, but regardless, I'd be interested in
an explanation of what's really going on in the posted snippets.

George
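
A likely explanation for the MysterySolved result, offered as a sketch
that assumes CPython's reference counting: self.sleep builds a fresh
bound-method object on every attribute access, and
weakref.proxy(self.sleep) holds only a weak reference to that
temporary, so the proxy is already dead by the time the thread calls
it. The thread would then always land in the except branch and never
touch the instance, which would account for all 1000 __del__ calls
happening in the main thread. The `outcome` variable below is just for
illustration:

```python
import weakref


class MysterySolved(object):

    def sleep(self, t):
        return 'slept'


obj = MysterySolved()

# Each attribute access creates a new bound-method object.
distinct_objects = obj.sleep is not obj.sleep

# The bound method passed here is a temporary: nothing else refers to
# it, so it is freed as soon as weakref.proxy() returns.
sleep = weakref.proxy(obj.sleep)

try:
    sleep(0)
    outcome = 'called'
except ReferenceError:
    outcome = 'ReferenceError'   # proxy is dead even though obj is alive
```

So the bound method does hold a strong reference to 'self', but only
for as long as the bound method itself exists.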
 
