Michael Bayer
Hi -
I was just going through this thread:
http://mail.python.org/pipermail/python-list/2006-April/336948.html ,
where it is suggested that the Lock instance used by Queue.Queue
should be publicly configurable. I have identified another situation
where a Queue can be deadlocked, one which is also alleviated by
configuring the type of Lock used by the Queue (or just changing it
to an RLock).
The scenario arises when the Queue is operated upon inside an
object's __del__ method. Since __del__ can be invoked at somewhat
unpredictable times, I have observed that it is possible, in rare
circumstances, for put() to be called while the same thread is still
inside get() (or vice versa). Because both methods acquire the same
underlying mutex, which is a non-reentrant threading.Lock, the thread
blocks waiting for a lock it already holds, and a deadlock occurs.
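To make the failure mode concrete, here is a minimal sketch (not the
pool code, just an illustration of Lock vs. RLock reentrancy) showing
why a second acquire by the same thread never returns on a plain Lock
but is fine on an RLock:

import threading

lock = threading.Lock()
lock.acquire()              # the thread is inside get() and holds the Queue's mutex
print lock.acquire(False)   # a second acquire by the same thread fails (prints a
                            # false value); the blocking acquire that put() performs
                            # would simply hang forever

rlock = threading.RLock()
rlock.acquire()
print rlock.acquire(False)  # succeeds: an RLock may be re-acquired by the thread
                            # that already owns it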
The issue can be fixed by substituting a threading.RLock for the
threading.Lock object that Queue instantiates by default.
The scenario arose in a database connection pool, which keeps
connections in a Queue, hands them out via get() inside a wrapper
object, and has the wrapper automatically return the connection to
the Queue via put() in its __del__ method (an explicit close() method
is available as well). While I can't reproduce it locally, one of my
users experiences it regularly. I had him install the "threadframe"
module to trace it out, and it reveals that all threads are hung
inside Queue waiting to acquire the "not_empty" and "not_full"
Condition objects, and the offending stack trace looks like this:
File "build/bdist.linux-i686/egg/sqlalchemy/pool.py", line 84, in
connect
File "build/bdist.linux-i686/egg/sqlalchemy/pool.py", line 130, in
__init__
File "build/bdist.linux-i686/egg/sqlalchemy/pool.py", line 102, in
get
File "build/bdist.linux-i686/egg/sqlalchemy/pool.py", line 226, in
do_get
File "/usr/lib/python2.4/Queue.py", line 116, in get
raise Empty
File "build/bdist.linux-i686/egg/sqlalchemy/pool.py", line 157, in
__del__
File "build/bdist.linux-i686/egg/sqlalchemy/pool.py", line 163, in
_close
File "build/bdist.linux-i686/egg/sqlalchemy/pool.py", line 99, in
return_conn
File "build/bdist.linux-i686/egg/sqlalchemy/pool.py", line 216, in
do_return_conn
File "/usr/lib/python2.4/Queue.py", line 71, in put
self.not_full.acquire()
This is a simplified version of the logic; the actual version is the
pool.py module in the SQLAlchemy package:
import Queue

pool = Queue.Queue(maxsize=10)

class ConnectionWrapper(object):
    def __init__(self, connection):
        self.connection = connection

    def __del__(self):
        # automatically return the underlying connection to the pool
        # when the wrapper is garbage collected
        pool.put_nowait(self.connection)

# fill up the pool with 10 connections ("database" stands in for the
# actual DBAPI module)
for x in range(10):
    pool.put_nowait(database.connect())

def connect():
    return ConnectionWrapper(pool.get())
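The workaround I am applying right now looks roughly like the sketch
below (make_reentrant is just an illustrative helper; it reaches into
the Queue's internal mutex/not_empty/not_full attributes, the same
ones visible in the traceback above, so it would need adjusting if
those internals change):

import Queue
import threading

def make_reentrant(q):
    # swap the Queue's non-reentrant mutex for an RLock and rebuild
    # the Condition objects that share it; attribute names match the
    # Python 2.4 Queue.py shown in the traceback
    q.mutex = threading.RLock()
    q.not_empty = threading.Condition(q.mutex)
    q.not_full = threading.Condition(q.mutex)
    return q

pool = make_reentrant(Queue.Queue(maxsize=10))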
At the moment I am modifying the Queue's mutex to be a
threading.RLock along those lines to fix the problem; what does the
community think of either making the Queue's Lock instance public or
changing it to an RLock?
- mike