Anand said:
This was exactly what my problem was. The producer thread was
pushing data onto the queue faster than the consumer could
keep up with.
Since I am using Python threads, it often happens that the queue
grows to a huge size before a context switch to the consumer
thread occurs, causing memory problems.
I have not gone through the code of Queue.py, but I think that
the queue.get() method is an O(n) operation, considering the
ratio of 'put's to 'get's.
As Peter already remarked, the source of your problem is not
the get() operation.
A related question here is... How do you avoid producer thread
deadlocking when the consumer queue is full?
By using a separate thread for the consumer(s), as you indicate
below.
If the question is not clear: my queue is of size 'n'. It is already
full with 'n' pieces of data. All of my 'n' producer threads
are trying to push data into it at the *same* time, and there is no
consumer thread popping the data off (in my program, the consumers
are the producers themselves, a bad design I agree...).
That's probably the source of your problem.
This is clearly a deadlock situation, since no thread can ever make
progress. I solved it by creating a new consumer thread to pop
off some data when deadlock occurs. But sometimes the deadlock
occurs so often that many extra threads get created this way.
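One way to avoid spawning rescue threads is to make the producers give up instead of blocking forever. The sketch below is a minimal illustration of that idea (the queue size and timeout are arbitrary choices, not from the original post): a `put()` with a timeout raises `queue.Full` when the queue stays full, so a producer can detect the stuck state rather than deadlock.

```python
import queue

q = queue.Queue(maxsize=10)  # bounded queue, as in the scenario above

def produce(item):
    """Try to enqueue; give up instead of blocking forever.

    With a plain q.put(item), every producer would block as soon as
    the queue held 10 items with no consumer draining it.
    """
    try:
        q.put(item, timeout=1.0)  # raises queue.Full after 1 second
        return True
    except queue.Full:
        return False  # caller can retry, drop the item, or start a consumer

# Fill the queue, then observe that the next put times out.
for i in range(10):
    assert produce(i)
assert not produce(10)  # queue full, no consumer: put() times out
```

What the producer does on failure (retry, drop, log) is then an explicit policy decision instead of a silent hang.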
The other approach is to separate out the consumer and producer thread
roles so that there is always one consumer thread somewhere
popping data off the queue, which is what I have done now.
Anyway, how do you make sure that deadlock situations do not occur?
When there is a cycle between the producers and the consumers, i.e.
back from the consumers to the producers, you may need one queue
(e.g. the backwards queue) of infinite size to prevent deadlock.
However, you should make sure that no thread accesses a queue
in the wrong direction, because that might still cause deadlock.
In case you have a fixed ratio between the amounts of things
on the various queues, you might consider not using queues at all
for these fixed ratios.
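A minimal sketch of that advice, under assumed names and sizes (the `item + 1` work and the queue bounds are illustrative): the forward queue is bounded, while the backwards queue is created with the default `maxsize=0`, i.e. unbounded, so the consumer's `put()` back towards the producer can never block and the cycle cannot deadlock.

```python
import queue
import threading

forward = queue.Queue(maxsize=2)   # bounded: limits memory use
backward = queue.Queue()           # maxsize=0 means unbounded: the
                                   # consumer's put() back to the
                                   # producer never blocks, so the
                                   # cycle cannot deadlock

def consumer():
    while True:
        item = forward.get()
        if item is None:           # sentinel: stop
            break
        backward.put(item + 1)     # result flows back without blocking

threading.Thread(target=consumer, daemon=True).start()

# The producer side feeds work forward and collects results from the
# backward queue; only the forward put() can block, and the consumer
# is always able to drain it.
results = []
for i in range(10):
    forward.put(i)
for _ in range(10):
    results.append(backward.get())
forward.put(None)
```

If the forward queue were the unbounded one instead, memory could blow up again, so the unbounded direction should be the one that closes the cycle.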
Can you write some Event()-based mechanisms to make sure that the
threads always access the queue at different times? Since Python
offers no control over thread priority or mechanisms to suspend a
thread, how else could this be done?
A Queue already provides synchronized access, so you don't need
anything else.
You can make reasonably sure that work is going on everywhere by keeping the
queues small enough. This forces more thread switching, though.
That brings me to a favorite topic: implementing a thread wrapper
over the existing Python threading API to allow suspending threads
and modifying thread priority. Has this been considered as a PEP
before?
Thread suspension is (very) difficult to implement correctly.
Thread priorities are useful, e.g. when you have background work going
on and you want some short interactive pieces of work done at normal
speed. Also, if you have a series of threads and queues, it can
be helpful to give the first thread a higher priority when it also has
to wait for external events.
I don't know whether there was a PEP for any of this.
Kind regards,
Ype