Producer/consumer Queue "trick"

E

Evan Simpson

WEBoggle needs a new game board every three minutes. Boards take an
unpredictable (much less than 3min, but non-trivial) amount of time to
generate. The system is driven by web requests, and I don't want the
request that happens to trigger the need for the new board to have to
pay the time cost of generating it.

I set up a producer thread that does nothing but generate boards and put
them into a length-two Queue (blocking). At the rate that boards are
pulled from the Queue, it's almost always full, but my poor consumer
thread was still being blocked for "a long time" each time it fetched a
board.

At this point I realized that q.get() on a full Queue immediately wakes
up the producer, which has been blocked waiting to add a board to the
Queue. It sets about generating the next board, and the consumer
doesn't get to run again until the producer blocks again or is preempted.

The solution was simple: have the producer time.sleep(0.001) when
q.put(board) returns.

Cheers,

Evan @ 4-am
 
J

Jon Perez

I don't get it.

If the consumer and the producer are separate threads,
why does the consumer thread block when the producer
thread is generating a new board? Or why does it
take forever for the producer thread to be pre-empted?

Also, I don't understand why the solution works.
How does sleeping for .001 seconds right after putting
a new board on the queue solve the problem?
 
P

Paul Rubin

Evan Simpson said:
wakes up the producer, which has been blocked waiting to add a board
to the Queue. It sets about generating the next board, and the
consumer doesn't get to run again until the producer blocks again or
is preempted.

That's weird. Preemption should happen every few dozen milliseconds
unless you've purposely increased the preemption delay.
 
J

Jeremy Bowers

WEBoggle needs a new game board every three minutes. Boards take an
unpredictable (much less than 3min, but non-trivial) amount of time to
generate.

I gotta ask, why?

Looking over your information about "how to play", my only guess is that
you're generating all possible words that may exist in the board, at the
time of board generation.

But it also looks like you score once at the end (as you have to anyhow in
order to cancel out words found by multiple people, according to the rules
of Boggle).

If both of these statements are true, the generation of all possible words
is a waste of time. For each word given by a human, looking it up in the
dict and verifying it is on the board at score time should be a lot
faster, and not need any pre-processing.

If you're generating stats about what percentage of possible words were
found, that can also be done during game play without loss by handing
outh the board, and *then* finding all words. You still have a threading
problem, but now instead of dealing with human response times, you've got
a three minute deadline which ought to be enough.

(The other thing I can think of is that you are trying to verify that the
board contains some minimal number of words, in which case I submit that
boards with only 20-ish words is just part of the game :) I've never sat
down and really studied the Boggle dice, but I've always expected/hoped
that there is at least one or two dice with all vowels; even so the odds
of no vowels are small and easily algorithmically discarded. )

To be clear, I'm mostly curious what's going on, but it is possible that
the fundamental problem may be algorithmic, and since I don't know what's
going on it's worth checking.

Also, entirely separate plug, you may be interested in my XBLinJS project:
http://www.jerf.org/resources/xblinjs . (I'm expecting to release 0.2
either today or tomorrow, which will have vastly more documentation and
more online examples.) It'd help pull out relevant code so that it is
easily re-usable in any future gaming projects by others.
 
N

Nick Coghlan

Paul said:
That's weird. Preemption should happen every few dozen milliseconds
unless you've purposely increased the preemption delay.

To me, it smells like a call into a C extension which isn't releasing the GIL
before starting a time-consuming operation. After getting bitten by this a
couple of times, I now make sure to release the GIL as part of my SWIG wrapper
(since the code I'm wrapping knows nothing of Python, and sure as heck doesn't
need to be holding the GIL!).

Cheers,
Nick.
 
E

Evan Simpson

I should clarify up front that I may have given an overblown sense of
how long the producer thread typically takes to generate a board; It's
usually a few tenths of a second, up to a few seconds for especially
fecund boards.

My concern was that even a few seconds is long enough for fifty requests
to get piled up, and I was experiencing mysterious breakdowns where
Apache was suddenly totally clogged up and taking *minutes* to respond.

Jeremy said:
Looking over your information about "how to play", my only guess is that
you're generating all possible words that may exist in the board, at the
time of board generation.

Yep. I do this in order to minimize the cost of the most common request
that WEBoggle handles, which is checking a submitted word to see whether
it is a valid word on the board, a valid word not on the board, or an
invalid word. With the current code, this averages 1ms.
But it also looks like you score once at the end (as you have to anyhow in
order to cancel out words found by multiple people, according to the rules
of Boggle).

WEBoggle is a little different than regular Boggle, in that your score
is the plain sum of the scores for all of the words that you found, with
no cancellation. I'm planning to add a "vs." feature eventually that
will involve cancellation, but even then I'll retain the immediate
feedback upon guessing a word.

In addition, many players enjoy seeing the list of "Words not found by
anyone".
(The other thing I can think of is that you are trying to verify that the
board contains some minimal number of words, in which case I submit that
boards with only 20-ish words is just part of the game :) I've never sat
down and really studied the Boggle dice, but I've always expected/hoped
that there is at least one or two dice with all vowels; even so the odds
of no vowels are small and easily algorithmically discarded. )

Believe it or not, before I added code to filter them out, I was
generating enough boards with *zero* valid words on them to get complaints.
Also, entirely separate plug, you may be interested in my XBLinJS project

Very nifty! Well beyond my current needs, but good to know about.

Cheers,

Evan @ 4-am
 
E

Evan Simpson

Jon said:
If the consumer and the producer are separate threads,
why does the consumer thread block when the producer
thread is generating a new board? Or why does it
take forever for the producer thread to be pre-empted?

Also, I don't understand why the solution works.
How does sleeping for .001 seconds right after putting
a new board on the queue solve the problem?

I'm guessing at how things work, and may be totally wrong, but here's
what I think happens: In the Queue get() code, the consumer releases
the 'fsema' lock. Directly or indirectly, this wakes up and hands
control to the producer thread, which was blocked trying to acquire
'fsema'. The sleep() hands control back to the scheduler immediately,
which appears to wake up the consumer and let it get on with things.

It doesn't take "forever" for the producer to be preempted, just the
normal preemption interval. I was bothered, though, by the idea of
having it take even a few dozen milliseconds out of the middle of a
request that's at most a millisecond or two away from finishing anyway.

Cheers,

Evan @ 4-am
 
E

Evan Simpson

Nick said:
To me, it smells like a call into a C extension which isn't releasing
the GIL before starting a time-consuming operation.

Sorry, I think I misled you both -- "a long time" is actually the few
dozen milliseconds of a normal preemption, which is only long relative
to the single millisecond that the consumer typically takes to handle a
request.

Cheers,

Evan @ 4-am
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,995
Messages
2,570,230
Members
46,819
Latest member
masterdaster

Latest Threads

Top