Thread vs. generator problem

Paul Rubin

As I understand it, generators are supposed to run til they hit a
yield statement:

    import time
    def f():
        print 1
        time.sleep(3)
        for i in range(2,5):
            yield i

    for k in f():
        print k

prints "1" immediately, sleeps for 3 seconds, then prints 2, 3, and 4
without pausing, as expected. When I try to do it in a separate thread:

    import time, itertools
    def remote_iterate(iterator, cachesize=5):
        # run iterator in a separate thread and yield its values
        q = Queue.Queue(cachesize)
        def f():
            print 'thread started'
            for x in iterator:
                q.put(x)
        threading.Thread(target=f).start()
        while True:
            yield q.get()

    g = remote_iterate(itertools.count)
    print 'zzz...'
    time.sleep(3)
    print 'hi'
    for i in range(5):
        print g.next()

I'd expect to see 'thread started' immediately, then 'zzz...', then a 3
second pause, then 'hi', then the numbers 0..4. Instead, the thread
doesn't start until the 3 second pause has ended.

When I move the yield statement out of remote_iterate's body and
instead have it return a generator made in a new internal function, it
does what I expect:

    import time, itertools
    def remote_iterate(iterator, cachesize=5):
        # run iterator in a separate thread and yield its values
        q = Queue.Queue(cachesize)
        def f():
            print 'thread started'
            for x in iterator:
                q.put(x)
        threading.Thread(target=f).start()
        def g():
            while True:
                yield q.get()
        return g()

Any idea what's up? Is there some race condition, where the yield
statement freezes the generator before the new thread has started? Or
am I just overlooking something obvious?

Thanks.
 
Robert Kern

Paul said:
> As I understand it, generators are supposed to run til they hit a
> yield statement:
>
>     import time
>     def f():
>         print 1
>         time.sleep(3)
>         for i in range(2,5):
>             yield i
>
>     for k in f():
>         print k
>
> prints "1" immediately, sleeps for 3 seconds, then prints 2, 3, and 4
> without pausing, as expected. When I try to do it in a separate thread:
>
>     import time, itertools
>     def remote_iterate(iterator, cachesize=5):
>         # run iterator in a separate thread and yield its values
>         q = Queue.Queue(cachesize)
>         def f():
>             print 'thread started'
>             for x in iterator:
>                 q.put(x)
>         threading.Thread(target=f).start()
>         while True:
>             yield q.get()
>
>     g = remote_iterate(itertools.count)
>     print 'zzz...'
>     time.sleep(3)
>     print 'hi'
>     for i in range(5):
>         print g.next()
>
> I'd expect to see 'thread started' immediately, then 'zzz...', then a 3
> second pause, then 'hi', then the numbers 0..4. Instead, the thread
> doesn't start until the 3 second pause has ended.

My 10-second analysis is that *none* of the body of a generator runs until a
value is requested from it.

In [3]: def f():
   ...:     print 'Starting f'
   ...:     for i in range(3):
   ...:         yield i
   ...:
   ...:

In [4]: g = f()

In [5]: for i in g:
   ...:     print i
   ...:
   ...:
Starting f
0
1
2

In your first example, you instantiate the generator and then iterate over it
immediately; in your second, you separate the two things. I don't think threads
have anything to do with it.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
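
Editor's sketch: Robert's point can be checked by recording events in a list, so the ordering is visible without relying on interleaved console output. This is a Python 3 rendition (the thread itself uses Python 2), and the `events` list is a hypothetical helper that is not part of the original posts.

```python
events = []  # hypothetical log, added only to make the ordering observable

def f():
    events.append("body started")
    for i in range(3):
        yield i

g = f()                               # creates the generator object; the body has not run
events.append("generator created")
first = next(g)                       # first request: the body runs up to the first yield
events.append("got %d" % first)

print(events)  # ['generator created', 'body started', 'got 0']
```

Note that "generator created" is logged before "body started", even though `f()` was called first.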
 
Tim Peters

[Paul Rubin]
> ...
> When I try to do it in a separate thread:
>
>     import time, itertools
>     def remote_iterate(iterator, cachesize=5):
>         # run iterator in a separate thread and yield its values
>         q = Queue.Queue(cachesize)
>         def f():
>             print 'thread started'
>             for x in iterator:
>                 q.put(x)
>         threading.Thread(target=f).start()
>         while True:
>             yield q.get()
>
>     g = remote_iterate(itertools.count)

You didn't run this code, right? itertools.count() was intended. In
any case, as when calling any generator, nothing in the body of
remote_iterate() is executed until the generator-iterator's next()
method is invoked. Nothing here does that. So, in particular, the
following is the _only_ line that can execute next:

>     print 'zzz...'

And then this line:

>     time.sleep(3)

And then this:

>     print 'hi'

And then this:

>     for i in range(5):

And then the first time you execute this line is the first time any
code in the body of remote_iterate() runs:

>         print g.next()

> I'd expect to see 'thread started' immediately, then 'zzz...', then a 3
> second pause, then 'hi', then the numbers 0..4. Instead, the thread
> doesn't start until the 3 second pause has ended.

That's all as it must be.

> When I move the yield statement out of remote_iterate's body and
> instead have it return a generator made in a new internal function, it
> does what I expect:
>
>     import time, itertools
>     def remote_iterate(iterator, cachesize=5):

Note that remote_iterate() is no longer a generator, so its body is
executed as soon as it's called.

>         # run iterator in a separate thread and yield its values
>         q = Queue.Queue(cachesize)
>         def f():
>             print 'thread started'
>             for x in iterator:
>                 q.put(x)
>         threading.Thread(target=f).start()

And so the thread starts when remote_iterate() is called.

>         def g():
>             while True:
>                 yield q.get()
>         return g()
>
> Any idea what's up? Is there some race condition, where the yield
> statement freezes the generator before the new thread has started?

No.

> Or am I just overlooking something obvious?

No, but it's not notably subtle either ;-)
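
Editor's sketch: the pattern Tim describes (a plain function that does the setup at call time and returns an inner generator) might look like this in Python 3, where the `Queue` module is named `queue`. The `started` event and the tuple return value are assumptions added here only so the early thread start can be observed; Paul's version returns just the generator. The thread is marked as a daemon so that the producer, which blocks once the bounded queue is full, does not keep the interpreter alive.

```python
import itertools
import queue
import threading

def remote_iterate(iterable, cachesize=5):
    """Plain function: the producer thread starts when this is called."""
    q = queue.Queue(cachesize)
    started = threading.Event()   # hypothetical flag, for observation only

    def producer():
        started.set()
        for x in iterable:
            q.put(x)              # blocks while the queue is full

    threading.Thread(target=producer, daemon=True).start()

    def consumer():
        # Only this inner function is a generator; its body is deferred.
        while True:
            yield q.get()

    return consumer(), started

g, started = remote_iterate(itertools.count())
started.wait(timeout=5)           # the thread is running before any next(g)
print([next(g) for _ in range(5)])  # [0, 1, 2, 3, 4]
```

Because `remote_iterate` contains no `yield` of its own, calling it runs the setup immediately; only `consumer` waits for the first `next()`.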
 
Paul Rubin

Tim Peters said:
> You didn't run this code, right? itertools.count() was intended.

Sorry, I made a cut-and-paste error posting the message. My test case
did use itertools.count().

> In any case, as when calling any generator, nothing in the body of
> remote_iterate() is executed until the generator-iterator's next()
> method is invoked.

Ooof, I see what happened now. My first test case was misleading and
made me think that the generator immediately executed until it reached
a yield statement. In fact it was the generator function (i.e. the
thing that made the generator), a separate object, that printed the
messages.

Thanks!
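
Editor's sketch: the distinction between the generator function and the generator object it returns can be inspected directly in Python 3 with the standard `inspect` module. This is an illustration added here, not code from the thread; `f` and `g` are placeholder names.

```python
import inspect

def f():
    yield 1

g = f()  # calling the generator function only builds the generator object

print(inspect.isgeneratorfunction(f))  # True: f is the generator function
print(inspect.isgenerator(g))          # True: g is the generator object

state_before = inspect.getgeneratorstate(g)  # 'GEN_CREATED': body not started
next(g)                                      # runs the body up to the first yield
state_after = inspect.getgeneratorstate(g)   # 'GEN_SUSPENDED': paused at a yield
print(state_before, state_after)
```

The `GEN_CREATED` state is the one `remote_iterate` stayed in during the 3-second sleep in the original test.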
 
