subprocess and non-blocking IO (again)

Marc Carter · Oct 10, 2005

I am trying to rewrite a PERL automation which started a "monitoring"
application on many machines, via RSH, and then multiplexed their
collective outputs to stdout.

In production there are lots of these subprocesses, but here is a
simplified example of what I have so far (python n00b alert!)
- SNIP ---------
import subprocess,select,sys

speakers=[]
lProc=[]

for machine in ['box1','box2','box3']:
    # shell=True so the ';'-separated command string is run by a shell
    p = subprocess.Popen( ('echo '+machine+';sleep 2;echo goodbye;sleep 2;echo cruel;sleep 2;echo world'),
                          stdout=subprocess.PIPE, stderr=subprocess.STDOUT, stdin=None,
                          universal_newlines=True, shell=True )
    lProc.append( p )
    speakers.append( p.stdout )

while speakers:
    speaking = select.select( speakers, [], [], 1000 )[0]
    for speaker in speaking:
        speech = speaker.readlines()
        if speech:
            for sentence in speech:
                print sentence.rstrip('\n')
                sys.stdout.flush() # sanity check
        else: # EOF
            speakers.remove( speaker )
- SNIP ---------
The problem with the above is that the subprocess buffers all its output
when used like this and, hence, this automation is not informing me of
much

In PERL, "realtime" feedback was provided by setting the following:
$p->stdout->blocking(0);

How do I achieve this in Python?

This topic seems to have come up more than once. I am hoping that
things have moved on from posts like this:
http://groups.google.com/group/comp...b471009ab2?q=blocking&rnum=4#434fa9b471009ab2
as I don't really want to have to write all that ugly
fork/dup/fcntl/exec code to achieve this when high-level libraries like
"subprocess" really should have corresponding methods.

If it makes anything simpler, I only *need* this on Linux/Unix (Windows
would be a nice extra though).

thanks for reading,
Marc

Donn Cave · Oct 10, 2005

Marc Carter said:
The problem with the above is that the subprocess buffers all its output
when used like this and, hence, this automation is not informing me of
much

You're using C stdio, through the Python fileobject. This is
sort of subprocess' fault, for returning a fileobject in the
first place, but in any case you can expect your input to be
buffered. You're asking for it, because that's what C stdio does.
When you call readlines(), you're further guaranteeing that you
won't go on to the next statement until the fork dies and its
pipe closes, because that's what readlines() does -- returns
_all_ lines of output.

If you want to use select(), don't use the fileobject
functions. Use os.read() to read data from the pipe's file
descriptor (p.stdout.fileno().) This is how you avoid the
buffering.
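
A minimal sketch of the select() plus os.read() approach described above, assuming Python 3 syntax (the thread's own code is Python 2) and a Unix system, since select() does not accept pipes on Windows. The names `multiplex` and `chunk_size` are hypothetical, chosen for illustration; the line assembly is one possible way to handle reads that end in the middle of a line.

```python
import os
import select
import subprocess


def multiplex(commands, chunk_size=4096):
    """Run shell commands concurrently; yield (name, line) as output arrives.

    `commands` maps a label to a shell command string (illustrative API).
    """
    readers = {}   # fd -> (name, Popen object)
    pending = {}   # fd -> bytes of a not-yet-terminated trailing line
    for name, command in commands.items():
        proc = subprocess.Popen(command, shell=True,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT)
        fd = proc.stdout.fileno()
        readers[fd] = (name, proc)
        pending[fd] = b""

    while readers:
        ready, _, _ = select.select(list(readers), [], [])
        for fd in ready:
            name, proc = readers[fd]
            # os.read() returns whatever is available right now (up to
            # chunk_size bytes) and bypasses the file object's buffer.
            data = os.read(fd, chunk_size)
            if data:
                # Keep the last (possibly incomplete) piece for later.
                *lines, pending[fd] = (pending[fd] + data).split(b"\n")
                for line in lines:
                    yield name, line.decode(errors="replace")
            else:  # empty read means EOF on this pipe
                if pending[fd]:
                    yield name, pending[fd].decode(errors="replace")
                del readers[fd], pending[fd]
                proc.stdout.close()
                proc.wait()
```

It might be used as `for name, line in multiplex({'box1': 'echo box1; sleep 2; echo goodbye'}): print(name, line, flush=True)`. This only removes buffering on the reading side; a child program that buffers its own output when writing to a pipe will still deliver it in bursts.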

Marc Carter said:
This topic seems to have come up more than once. I am hoping that
things have moved on from posts like this:
http://groups.google.com/group/comp.lang.python/browse_thread/thread/5472ce95eb430002/434fa9b471009ab2?q=blocking&rnum=4#434fa9b471009ab2
as I don't really want to have to write all that ugly
fork/dup/fcntl/exec code to achieve this when high-level libraries like
"subprocess" really should have corresponding methods.

subprocess doesn't have pty functionality. It's hard to say
for sure who said what in that page, after the incredible mess
Google has made of their USENET archives, but I believe that's
why you see dup2 there - the author is using a pty library,
evidently pexpect. As far as I know, things have not moved on
in this respect, not sure what kind of movement you expected
to see in the intervening month. I don't think you need ptys,
though, so I wouldn't worry about it.

Donn Cave

Nick Craig-Wood · Oct 11, 2005

Marc Carter said:
The problem with the above is that the subprocess buffers all its output
when used like this and, hence, this automation is not informing me of
much

The problem with the above is that you are calling speaker.readlines()
which waits for all the output.

If you replace that with speaker.readline() or speaker.read(1) you'll
see that subprocess hasn't given you a buffered pipe after all!

In fact you'll get partial reads of each line - you'll have to wait
for a newline before processing the result, eg

import subprocess,select,sys

speakers=[]
lProc=[]

for machine in ['box1','box2','box3']:
    p = subprocess.Popen( ('echo '+machine+';sleep 2;echo goodbye;sleep 2;echo cruel;sleep 2;echo world'),
                          stdout=subprocess.PIPE, stderr=subprocess.STDOUT, stdin=None,
                          universal_newlines=True, shell=True)
    lProc.append( p )
    speakers.append( p.stdout )

while speakers:
    speaking = select.select( speakers, [], [], 1000 )[0]
    for speaker in speaking:
        speech = speaker.readline()
        if speech:
            for sentence in speech:
                print sentence.rstrip('\n')
                sys.stdout.flush() # sanity check
        else: # EOF
            speakers.remove( speaker )

gives

b
o
x
1

b
o
x
3

b
o
x
2

pause...

g
o
o
d
b
y
e

etc...

I'm not sure why readline only returns 1 character - the pipe returned
by subprocess really does seem to be only 1 character deep, which seems
a little inefficient! Changing bufsize in the Popen call doesn't seem
to affect it.
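
The one-character-per-line output shown above is consistent with the loop rather than with the pipe: readline() returns a single string, and `for sentence in speech` then iterates over that string one character at a time. A small check that needs no subprocess (Python 3 syntax; the variable names mirror the snippet above, and `printed` is a hypothetical name for illustration):

```python
speech = "box1\n"  # what readline() would return for the first line

# Iterating over a str yields its characters, so each "sentence"
# is one character; the trailing newline becomes an empty string.
printed = [sentence.rstrip("\n") for sentence in speech]
print(printed)  # ['b', 'o', 'x', '1', '']
```

Printing `speech.rstrip('\n')` directly, without the inner loop, would emit whole lines instead.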

Marc Carter · Oct 11, 2005

Donn said:
If you want to use select(), don't use the fileobject
functions. Use os.read() to read data from the pipe's file
descriptor (p.stdout.fileno().) This is how you avoid the
buffering.

Thank you, this works perfectly. I figured it would be something simple.

Marc
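
For readers looking for the closest counterpart to the Perl `blocking(0)` call from the first post: later Python versions (3.5 and up, on Unix) provide `os.set_blocking()`. The sketch below is an illustration under that assumption, not something from the thread; it uses a plain `os.pipe()` so it runs on its own, and the same call should apply to `p.stdout.fileno()`. The variable names are hypothetical.

```python
import os

read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)  # put the read end into non-blocking mode

try:
    os.read(read_fd, 1024)
    would_have_blocked = False
except BlockingIOError:
    # Nothing is available yet: a non-blocking read raises instead of waiting.
    would_have_blocked = True

os.write(write_fd, b"hello\n")
data = os.read(read_fd, 1024)  # data is available now, so this returns it

os.close(read_fd)
os.close(write_fd)
```

With select() already reporting which pipes are readable, non-blocking mode is optional; it mainly matters when reads are attempted without waiting for readiness first.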

Thomas Bellman · Oct 11, 2005

Marc Carter said:
The problem with the above is that the subprocess buffers all its output
when used like this and, hence, this automation is not informing me of
much

You may want to take a look at my asyncproc module. With it, you
can start subprocesses and let them run in the background without
blocking either the subprocess or your own process, while still
collecting their output.

You can download it from

http://www.lysator.liu.se/~bellman/download/asyncproc.py

I suspect that it doesn't work under MS Windows, but I don't use
that OS, and thus can't test it.
