subprocess.Popen pipeline bug?

Marko Rauhamaa

This tiny program hangs:

========================================================================
#!/usr/bin/env python
import subprocess
a = subprocess.Popen('cat',shell = True,stdin = subprocess.PIPE,
stdout = subprocess.PIPE)
b = subprocess.Popen('cat >/dev/null',shell = True,stdin = a.stdout)
a.stdin.close()
b.wait() # hangs
a.wait() # never reached
========================================================================

It shouldn't, should it?

Environment:
========================================================================
Python 2.5.1 (r251:54863, Jun 20 2007, 12:14:09)
[GCC 4.1.2 20061115 (prerelease) (SUSE Linux)] on linux2
========================================================================


Marko
 
bryanjugglercryptographer

Marko said:
This tiny program hangs:

========================================================================
#!/usr/bin/env python
import subprocess
a = subprocess.Popen('cat',shell = True,stdin = subprocess.PIPE,
stdout = subprocess.PIPE)
b = subprocess.Popen('cat >/dev/null',shell = True,stdin = a.stdout)
a.stdin.close()
b.wait() # hangs
a.wait() # never reached
========================================================================

To make it work, pass close_fds=True to the Popen call that creates b.

Marko said:
It shouldn't, should it?

Not sure. I think what's happening is that the second cat subprocess
never gets EOF on its stdin, because there are still processes with
an open file descriptor for the other end of the pipe.

The Python program closes a.stdin, and let's suppose that's file
descriptor 4. That's not enough, because the subshell started for b
and the cat process it runs both inherited the open file descriptor 4
when they forked off.

It looks like Popen is smart enough to close the extraneous
descriptors for pipes it created in the same Popen call, but that
one was created in a previous call and passed in.
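
The inheritance effect described above can be reproduced without subprocess at all. What follows is only a minimal sketch, written for Python 3 on a POSIX system rather than the 2.5.1 from the original post. The helper names saw_eof and demo, and the second "go" pipe used to tell the child when to exit, are made up for the illustration.

```python
import os
import select


def saw_eof(fd, timeout=0.2):
    """Return True if fd is at EOF, False if a read would still block."""
    ready, _, _ = select.select([fd], [], [], timeout)
    return bool(ready) and os.read(fd, 1) == b""


def demo():
    r, w = os.pipe()          # the pipe whose EOF we care about
    go_r, go_w = os.pipe()    # illustration only: tells the child when to exit

    pid = os.fork()
    if pid == 0:
        # Child: it inherited a copy of w and simply holds it open.
        os.close(go_w)
        os.read(go_r, 1)      # returns once the parent closes go_w
        os._exit(0)

    os.close(go_r)
    os.close(w)               # the parent's close, but not the "last" close
    before = saw_eof(r)       # the child still holds w, so no EOF yet

    os.close(go_w)            # let the child exit, dropping its copy of w
    os.waitpid(pid, 0)
    after = saw_eof(r)        # every write end is gone now
    os.close(r)
    return before, after


print(demo())
```

On a POSIX system this prints (False, True): the reader sees EOF only after the forked child's copy of the write end is gone, which is the same situation the first cat is stuck in.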
 
Douglas Wells

Marko said:
This tiny program hangs:

========================================================================
#!/usr/bin/env python
import subprocess
a = subprocess.Popen('cat',shell = True,stdin = subprocess.PIPE,
stdout = subprocess.PIPE)
b = subprocess.Popen('cat >/dev/null',shell = True,stdin = a.stdout)
a.stdin.close()
b.wait() # hangs
a.wait() # never reached
========================================================================

It shouldn't, should it?

Yes, it should. The hang is the expected behavior.

This issue is related to the subtleties of creating a pipeline in
POSIX environments. The problem is that the cat command in
subprocess a never completes because it never encounters an EOF
on its stdin. Even though you issue a close call (a.stdin.close()),
you're not issuing the "last" close, because at least one copy of
that file descriptor is still open in subprocess tree b. The
descriptor was open in the parent when the subprocess module executed
the POSIX fork call for b, so it was duplicated into the child.

I don't see any clean and simple way to actually fix this. (That's
one of the reasons why POSIX shells are so complicated.) There
are a couple of work-arounds that you can use:

1) Force close-on-exec on the specific file descriptor:

import subprocess
a = subprocess.Popen('cat',shell = True,stdin = subprocess.PIPE,
stdout = subprocess.PIPE)
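# ********* beginning of changes
import fcntl
# Mark the write end of a's stdin pipe close-on-exec, so that the
# processes started for b do not keep a copy of it open.
fd = a.stdin.fileno()
old = fcntl.fcntl(fd, fcntl.F_GETFD)
fcntl.fcntl(fd, fcntl.F_SETFD, old | fcntl.FD_CLOEXEC)
# ********* end of changes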
b = subprocess.Popen('cat >/dev/null',shell = True,stdin = a.stdout)
a.stdin.close()
b.wait()
a.wait()

Or, if it does not cause undesired side effects for you:

2) Close *all* non-standard file descriptors (everything except 0, 1
and 2) in the child before the exec by using the close_fds argument
to Popen:

import subprocess
a = subprocess.Popen('cat',shell = True,stdin = subprocess.PIPE,
stdout = subprocess.PIPE)
# ********* beginning of changes
# b = subprocess.Popen('cat >/dev/null',shell = True,stdin = a.stdout)
b = subprocess.Popen('cat >/dev/null',shell = True,stdin = a.stdout,
close_fds = True)
# ********* end of changes
a.stdin.close()
b.wait()
a.wait()
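
For anyone who wants to check that workaround 2 really lets the pipeline drain, here is a rough sketch wrapped in a function. It is written for Python 3 and differs from the program above in a few ways that are my own choices: the name run_pipeline is made up, the commands run without shell=True, b writes to a pipe instead of /dev/null so the result can be inspected, and a.stdout is closed in the parent because only b reads from it. On Python 3.2 and later close_fds already defaults to True on POSIX, so passing it is redundant there; on 2.5 it is what makes the difference.

```python
import subprocess


def run_pipeline(data):
    """Send data through `cat | cat`; return (output, a_status, b_status).

    Only meant for small inputs: everything is written to a before
    anything is read back from b.
    """
    a = subprocess.Popen(["cat"], stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE)
    # close_fds=True keeps b from inheriting the parent's copy of a.stdin.
    b = subprocess.Popen(["cat"], stdin=a.stdout, stdout=subprocess.PIPE,
                         close_fds=True)
    a.stdout.close()      # the parent does not read from a; only b does
    a.stdin.write(data)
    a.stdin.close()       # now the last close, so a's cat sees EOF
    out = b.stdout.read() # returns once b's cat has seen EOF and exited
    b.stdout.close()
    return out, a.wait(), b.wait()
```

Calling run_pipeline(b"hello\n") should return the same bytes together with two zero exit statuses, and neither wait() call should block.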

Good luck.

- dmw
 
Marko Rauhamaa

bryanjugglercryptographer said:
Not sure. I think what's happening is that the second cat subprocess
never gets EOF on its stdin, because there are still processes with an
open file descriptor for the other end of the pipe.

You are right. However, the close_fds technique seems a bit
heavy-handed. Well, that's what you get when you try to combine fork and
exec into a single call.


Marko
 
