New line conversion with Popen attached to a pty

jfharden

Hi,

Sorry if this appears twice, I sent it to the mailing list earlier and the mail seems to have been swallowed by the black hole of email vagaries.

We have a class which executes external processes in a controlled environment and does "things" specified by the client program with each line of output. To do this we have been attaching stdout from the subprocess.Popen to a pseudo terminal (pty) made with pty.openpty and opened with os.fdopen. I noticed that we kept getting a bunch of extra newline characters.

This is all using Python 2.6.4 in a CentOS 6 environment.

After some investigation I realised we needed to use universal_newlines support, so I enabled it for the Popen and specified the mode in the fdopen to be rU. Things still seemed to be coming out wrong, so I wrote up a test program boiling it down to the simplest cases (which is at the end of this message). The output I was testing was this:

Fake\r\nData\r\n
as seen through hexdump -C:
hexdump -C output.txt
00000000 46 61 6b 65 0d 0a 44 61 74 61 0d 0a |Fake..Data..|
0000000c

Now if I do a simple subprocess.Popen and set the stdout to subprocess.PIPE, then do p.stdout.read() I get the correct output of

Fake\nData\n

When I do the Popen attached to a pty I end up with

Fake\n\nData\n\n

Does anyone know why the newline conversion would be incorrect, and what I could do to fix it? In fact, if anyone even has any pointers to where this might be going wrong I'd be very grateful. I've done hours of fiddling with this and googling to no avail.

One liner to generate the test data:

python -c 'f = open("output.txt", "w"); f.write("Fake\r\nData\r\n"); f.close()'

Test script:

#!/usr/bin/env python2.6.4
import os
import pty
import subprocess
import select
import fcntl

class TestRead(object):

    def __init__(self):
        super(TestRead, self).__init__()
        self.outputPipe()
        self.outputPty()

    def outputPipe(self):
        p1 = subprocess.Popen(
            ("/bin/cat", "output.txt"),
            stdout=subprocess.PIPE,
            universal_newlines=True
        )
        print "1: %r" % p1.stdout.read()

    def outputPty(self):
        outMaster, outSlave = pty.openpty()
        fcntl.fcntl(outMaster, fcntl.F_SETFL, os.O_NONBLOCK)

        p2 = subprocess.Popen(
            ("/bin/cat", "output.txt"),
            stdout=outSlave,
            universal_newlines=True
        )

        with os.fdopen(outMaster, 'rU') as pty_stdout:
            while True:
                try:
                    rfds, _, _ = select.select([pty_stdout], [], [], 0.1)
                    break
                except select.error:
                    continue

            for fd in rfds:
                buf = pty_stdout.read()
                print "2: %r" % buf

if __name__ == "__main__":
    t = TestRead()

Thanks,

Jonathan
 
Peter Otten

Mixing subprocess and explicit select() looks a bit odd to me. Perhaps you
should do it completely without subprocess. Did you consider pexpect?

The "universal newlines" translation happens at the Python level, whereas the
subprocess writes to the pty through the OS. Your pty gets "\r\n", leaves
"\r" as is and replaces "\n" with "\r\n" (the terminal's default output
processing). You end up with "\r\r\n", which "universal newlines" mode
interprets as a Mac newline followed by a DOS newline.
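
To make the double translation concrete, here is a small sketch that imitates both steps in pure Python, without touching a real terminal. The helper names onlcr and universal_newlines are made up for this illustration, and io.TextIOWrapper with newline=None stands in for the 'rU' mode used in the test script.

```python
import io


def onlcr(data):
    """Imitate the terminal's ONLCR output flag: every LF becomes CR LF."""
    return data.replace(b"\n", b"\r\n")


def universal_newlines(data):
    """Decode bytes with universal newlines: CR LF, CR and LF all become LF."""
    stream = io.TextIOWrapper(io.BytesIO(data), encoding="ascii", newline=None)
    return stream.read()


raw = b"Fake\r\nData\r\n"

via_pipe = universal_newlines(raw)         # a pipe passes the bytes through unchanged
via_pty = universal_newlines(onlcr(raw))   # the pty rewrites LF first

print(repr(via_pipe))  # 'Fake\nData\n'
print(repr(via_pty))   # 'Fake\n\nData\n\n'
```

The second result matches the doubled newlines from the pty test: "\r\r\n" is read as a lone "\r" followed by "\r\n".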

I see two approaches to fix the problem:

(1) Add an intermediate step to change newlines explicitly:

p = subprocess.Popen(
    ["/bin/cat", "output.txt"],
    stdout=subprocess.PIPE
)
q = subprocess.Popen(
    # ["dos2unix"],
    ["python", "-c",
     "import sys, os; "
     "sys.stdout.writelines(os.fdopen(sys.stdin.fileno(), 'rU'))"],
    stdin=p.stdout,
    stdout=outSlave)

(2) Fiddle with terminal options, e. g.

import termios

attrs = termios.tcgetattr(outSlave)
# attrs[1] is the output-flags field: clear ONLCR (NL -> CR NL), set ONLRET
attrs[1] = (attrs[1] & ~termios.ONLCR) | termios.ONLRET
termios.tcsetattr(outSlave, termios.TCSANOW, attrs)

p = subprocess.Popen(
    ("/bin/cat", "output.txt"),
    stdout=outSlave,
)


Disclaimer: I found this by trial and error, so it may not be the "proper"
way.
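
For anyone who wants to check the effect of the terminal flag in isolation, the following is a self-contained sketch written for Python 3 on Linux. It is not the code from the thread. The function name capture_via_pty is hypothetical, the child is a Python one-liner rather than /bin/cat on output.txt, and it only clears ONLCR (it does not set ONLRET as the snippet above does). It also assumes the pty starts with the usual default output flags.

```python
import os
import pty
import select
import subprocess
import sys
import termios


def capture_via_pty(payload, disable_onlcr):
    """Run a child that writes payload to a pty; return the raw bytes read from the master."""
    master, slave = pty.openpty()
    try:
        if disable_onlcr:
            attrs = termios.tcgetattr(slave)
            attrs[1] &= ~termios.ONLCR  # index 1 is the output-flags field
            termios.tcsetattr(slave, termios.TCSANOW, attrs)

        # The child writes the payload bytes verbatim to its stdout (the pty slave).
        child_code = "import sys; sys.stdout.buffer.write(%r)" % (payload,)
        subprocess.Popen([sys.executable, "-c", child_code], stdout=slave).wait()

        # Drain the master until it has been quiet for half a second.
        chunks = []
        while select.select([master], [], [], 0.5)[0]:
            chunks.append(os.read(master, 1024))
        return b"".join(chunks)
    finally:
        os.close(slave)
        os.close(master)


if __name__ == "__main__":
    data = b"Fake\r\nData\r\n"
    print(repr(capture_via_pty(data, disable_onlcr=False)))
    print(repr(capture_via_pty(data, disable_onlcr=True)))
```

With the flag left alone I would expect the first call to show the extra "\r" before each "\n", and with ONLCR cleared the bytes should arrive unchanged, so a later universal-newlines read yields single newlines. The slave descriptor is kept open in the parent until the read finishes so that the master does not report an I/O error while data is still pending.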
 
jfharden

Thank you! That is absolutely perfect. I had read about pty options but didn't think to read about tty options. If we ever meet in real life remind me to buy you a beverage of your choosing.
 
