How to read large amounts of output via popen

loial

I need to read a large amount of data that is being returned in
standard output by a shell script I am calling.

(I think the script should really be writing to a file, but I have no
control over that.)

Currently I have the following code. It seems to work; however, I
suspect this may not work with large amounts of standard output.

What is the best way to read a large amount of data from standard
output and write to a file?

Here is my code.

import subprocess

process = subprocess.Popen(['myscript', 'param1'],
                           shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

cmdoutput = process.communicate()

myfile = open('/home/john/myoutputfile', 'w')
myfile.write(cmdoutput[0])
myfile.close()
 
Gabriel Genellina

If all you do with the process' output is to write it to the output file,
you can avoid the intermediate step:


myfile = open('/home/john/myoutputfile', 'w')
myerror = open('/home/john/myerrorfile', 'w')
# The child writes directly into the files, so the data never passes through Python.
process = subprocess.Popen(['myscript', 'param1'],
                           shell=False, stdout=myfile, stderr=myerror)
process.wait()
myfile.close()
myerror.close()

(untested)
 
Nobody

> I need to read a large amount of data that is being returned in
> standard output by a shell script I am calling.
>
> (I think the script should really be writing to a file, but I have no
> control over that.)

If the script is writing to stdout, you get to decide whether its stdout
is a pipe, file, tty, etc.
> Currently I have the following code. It seems to work; however, I
> suspect this may not work with large amounts of standard output.
>
> process = subprocess.Popen(['myscript', 'param1'],
>                            shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
>
> cmdoutput = process.communicate()

It's certainly not the best way to read large amounts of output.
Unfortunately, better solutions get complicated when you need to read more
than one of stdout and stderr, or if you also need to write to stdin.

If you only need stdout, you can just read from process.stdout in a loop.
You can leave stderr going to wherever the script's stderr goes (e.g. the
terminal), or redirect it to a file.
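
For example, a minimal, untested sketch of that loop (the output path is the
one from the original post; the chunk size is an arbitrary placeholder):

import subprocess

process = subprocess.Popen(['myscript', 'param1'], stdout=subprocess.PIPE)

# Copy the child's stdout to a file one chunk at a time, so the whole
# output never has to fit in memory.  The file is opened in binary mode
# because the pipe yields raw bytes.  stderr is left alone, so it still
# goes wherever this script's stderr goes.
with open('/home/john/myoutputfile', 'wb') as myfile:
    while True:
        chunk = process.stdout.read(64 * 1024)
        if not chunk:
            break
        myfile.write(chunk)

process.wait()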

If you really do need both stdout and stderr, then you either need to
enable non-blocking I/O, or use a separate thread for each stream, or
redirect at least one of them to a file.
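
A rough, untested sketch of the separate-thread variant (the drain() helper
and the file names are purely illustrative, not anything from the standard
library):

import subprocess
import threading

def drain(pipe, path):
    # Copy everything from the pipe into the named file, then close the pipe.
    with open(path, 'wb') as f:
        for chunk in iter(lambda: pipe.read(64 * 1024), b''):
            f.write(chunk)
    pipe.close()

process = subprocess.Popen(['myscript', 'param1'],
                           stdout=subprocess.PIPE, stderr=subprocess.PIPE)

# Drain stderr on a helper thread so that neither pipe's buffer can fill up
# and block the child while the main thread is busy reading stdout.
worker = threading.Thread(target=drain,
                          args=(process.stderr, '/home/john/myerrorfile'))
worker.start()
drain(process.stdout, '/home/john/myoutputfile')
worker.join()
process.wait()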

FWIW, Popen.communicate() uses non-blocking I/O on Unix and separate
threads on Windows (the standard library doesn't include a mechanism to
enable non-blocking I/O on Windows).
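
For completeness, here is a rough, Unix-only sketch of the multiplexing
approach with select(); it only illustrates the idea and is not the standard
library's actual implementation:

import os
import select
import subprocess

process = subprocess.Popen(['myscript', 'param1'],
                           stdout=subprocess.PIPE, stderr=subprocess.PIPE)

# Map each pipe's file descriptor to the file that should receive its data.
targets = {process.stdout.fileno(): open('/home/john/myoutputfile', 'wb'),
           process.stderr.fileno(): open('/home/john/myerrorfile', 'wb')}

while targets:
    ready, _, _ = select.select(list(targets), [], [])
    for fd in ready:
        data = os.read(fd, 64 * 1024)
        if data:
            targets[fd].write(data)
        else:
            # EOF on this pipe: close its output file and stop watching it.
            targets[fd].close()
            del targets[fd]

process.wait()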
> What is the best way to read a large amount of data from standard
> output and write to a file?

For this case, the best way is to just redirect stdout to a file, rather
than funnelling it through your Python program, i.e.:

outfile = open('outputfile', 'w')
retcode = subprocess.call(..., stdout=outfile)   # call() returns the child's exit status
outfile.close()
 
loial

OK, that's great. Thanks for the very elegant solution(s).



