Filtering through an external process

Paul Rubin

Anyone know if there's code around to filter text through an external
process? Sort of like the Emacs "filter-region" command. For
example, say I have a program that reads input in English and outputs
it in Pig Latin. I want my Python script to call the program, pipe
some input into it and read the output:

english = "hello world"
pig_latin = ext_filter("pig_latin", english)

should set pig_latin to "ellohay orldway".

Note that you can't just call popen2, jam the english through it and
then read the pig latin, because the subprocess can block if you give
it too much input before reading the output, and in general there's no
way to know how much buffering the subprocess is willing to do. So a
proper solution has to use asynchronous i/o and keep polling the
output side, or else separate threads for reading and writing.

This is something that really belongs in the standard library. I've
needed it several times and rather than going to the trouble of coding
and debugging it, I've always ended up using a temp file instead,
which is a kludge.
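
A minimal sketch of the "separate threads for reading and writing" approach described above. It assumes the subprocess module from later Python versions rather than popen2, and since no real pig_latin program is given, the UPPERCASE_FILTER command is a hypothetical stand-in that upper-cases its input.

```python
import subprocess
import sys
import threading

# Hypothetical stand-in for an external filter program: upper-cases stdin.
UPPERCASE_FILTER = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]


def ext_filter(command, text):
    """Pipe text through command and return what it writes to stdout."""
    proc = subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )

    def feed():
        # Write on a separate thread so the main thread stays free to
        # drain stdout; closing stdin signals end-of-input to the child.
        try:
            proc.stdin.write(text)
            proc.stdin.close()
        except OSError:
            pass  # the child exited before reading all of its input

    writer = threading.Thread(target=feed)
    writer.start()
    output = proc.stdout.read()  # returns once the child closes its stdout
    writer.join()
    proc.stdout.close()
    proc.wait()
    return output
```

Because the write happens on its own thread, neither side can stall the other no matter how much buffering the child does. With a real Pig Latin program on the path, the call would look like ext_filter(["pig_latin"], english).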
 
Raymond Hettinger

The time machine lives!

=========================
Add this file: Lib/encodings/pig.py
----------------------------------------
"Pig Latin Codec -- Lib/encodings/pig.py"

import codecs, re

def encode(input, errors='strict'):
    output = re.sub(r'\b(th|ch|st|\w)(\w+)\b', r'\2\1ay', input)
    return (output, len(input))

def decode(input, errors='strict'):
    output = re.sub(r'(\b\w+?)(th|ch|st|\w)ay\b', r'\2\1', input)
    return (output, len(input))

def getregentry():
    return (encode, decode, codecs.StreamReader, codecs.StreamWriter)
-------------------------------------------


Now, fire up Python:
'hello world'
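
For anyone trying this on a current Python 3, here is a sketch of the same joke codec registered at runtime with codecs.register instead of a file dropped into Lib/encodings. The names pig_encode, pig_decode and find_pig are made up for illustration, and since this is a str-to-str transform it is assumed to be called through codecs.encode and codecs.decode rather than the str methods.

```python
import codecs
import re


def pig_encode(text, errors="strict"):
    # Move the leading consonant (or th/ch/st) to the end and append "ay".
    output = re.sub(r"\b(th|ch|st|\w)(\w+)\b", r"\2\1ay", text)
    return output, len(text)


def pig_decode(text, errors="strict"):
    # Undo the transform: strip "ay" and move the trailing sound back.
    output = re.sub(r"(\b\w+?)(th|ch|st|\w)ay\b", r"\2\1", text)
    return output, len(text)


def find_pig(name):
    # Codec search function: answer only for the name "pig".
    if name == "pig":
        return codecs.CodecInfo(pig_encode, pig_decode, name="pig")
    return None


codecs.register(find_pig)

print(codecs.encode("hello world", "pig"))      # ellohay orldway
print(codecs.decode("ellohay orldway", "pig"))  # hello world
```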



Raymond Hettinger
 
Paul Rubin

Raymond Hettinger said:
The time machine lives!

=========================
Add this file: Lib/encodings/pig.py

Chuckle :). But I had in mind a more general purpose means of running
external processes.
 
Scott David Daniels

Paul said:
Anyone know if there's code around to filter text through an external
process? Sort of like the Emacs "filter-region" command. For

Check out popen2 -- it's the piece you need.

dest, result = os.popen2('cmd')
dest.write('echo Hello world\n')
dest.write('exit\n')
dest.close()
result.read()


So, perhaps you mean:
import os

def filtered(command, source):
    dest, result = os.popen2(command)
    dest.write(source)
    dest.close()
    try:
        return result.read()
    finally:
        result.close()


-Scott David Daniels
 
Paul Rubin

Scott David Daniels said:
Check out popen2 -- it's the piece you need.

No, that doesn't do the job. If you popen2 a process and send too
much input without reading the output, the subprocess will block and
your application will hang. That is explained in the docs. Doing it
right is a little bit complicated. You need threads or asynchronous
i/o. That's the functionality that's missing.
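
For later readers: the subprocess module, added to the standard library in Python 2.4, covers this case. Its communicate() method (which subprocess.run uses when given input=) feeds stdin and reads stdout concurrently, so the caller does not have to manage threads or polling. A minimal sketch follows; the UPPERCASE_FILTER command is again a hypothetical stand-in for a real filter program.

```python
import subprocess
import sys

# Hypothetical stand-in for an external filter program: upper-cases stdin.
UPPERCASE_FILTER = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]


def ext_filter(command, text):
    """Run command with text on its stdin and return its stdout."""
    result = subprocess.run(
        command,
        input=text,              # sent to the child's stdin
        stdout=subprocess.PIPE,  # captured and returned below
        text=True,               # str in, str out
        check=True,              # raise CalledProcessError on a non-zero exit
    )
    return result.stdout
```

One caveat: the whole output is buffered in memory, so this suits text of moderate size rather than unbounded streams.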
 
