bash-style pipes in python?

  • Thread starter Dan Stromberg - Datallegro

Dan Stromberg - Datallegro

I'm constantly flipping back and forth between bash and python.

Sometimes, I'll start a program in one, and end up recoding in the
other, or including a bunch of python inside my bash scripts, or snippets
of bash in my python.

But what if python had more of the power of bash-style pipes? I might not
need to flip back and forth so much. I could code almost entirely in python.

The kind of thing I do over and over in bash looks like:

#!/usr/bin/env bash

# exit on errors, like python. Exit on undefined variables, like python.
set -eu

# make a pipeline's exit status be that of the last command that failed,
# not the exit status of the last command in the pipeline - like
# nothing I've seen
set -o pipefail

# save output in "output", but only echo it to the screen if the command fails
if ! output=$(foo | bar 2>&1)
then
    echo "$0: foo | bar failed" 1>&2
    echo "$output" 1>&2
    exit 1
fi

Sometimes I use $PIPESTATUS too, but not that much.

I'm aware that python has a variety of pipe handling support in its
standard library.

But is there a similarly-simple way already, in python, of hooking the stdout of
process foo to the stdin of process bar, saving the stdout and errors from both
in a variable, and still having convenient access to process exit values?

Would it be possible to overload | (pipe) in python to have the same behavior as in
bash?

I could deal with slightly more cumbersome syntax, like:

(stdout, stderrs, exit_status) = proc('foo') | proc('bar')

...if the basic semantics were there.

How about it? Has someone already done this?
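For reference, one way to get those semantics from the standard library alone, sketched with `subprocess.Popen` (here `pipeline`, and the `echo`/`tr` commands in the example, are just stand-ins; like the bash snippet, only the second command's stderr is merged into the captured output):

```python
import subprocess

def pipeline(cmd1, cmd2):
    """Run cmd1 | cmd2, returning (combined output, per-stage exit codes).

    The tuple of exit codes plays the role of bash's $PIPESTATUS;
    a pipefail-style check is then just: all(code == 0 for code in codes).
    """
    p1 = subprocess.Popen(cmd1, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(cmd2, stdin=p1.stdout,
                          stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT)
    p1.stdout.close()   # drop the parent's copy so p1 sees SIGPIPE if p2 exits early
    output, _ = p2.communicate()
    p1.wait()
    return output, (p1.returncode, p2.returncode)

out, codes = pipeline(['echo', 'hello world'], ['tr', 'a-z', 'A-Z'])
```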
 

faulkner


import subprocess

class P(subprocess.Popen):
    def __or__(self, otherp):
        otherp.stdin.write(self.stdout.read())
        otherp.stdin.close()
        return otherp
    def __init__(self, cmd, *a, **kw):
        # default stdin/stdout/stderr to subprocess.PIPE (-1)
        for s in ['out', 'in', 'err']:
            kw.setdefault('std' + s, subprocess.PIPE)
        subprocess.Popen.__init__(self, cmd.split(), *a, **kw)

print((P('cat /etc/fstab') | P('grep x')).stdout.read())

of course, you don't need to overload __init__ at all, and you can
return otherp.stdout.read() instead of otherp, and you can make
__gt__, __lt__ read and write files. unfortunately, you can't really
fudge &>, >>, |&, or any of the more useful pipes, but you can make
more extensive use of __or__:

class Pipe:
    def __or__(self, other):
        if isinstance(other, Pipe): return ...
        elif isinstance(other, P): return ...
    def __init__(self, pipe_type): ...

k = Pipe(foo)
m = Pipe(bar)

P() |k| P()
P() |m| P()
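A variant of the class above (same name, slightly different syntax: the right-hand side is a command string, so it can be started lazily) avoids buffering in Python at all, by handing the left process's stdout straight to the right one as a real kernel pipe:

```python
import subprocess

class P(subprocess.Popen):
    """Like the buffering version, but __or__ wires an OS-level pipe,
    so data streams between the processes a block at a time."""
    def __init__(self, cmd, stdin=None):
        subprocess.Popen.__init__(self, cmd.split(), stdin=stdin,
                                  stdout=subprocess.PIPE)
    def __or__(self, other_cmd):
        nxt = P(other_cmd, stdin=self.stdout)
        self.stdout.close()   # close the parent's copy; the child keeps its own
        return nxt

result = (P('echo hello') | 'tr a-z A-Z').stdout.read()
```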
 

Dan Stromberg - Datallegro


This is quite cool.

I have to ask though - isn't this going to read everything from foo until
foo closes its stdout, and then write that result to bar - rather than
doing it a block at a time like a true pipe?

I continued thinking about this last night after I sent my post, and
started wondering if it might be possible to come up with something where
you could freely intermix bash pipes and python generators, by faking the
pipes using generators and some form of concurrent function composition -
perhaps threads or subprocesses for the concurrency. Of course, that
imposes some extra I/O, CPU and context switch overhead, since you'd have
python shuttling data from foo to bar all the time instead of foo sending
data directly to bar, but the ability to mix python and pipes might be
worth it.

Even better might be to have a choice between intermixable and fast.

Comments anyone?
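A minimal sketch of that mixing - a process stage wrapped as a generator, composed with an ordinary Python generator stage (the names `run` and `shout`, and the use of `sort`, are just illustrations; this simple version buffers each process stage, so true block-at-a-time concurrency would still need the threads or subprocesses mentioned above):

```python
import subprocess

def run(cmd, lines):
    """Feed an iterable of text lines through an external command,
    yielding its output lines. Python shuttles the data, so this
    carries the extra copying overhead described above."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, text=True)
    out, _ = proc.communicate(''.join(lines))
    yield from out.splitlines(keepends=True)

def shout(lines):
    # an ordinary Python generator stage in the same pipeline
    for line in lines:
        yield line.upper()

stages = list(shout(run(['sort'], ['b\n', 'a\n'])))
```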
 
