Stumped: returning a read pipe from a function

kj

I'd like to write a function along these lines:

sub foo {
    my @args = @_;
    open my $in, '-|', '/some/command', @args or die $!;
    return $in;
}

(The OS is Unix.)

The code above works fine when the arguments in @args are all OK.
Otherwise '/some/command' spits an error message to stderr and
dies, a failure that the code above fails to trap, so the routine
exits normally.

Putting an eval around the open statement fails to catch the error,
and certainly it does not prevent the error message from /some/command
from appearing in stderr.

The root of all these problems I'm sure has to do with the implicit
fork triggered by the '-|' argument to open.

My question is: how can I trap the error and the error message?

TIA!

kj
 
Gunnar Hjalmarsson

kj said:
I'd like to write a function along these lines:

sub foo {
    my @args = @_;
    open my $in, '-|', '/some/command', @args or die $!;
    return $in;
}

(The OS is Unix.)

The code above works fine when the arguments in @args are all OK.
Otherwise '/some/command' spits an error message to stderr and
dies, a failure that the code above fails to trap, so the routine
exits normally.

My question is: how can I trap the error and the error message?

One idea:

sub foo {
    my @args = @_;
    open my $in, '-|', '/some/command', @args;
    return $in if fileno $in;
    close $in or die $!;
}
 
kj

I'd like to write a function along these lines:
sub foo {
    my @args = @_;
    open my $in, '-|', '/some/command', @args or die $!;
    return $in;
}
(The OS is Unix.)
The code above works fine when the arguments in @args are all OK.
Otherwise '/some/command' spits an error message to stderr and
dies, a failure that the code above fails to trap, so the routine
exits normally.
Putting an eval around the open statement fails to catch the error,
and certainly it does not prevent the error message from /some/command
from appearing in stderr.
The root of all these problems I'm sure has to do with the implicit
fork triggered by the '-|' argument to open.
My question is: how can I trap the error and the error message?


Well, I'm getting close (I think). This is the best I've managed,
though it has at least a few problems:

use Socket;
use IO::Handle;

sub foo {
    socketpair( my $child, my $parent, AF_UNIX, SOCK_STREAM, PF_UNSPEC )
        or die $!;

    $_->autoflush( 1 ) for $child, $parent;

    unless ( my $pid = fork ) {
        die $! unless defined $pid;
        close $child or die $!;

        open STDOUT, '>&', $parent or die $!;

        my $status = system "/some/command @_ 2>&1";

        exit( $status ? 1 : 0 );
    }
    close $parent or die;
    wait;

    die join '', <$child> if $?;

    return $child;
}

This seems to work, but the system() call requires the shell, due
to the "2>&1" redirection. Also, it's a lot of machinery to do
something that is rather simple, conceptually at least. Plus it
uses an explicit fork, which makes debugging with perldb a bit of
a pain.

Any other comments or suggestions about this code would be much
appreciated. TIA!

kj
 
kj

Gunnar Hjalmarsson said:
One idea:
sub foo {
    my @args = @_;
    open my $in, '-|', '/some/command', @args;
    return $in if fileno $in;
    close $in or die $!;
}

Thanks, but, as it happens, fileno( $in ) is positive even when
'/some/command' fails.

I had a bit better luck testing for eof( $in ), but this is not a
general solution, because there may be cases in which eof( $in )
evaluates to true even when the command was executed successfully.
Worse yet, even when the error is correctly detected, I still don't
have the error message...

kj
 
xhoster

kj said:
I'd like to write a function along these lines:

sub foo {
    my @args = @_;
    open my $in, '-|', '/some/command', @args or die $!;
    return $in;
}

(The OS is Unix.)

The code above works fine when the arguments in @args are all OK.
Otherwise '/some/command' spits an error message to stderr and
dies, a failure that the code above fails to trap, so the routine
exits normally.

That is because the open doesn't fail. /some/command is successfully
started, and that is all open cares about. At some later point
/some/command realizes it was given bogus arguments and it exits, but
that is irrelevant to the open--open only cares about starting the
program. Close worries about how the program ended.
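That split can be sketched with a stand-in command: /bin/false (just a
placeholder for a command that fails after starting) starts fine, so the
open succeeds, and only close, via $?, shows that it ended badly.

```perl
use strict;
use warnings;

# The open succeeds: it only means the child process was started.
open my $in, '-|', '/bin/false' or die "open failed: $!";

my @lines = <$in>;      # /bin/false prints nothing and exits non-zero

# close is where the child's fate becomes visible, via $?.
close $in;
my $exit = $? >> 8;     # high byte of $? holds the exit code
print "open succeeded; command exited with status $exit\n";
```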
Putting an eval around the open statement fails to catch the error,

Of course. The error might not have even happened yet.
and certainly it does not prevent the error message from /some/command
from appearing in stderr.

You could redirect stderr to /dev/null if you wanted to do that.
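For instance (a sketch using ls as a stand-in for /some/command; the
single-string form of the pipe open is used so a shell performs the
redirection, which the list form would not):

```perl
use strict;
use warnings;

# ls complains to stderr about the missing file; the shell redirection
# sends that complaint to /dev/null instead of our stderr.
open my $in, '-|', 'ls /no/such/file 2>/dev/null' or die $!;
my @lines = <$in>;
close $in;              # $? still carries ls's non-zero exit status
```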
The root of all these problems I'm sure has to do with the implicit
fork triggered by the '-|' argument to open.

Not really. The problem is that while the specific error you are
interested in in this case just happens to occur before the first line is
printed, that is not the case generally.

My question is: how can I trap the error and the error message?

The easiest way would be to have the command save its output into
a temp file, check the program's exit status ($? after close), and if it
is OK, then open that temp file for reading, unlink it, and return that
file handle back from the sub.
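A sketch of that temp-file approach (foo_via_tempfile and the use of the
shell for the redirection are my own choices for illustration, not Xho's
exact code; on Unix the returned handle stays readable after File::Temp
unlinks the file, which gives the unlink-after-open behavior described):

```perl
use strict;
use warnings;
use File::Temp ();

# Hypothetical helper: run a command, capture stdout in a temp file,
# and return a read handle only if the command exited successfully.
sub foo_via_tempfile {
    my ($cmd, @args) = @_;
    my $tmp = File::Temp->new;          # unlinked when $tmp goes away

    system("$cmd @args > $tmp") == 0
        or die "$cmd failed with status ", $? >> 8, "\n";

    open my $fh, '<', "$tmp" or die $!;
    return $fh;     # on Unix the open handle survives the unlink
}

my $fh = foo_via_tempfile('echo', 'hello');
print scalar <$fh>;
```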

Xho
 
Gunnar Hjalmarsson

kj said:
Thanks, but, as it happens, fileno( $in ) is positive even when
'/some/command' fails.

Ok, here is another try, where $in is a package global, and where you
explicitly close the handle in an END block.

our $in;

sub foo {
    my @args = @_;
    open my $in, '-|', '/some/command', @args or die $!;
}

END { close $in or die $? }
 
Gunnar Hjalmarsson

Gunnar said:
our $in;

sub foo {
    my @args = @_;
    open my $in, '-|', '/some/command', @args or die $!;
}

END { close $in or die $? }

Should of course be

open $in, ...
 
xhoster

kj said:
Well, I'm getting close (I think). This is the best I've managed,
though it has at least a few problems:

use Socket;
use IO::Handle;

sub foo {
    socketpair( my $child, my $parent, AF_UNIX, SOCK_STREAM, PF_UNSPEC )
        or die $!;

    $_->autoflush( 1 ) for $child, $parent;

    unless ( my $pid = fork ) {
        die $! unless defined $pid;
        close $child or die $!;

        open STDOUT, '>&', $parent or die $!;

        my $status = system "/some/command @_ 2>&1";

        exit( $status ? 1 : 0 );
    }
    close $parent or die;
    wait;

This is liable to deadlock. The parent won't start reading until the
child has exited. The child won't exit until the spawned process
has exited. The spawned process won't exit until it is done printing.
And, if what it is printing is more than one buffer full, then it can't
finish printing until the parent starts reading. Round and round you go.

This seems to work, but the system() call requires the shell, due
to the "2>&1" redirection. Also, it's a lot of machinery to do
something that is rather simple, conceptually at least.

I think you are over-estimating the conceptual simplicity of what you
are trying to do.


Xho
 
comp.llang.perl.moderated

I'd like to write a function along these lines:

sub foo {
    my @args = @_;
    open my $in, '-|', '/some/command', @args or die $!;
    return $in;
}

(The OS is Unix.)

The code above works fine when the arguments in @args are all OK.
Otherwise '/some/command' spits an error message to stderr and
dies, a failure that the code above fails to trap, so the routine
exits normally.

Putting an eval around the open statement fails to catch the error,
and certainly it does not prevent the error message from /some/command
from appearing in stderr.

The root of all these problems I'm sure has to do with the implicit
fork triggered by the '-|' argument to open.

My question is: how can I trap the error and the error message?


The suggested IPC::Open3 sounds like the best
option. But, if you just need to catch a crib
death, i.e., /some/command checks its args first
and dies immediately if they are invalid:


open my $in, '-|', '/some/command 2>&1', @args
    or die $!;

my $line = <$in>;
if ( eof($in) ) {
    die $line, " close error:", $! || $?
        unless close($in);
}

return $in, $line;




You would need to modify the subroutine foo
to return the line plucked from <$in> too.
 
kj

Quoting an earlier suggestion:
open my $in, '-|', '/some/command 2>&1', @args
    or die $!;
my $line = <$in>;
if ( eof($in) ) {
    die $line, " close error:", $! || $?
        unless close($in);
}
return $in, $line;

Well, of all the bazillion variants I've coded and tested in the
last couple of days, the one based on this idea is the one that
works the best. Thank you very much.

Still, there's something a bit odd about the resulting routine:

use IPC::Open3 'open3';
use IO::Unread 'unread';

sub foo {
    my $pid =
        open3( my ( $wtr, $rdr, $err ), '/some/command', @_ );

    close $wtr;

    if ( defined( my $first_line = <$rdr> ) ) {
        die $first_line if eof( $rdr );
        unread $rdr, $first_line;
    }
    return $rdr;
}

The oddity is that $err is not used! Inexplicably, if I use open3,
errors appear via $rdr, not via $err, as one would expect. (For
the record, when I run '/some/command' from the command line the
behavior is exactly what one would reasonably expect: errors get
sent to stderr, and normal output gets sent to stdout.)

Furthermore, given the behavior I just described for open3, one
would expect that open2 would work just as well, but this is not
the case. When I use open2, the error goes to STDERR, which now
is not being caught!

This is crazy.

(If anyone has a clue as to what could be going on here, I'd love
to read about it. FWIW, the OS here is Linux.)


Lastly, thanks for all your comments and suggestions. They were
very helpful.

kj
 
xhoster

kj said:
use IPC::Open3 'open3';
use IO::Unread 'unread';

Boy are you ever piling on the complications here.
sub foo {
    my $pid =
        open3( my ( $wtr, $rdr, $err ), '/some/command', @_ );

    close $wtr;

    if ( defined( my $first_line = <$rdr> ) ) {
        die $first_line if eof( $rdr );
        unread $rdr, $first_line;
    }
    return $rdr;
}

The oddity is that $err is not used! Inexplicably, if I use open3,
errors appear via $rdr, not via $err, as one would expect.

From the docs:

If ERRFH is false, or the same file
descriptor as RDRFH, then STDOUT and STDERR of the child
are on the same filehandle.

In your case the newly my'ed $err, which is the argument in the ERRFH slot,
is false. Sad, but true.


Xho
 
kj

Boy are you ever piling on the complications here.

The complication is not entirely obvious to me, but at any rate,
I'm open to suggestions... Your idea of saving the entire
output of the command to a file is unpalatable because this output
*can* be huge and the function downstream may need only the first
few hundred lines of it. It just so happens that, by far, the most
common outcomes for '/some/command' are either success with many
lines of output or immediate failure with a one-line error message.
Hence the idea of checking for eof after the first line happens to
work in this special case. But the downstream function requires a
filehandle (as opposed to a pair of handle and string), hence the
business with unread.
From the docs:

<the rest snipped to ease my pain>

To say that there's egg on my face would be an understatement...
I had in fact read that in the docs not too long ago, and (since
it did not make much sense at the time) I immediately forgot it.
Thanks for the reminder.

kj
 
xhoster

kj said:
The complication is not entirely obvious to me,

As long as only one part of the code is being clever, you are OK.
But when many parts of the code are being clever, they start tripping
over each other. IO::Unread quietly ties and unties file handles (at
least in some situations), so if whatever you return the handle to
also wants, either now or in the future, to tie it or get the fd and
pass that out to some other program, it will fail. That is the
complexity. I usually consider things like IO::Unread and Tie::File to
be emergency life-support methods. If I need them, it probably means
the code isn't well-designed for the task it is currently being used
for, and I use those modules to keep the old code limping along while
I'm busy redesigning and rewriting.

but at any rate
well, I'm open to suggestions...

Based on your last code, if I understand what you are doing and if you
fix the $err thing and make the changes that go with that, you don't need
to read the line from $rdr at all.

Just call eof($rdr) right off the bat. If it returns true, that means
the program exited (or at least closed its own stdout) without producing
any output on stdout. So then you read $err (or *ERR), plus you can wait
on $pid and see what $? is. Based on that, you decide whether it was an
error-free empty output or an error-caused empty output, and either die or
return the (already at eof, but who cares) $rdr. If eof($rdr) returns
false, it means that at least something was printed to stdout, so in your
model that means there is no error, so return the file handle. This
could deadlock if the error message is more than one buffer in size, but
since, as you say, it can only ever be exactly zero or one line, that
seems unlikely.

Another issue is that, if this subroutine is invoked many times during
the lifetime of the program, you need a way to wait on the child or
you will accumulate zombies.

Your idea of saving the entire
output of the command to a file is unpalatable because this output
*can* be huge and the function downstream may need only the first
few hundred lines of this huge output. It just so happens that,
by far, the most common outcomes for '/some/command' are
either success with many lines of output or immediate failure
with a one-line error message.

Even if the alternatives besides those two are very rare, do they still
need to be dealt with? I've often painted myself into a corner by using
a method that made it easy to deal with 99.99% but then made it nearly
impossible to deal with the other 0.01%. So now I'm very leery about
things that look like they are starting down that road.
Hence, the idea of checking for
eof after the first line happens to work in this special case.

In your most recent code as it is (i.e. not corrected for the $err thing),
you will report an error if the program has no error and produces
exactly one line of output. Since you previously said that it is legal
to produce exactly zero lines of output under non-error conditions, I
suspect it is also legal to produce exactly one. So you may be falsely
reporting errors.
But the downstream function requires a filehandle (as opposed to
a pair of handle and string), hence the business with unread.

As I said, I am somewhat skeptical if this approach is to be used as a base
for future growth and evolution. However, if you (correctly) suspect that
your code will not keep evolving in new directions once you have this
particular issue worked out, then I guess this is of lesser concern.
<the rest snipped to ease my pain>

To say that there's egg on my face would be an understatement...
I had in fact read that in the docs not too long ago, and (since
it did not make much sense at the time) I immediately forgot it.
Thanks for the reminder.

The IPC::Open3 behavior is so incredibly counter-intuitive that that
part of the docs just doesn't sink in the first few times you get bit by
it. It doesn't help that the docs themselves provide absurd examples:

my ($wtr, $rdr, $err);
$pid = open3($wtr, $rdr, $err,
             'some cmd and args', 'optarg', ...);

Why the heck go to the effort of declaring $err and passing it in,
if the net effect is identical to:

my ($wtr, $rdr);
$pid = open3($wtr, $rdr, undef,
             'some cmd and args', 'optarg', ...);

That is highly misleading.

Xho
 
kj

xhoster said:
Just call eof($rdr) right off the bat.

That works great, reducing the function to:

use IPC::Open3 'open3';
use Symbol 'gensym';

sub foo {
    my $err = gensym;
    my $pid = open3( my ( $wtr, $rdr ), $err,
                     '/some/command', @_ );
    close $wtr;
    die join '', <$err> if eof $rdr;
    return $rdr;
}

Thanks.

I'm still mulling over the zombies issue...

Maybe...

use IPC::Open3 'open3';
use Symbol 'gensym';
use IO::Handle;

sub foo {

    my $rdr = Ugly::Kluge->new;

    my $err = gensym;
    my $pid = open3( my $wtr, $rdr, $err,
                     '/some/command', @_ );

    ${ *$rdr{ SCALAR } } = $pid;    # Yeeeew!

    close $wtr;
    die join '', <$err> if eof $rdr;
    return $rdr;
}

{
    package Ugly::Kluge;
    use POSIX 'SIGTERM';
    our @ISA = 'IO::Handle';

    sub DESTROY {
        my $self = shift;
        if ( defined( my $pid = ${ *$self{ SCALAR } } ) ) {
            kill SIGTERM, $pid;
            waitpid( $pid, 0 );
        }
        $self->SUPER::DESTROY;
    }
}



OK, time to open the windows here, and air out the place...

kj
 
J. Gleixner

kj said:
That works great, reducing the function to:

use IPC::Open3 'open3';
use Symbol 'gensym';

sub foo {
    my $err = gensym;
    my $pid = open3( my ( $wtr, $rdr ), $err,
                     '/some/command', @_ );
    close $wtr;
    die join '', <$err> if eof $rdr;

Why call join?

die <$err> if eof $rdr;
 
xhoster

kj said:
That works great, reducing the function to:

use IPC::Open3 'open3';
use Symbol 'gensym';

sub foo {
    my $err = gensym;
    my $pid = open3( my ( $wtr, $rdr ), $err,
                     '/some/command', @_ );
    close $wtr;
    die join '', <$err> if eof $rdr;
    return $rdr;
}


I think you previously said that /some/command can legitimately
produce no output under some non-error conditions. If so, that means you
have to do something more than just check eof $rdr before dying; you also
need to check $err or $?. Assuming $? doesn't tell you anything you need
to know and $err tells it all, then something like:

if (eof $rdr) {
    return $rdr if eof $err;   # no error message means no error
    waitpid $pid, 0;           # avoid zombies if die isn't really fatal
    die <$err>;
}
return $rdr;
Thanks.

I'm still mulling over the zombies issue...

Since you die on errors, rather than logging and going on (well,
you could be invoking this from an eval {} and doing logging from
outside, so maybe the die doesn't really mean die...) I was kind of
hoping this was a single-use program and you could just forget about
zombies.

Maybe you could double-fork so the OS cleans up zombies automatically,
although it isn't obvious to me how to do that cleanly and simply with
Open3. Or you could set $SIG{CHLD} to "IGNORE", but that might screw up
other parts of your code if those parts also use fork, system, etc. and are
expecting $SIG{CHLD} to stay at its default.
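The double-fork variant might look something like this sketch
(spawn_detached is a made-up name, and this shows only the bare
mechanism, not how to graft it onto Open3): the intermediate child
exits at once and is reaped immediately, while the grandchild is
reparented to init, so no zombie can linger.

```perl
use strict;
use warnings;

# Hypothetical helper: start a command leaving no zombie behind.
sub spawn_detached {
    my @cmd = @_;
    my $pid = fork;
    die "fork: $!" unless defined $pid;
    unless ($pid) {                  # intermediate child
        my $gpid = fork;
        exit 1 unless defined $gpid;
        exit 0 if $gpid;             # intermediate child exits right away
        exec @cmd or exit 127;       # grandchild runs the command
    }
    waitpid $pid, 0;                 # reap the intermediate child now
    return $? >> 8;                  # 0 if the intermediate forked OK
}

my $status = spawn_detached('sleep', '0');
print "intermediate child exited with status $status\n";
```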
${ *$rdr{ SCALAR } } = $pid; # Yeeeew!

Hmm. That makes me wonder, when you do an ordinary pipe open
(not IPC::Open? open), the corresponding close automatically waits on the
child. How does it know what pid to wait on? The pid must be stored
somewhere in the resulting file handle, but where? I tried finding it with
Devel::Peek, but couldn't. Considering my lack of experience with
Devel::Peek, I guess that that isn't surprising.

Xho
 
anno4000

[...]
Even if the alternatives besides those two are very rare, do they still
need to be dealt with? I've often painted myself into a corner by using
a method that made it easy to deal with 99.99% but then made it nearly
impossible to deal with the other 0.01%. So now I'm very leery about
things that look like they are starting down that road.

That is an astute observation. Similar to error handling, ignoring the
0.01% often leads to a simple, clean solution -- one that becomes an
utter mess of spaghetti once you start dealing with the remaining cases.
There ought to be a design pattern for that situation. (There probably
is; I wouldn't know.)

Anno
 
anno4000

[...]
Hmm. That makes me wonder, when you do an ordinary pipe open
(not IPC::Open? open), the corresponding close automatically waits on the
child. How does it know what pid to wait on?

It has been my understanding that close() waits for one child to finish,
never mind the PID. That's why specific "waitpid $my_known_pid" doesn't
mix well with system() and friends -- Perl's child handling may have
snatched the PID you're waiting for.

Then again, perldoc -f close is specific:

Closing a pipe also waits for the process executing
on the pipe to complete...

which would be... well, a simplification if my understanding were true.
The pid must be stored
somewhere in the resulting file handle, but where? I tried finding it with
Devel::Peek, but couldn't. Considering my lack of experience with
Devel::Peek, I guess that that isn't surprising.

Well, there's glob magic involved. If it's hidden there, it wouldn't be
evident from a Devel::Peek::Dump of the file handle. Another (less likely)
possibility would be to associate the PID via the handle's refaddr
(inside-out style). That would leave no traces at all in the file handle.

Anno
 
xhoster

[...]
Hmm. That makes me wonder, when you do an ordinary pipe open
(not IPC::Open? open), the corresponding close automatically waits on
the child. How does it know what pid to wait on?

It has been my understanding that close() waits for one child to finish,
never mind the PID.

A Linux strace of a simple pipe open and close on my system shows
that it waits for the specific pid, using "wait4". Of course this could
be a system-dependent thing.
That's why specific "waitpid $my_known_pid" doesn't
mix well with system() and friends -- Perl's child handling may have
snatched the PID you're waiting for.

I've never noticed that problem (again, on linux). In my experience it is
the *unspecific* waitpid (i.e. waitpid -1,... Or just regular wait) done
in a $SIG{CHLD} handler that doesn't play nicely with system and qx.
Well, there's glob magic involved. If it's hidden there, it wouldn't be
evident from a Devel::peek::Dump of the file handle. Another (less
likely) possibility would be to associate the PID via the handle's
refaddr (inside- out style). That would leave no traces at all in the
file handle.

I just looked at the source for 5.8.7. It seems to store the pid in
some secret array, with the fd as the index.

Xho
 
anno4000

[...]
Hmm. That makes me wonder, when you do an ordinary pipe open
(not IPC::Open? open), the corresponding close automatically waits on
the child. How does it know what pid to wait on?

It has been my understanding that close() waits for one child to finish,
never mind the PID.
[...]

I just looked at the source for 5.8.7. It seems to store the pid in
some secret array, with the fd as the index.

Ah, that makes sense. Thanks for the correction.

Anno
 
