$var = do { ... }?


Tim McDaniel

I just had a case where there's a block of code, I split up the work
into assignments to intermediate variables, but I only wanted the
final value. I very much like to restrict the scope of variables,
because it's then obvious what's a temporary and what has more
significance -- it's like an intermediate stage between inline code
and a sub. How I coded it was

my $permanent_variable;
{
    my $this = ...;
    my $that = ... $this ...;
    yadda yadda;
    $permanent_variable = ... final computation ...;
}

It occurred to me that I could code it as

my $permanent_variable = do {
    my $this = ...;
    my $that = ... $this ...;
    yadda yadda;
    ... final computation ...;
};

It does do the scope encapsulation that I like, and it makes it
vividly obvious that the block has one purpose, to set
$permanent_variable.
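
A minimal runnable sketch of the pattern described above. The price list, the 20% tax rate, and all variable names are assumptions made up for illustration; they are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical input data, only for illustration.
my @prices = (19.99, 5.00, 12.50);

my $total_with_tax = do {
    my $subtotal = 0;
    $subtotal += $_ for @prices;
    my $tax = $subtotal * 0.20;          # assumed 20% rate
    sprintf '%.2f', $subtotal + $tax;    # last expression is the block's value
};

print "$total_with_tax\n";
# $subtotal and $tax are out of scope here; only the result survives.
```

The temporaries exist only inside the block, so a reader can tell at a glance that nothing but $total_with_tax matters afterwards.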

I looked thru the codebase at work and found a few instances of it.
But mostly "do" was used to implement a slurp function, or otherwise
to "local()" a variable for one statement or call.

For people who have looked at lots of different codebases: do people
think that

$var = do { several statements; }

is just an odd construct?
 

Randal L. Schwartz

Tim> I looked thru the codebase at work and found a few instances of it.
Tim> But mostly "do" was used to implement a slurp function, or otherwise
Tim> to "local()" a variable for one statement or call.

I do that all the time, although it often suggests that what I really
have is a subroutine, and I should pull it out and make it one.

Tim> For people who have looked at lots of different codebases: do people
Tim> think that

Tim> $var = do { several statement; }

Bear in mind that do { } is an expression, so you'll typically need a
semicolon after that before the next statement (common mistake).
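
The semicolon point can be checked without breaking a script by compiling two snippets through string eval. This is only a sketch; the snippet contents and variable names are made up for the demonstration.

```perl
use strict;
use warnings;

# Two nearly identical snippets; the second omits the semicolon after do { }.
my $with_semicolon    = 'my $x = do { 1 + 1 };  my $y = $x * 2; $y';
my $without_semicolon = 'my $x = do { 1 + 1 }   my $y = $x * 2; $y';

# String eval compiles at run time, so a syntax error lands in $@
# instead of aborting the whole program.
my $ok = eval $with_semicolon;
print "with semicolon: $ok\n";

my $bad = eval $without_semicolon;
print "without semicolon: ", (defined $bad ? $bad : "failed to compile"), "\n";
```

Because do { } is a term rather than a statement block, the parser expects an operator or a terminator after the closing brace, which is why the second snippet does not compile.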

print "Just another Perl hacker,"; # the original
 

Tim McDaniel

I agree with Randal on both counts: I use do like this all the time,
and you should seriously consider making the block a sub instead.

Yeah, I tend to do too much inline code and end up with (for example)
300-line blocks of code. You know, though, that it can be annoying to
subify something when it has lots of external dependencies, whether
inputs or outputs.

If you mean the

my $txt = do {
    open my $F, "<", ...;
    local $/;
    <$F>;
};

construction then this is exactly the same situation, isn't it?

If the *only* use of it is as
... = do {local $/; <HANDLE>};
or the vast majority of use, then it can just be viewed as an idiom
for one special task. E.g., about the only time I use hash slices is
my %table;
@table{@array} = (1) x @array;
(though I'm reconsidering using instead
my %table = map { $_ => 1 } @array;
).
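
For reference, a self-contained sketch of both idioms mentioned above. It reads from an in-memory filehandle instead of a real file, which is an assumption made so the example runs anywhere; the sample data and variable names are likewise invented.

```perl
use strict;
use warnings;

# In-memory "file" standing in for a real one (assumption for the example).
my $data = "first line\nsecond line\n";

my $txt = do {
    open my $fh, '<', \$data or die "cannot open: $!";
    local $/;    # undefined record separator: read everything in one go
    <$fh>;       # value of the block; $/ is restored when the block exits
};

# Two ways to build a lookup table from a list.
my @array = qw(apple pear plum);

my %by_slice;
@by_slice{@array} = (1) x @array;       # hash slice

my %by_map = map { $_ => 1 } @array;    # map

print length($txt), " characters read\n";
print join(',', sort keys %by_slice), "\n";
print join(',', sort keys %by_map), "\n";
```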
 

Rainer Weikusat

I just had a case where there's a block of code, I split up the work
into assignments to intermediate variables, but I only wanted the
final value. I very much like to restrict the scope of variables,
because it's then obvious what's a temporary and what has more
significance
[...]

my $permanent_variable = do {
my $this = ...;
my $that = ... $this ...;
yadda yadda;
... final computation ...;
};

It does do the scope encapsulation that I like, and it makes it
vividly obvious that the block has one purpose, to set
$permanent_variable.
[...]

$var = do { several statement; }

is just an odd construct?

I decidedly do consider this an odd construct: If you have a
self-contained piece of code whose purpose is to perform some
operation independently of the surrounding code and to return some
value, and which possibly has a small number of well-defined inputs,
this should be put into a subroutine with a sufficiently descriptive
name that someone who reads through the outer code knows what the
purpose of the subroutine happens to be but without having to bother
with the details of its implementation and that someone who cares
about the implementation of this particular subroutine doesn't have to
go hunting for it in a large block of otherwise unrelated code.

A real-world example of that:

sub validate_customer_key($)
{
    my ($bin_ckey, $ckey, $skey, $skey_version);

    eval {
        ($bin_ckey, $skey_version) = decode_bin_ckey($_[0]);

        $skey = get_server_key($skey_version);
        die("no version $skey_version server key") unless $skey;

        dec($bin_ckey, $skey->key());
        $ckey = MECSUPD::CustomerKey->new_from_string($bin_ckey);
        die("not a valid customer key") unless $ckey;

        check_cookie($ckey, $skey);
        check_expiry($ckey);
        check_ckey_cid_version($ckey);
    };

    $@ && do {
        syslog('ERR', "$@");
        return;
    };

    syslog('INFO', 'customer %08x authenticated successfully',
           $ckey->cid());
    return 1;
}

['dec' means 'decrypt' here]

This is (not counting external libraries) 'a meeting point' of 80
lines of Perl code and 20 lines of C code and the algorithm this
subroutine is supposed to perform would be totally obliterated if the
100 lines of code implementing all the details had been used in place
of this 22-line (sloccount) high-level description.
 

Tim McDaniel

[an eval here]
$@ && do {
syslog('ERR', "$@");
return;
};

Is there a reason to prefer that over
if ($@) {
syslog('ERR', "$@");
return;
}
? I don't see a reason, so I prefer the "if" version.
 

Randal L. Schwartz

Tim> Yeah, I tend to do too much inline code and end up with (for example)
Tim> 300-line blocks of code. You know, though, that it can be annoying to
Tim> subify something when it has lots of external dependencies, whether
Tim> inputs or outputs.

Actually, that'll make your code cleaner. If you have a lot of
undelimited cohesion (I think that's the word), bugs are much harder to
find, and the code is more fragile.
 

Rainer Weikusat

Ben Morrow said:
Quoth (e-mail address removed):
[an eval here]
$@ && do {
syslog('ERR', "$@");
return;
};

Is there a reason to prefer that over
if ($@) {
syslog('ERR', "$@");
return;
}
? I don't see a reason, so I prefer the "if" version.

You should always prefer Try::Tiny over an explicit eval: there are,
unfortunately, rather a lot of nasty corner cases in perl's handling of
$@, and Try::Tiny deals with as many as it can.

It is a lot better to state what these corner-cases are than to make
nebulous scare-mongering statements about them. I'm aware of one: Code
which is executed as part of a destructor and which doesn't localize
$@ properly may cause it to be cleared or set to a different value. In
particular, this will happen when the destructor calls syslog (that's
where I encountered it). This is, however, not applicable here and in
any case, the solution is to fix the destructors. I'll happily learn
about other such cases. But according to the Try::Tiny documentation,
there are none: The only other thing it mentions is that the 'eval'
might accidentally clear $@ if it is running inside another eval and
$@ wasn't properly localized ... but didn't we have this already?

If you insist on doing the eval yourself, you should test the truth of
the eval

eval {
...
} or do {
...
};

rather than relying on $@,

No, I shouldn't "always do this" because I usually know what the code
will be doing when executed, ie, in this case, that there are neither
outer nor inner evals. I understand that this is more difficult for
someone whose preferred solution to any technical problem is "download
150,000 lines of unknown code from the internet" (in order to save
writing 5 lines of code). Regarding this, I'm a follower of the theory
that blind use of complex devices with unknown properties (third-party
written maxi-mega-modules intended to solve 15,000 different trivial
problems with a huge amount of "heavily optimized general-purpose
code") is bound to cause accidental deaths and other nuisances and
therefore, I avoid such situations (fixing the inevitable bugs in a
large body of unknown code is going to take more time than writing a
small amount of new code).
 

Rainer Weikusat

[an eval here]
$@ && do {
syslog('ERR', "$@");
return;
};

Is there a reason to prefer that over
if ($@) {
syslog('ERR', "$@");
return;
}
? I don't see a reason, so I prefer the "if" version.

I'm not aware of any, that's IMHO just a matter of personal
preference.
 

Rainer Weikusat

Rainer Weikusat said:
Ben Morrow said:
Quoth (e-mail address removed):
[an eval here]
$@ && do {
syslog('ERR', "$@");
return;
};

Is there a reason to prefer that over
if ($@) {
syslog('ERR', "$@");
return;
}
? I don't see a reason, so I prefer the "if" version.

You should always prefer Try::Tiny over an explicit eval: there are,
unfortunately, rather a lot of nasty corner cases in perl's handling of
$@, and Try::Tiny deals with as many as it can.

It is a lot better to state what these corner-cases are than to make
nebulous scare-mongering statements about them.

To state this in a clearer way: The two cases this module is supposed
to deal with are

-------------
package a;

sub DESTROY
{
    eval { 3 + 1; };
}

package main;

eval {
    my $a;
    bless(\$a, 'a');

    die("gruesome error!");
};

$@ and print("$@\n");
-------------

Upon exiting the eval scope, the a::DESTROY routine will be executed
automatically and the eval in there causes $@ to be cleared. The
solution to this problem is to add a

local $@ if $@

to the beginning of any destructor which does something non-trivial
which might either invoke die or eval. Since a destructor can be
executed automatically after some other code died but before the
unknowing caller had a chance to look at $@, it certainly shouldn't
change an existing value of $@.
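
A hedged sketch of a destructor that protects the caller's pending exception. The Guard class is hypothetical, and the sketch uses an unconditional local $@ rather than the conditional form shown above. Whether the unprotected version actually loses the error depends on the perl version: according to the perl 5.14 release notes, $@ is set only after unwinding from that release on, so the clobbering is mostly a concern on older perls.

```perl
use strict;
use warnings;

package Guard;    # hypothetical class, for illustration only

sub new { return bless {}, shift }

sub DESTROY {
    local $@;                        # keep the caller's exception intact
    eval { my $n = 3 + 1; 1 };       # a successful eval resets $@
}

package main;

eval {
    my $guard = Guard->new;          # destroyed while the eval unwinds
    die "gruesome error!\n";
};

print "caught: $@" if $@;
```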

The other is

--------------
sub complex_task
{
    eval { 3 + 1; };
}

eval {
    die("gruesome error!");
};

complex_task();

$@ and print("$@\n");
---------------

This time, the caller is at fault: If some non-trivial action needs to
be performed before looking at $@, the value of $@ immediately after
the eval needs to be saved in a 'non-global' variable until it is going
to be used. This is especially true because there is, as the
'BACKGROUND' text of the Try::Tiny documentation aptly explains, no
way the called routine can easily do this.
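
The caller-side fix described here can be sketched as follows; $err is a name chosen for the example, and complex_task is the same stand-in routine as above.

```perl
use strict;
use warnings;

sub complex_task {
    # A successful eval resets $@ to the empty string.
    return eval { 3 + 1 };
}

eval { die "gruesome error!\n" };
my $err = $@;    # copy immediately, before calling anything else

complex_task();

print "global \$\@ is now empty\n" if $@ eq '';
print "saved error: $err"          if $err;
```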

IMO, both of these 'nasty corner-cases' are actually fairly trivial
programming errors and the general solution to these is to teach
people how to avoid them, not to try to write code which enables them
to remain blissfully unaware of the problem.
 

Martijn Lievaart

No, I shouldn't "always do this" because I usually know what the code
will be doing when executed, ie, in this case, that there are neither

If there is a clear way that is always correct and another equally clear
way that may break on refactoring, I always would choose the first. This
seems like one of those cases (which I didn't know about yet btw).

M4
 

Rainer Weikusat

Martijn Lievaart said:
If there is a clear way that is always correct and another equally clear
way that may break on refactoring, I always would choose the first. This
seems like one of those cases (which I didn't know about yet btw).

Testing $@ is 'always correct' while the return value of eval might be
'false' for any number of reasons because it is just the return value
of the last thing executed in the scope of the eval.

----------
my $bc;

my $rc = eval {
    $bc = 3;
    die('huh?') if $bc != 3;
} or do {
    print("Ben Morrow error occurred!\n");
    $@ or print("Phew ... that was close ... \n");
};
 

Rainer Weikusat

Rainer Weikusat said:
Testing $@ is 'always correct' while the return value of eval might be
'false' for any number of reasons because it is just the return value
of the last thing executed in the scope of the eval.

Additional remark: One of the purposes of using exceptions to signal
errors is to avoid the so-called semipredicate problem where a
technically legitimate return value must be used to signal an
exceptional condition and the caller cannot generally tell the
difference. Eg, assume the eval returns a file descriptor number. The
usual convention for signalling errors via return value for
subroutines doing this would be to use the number -1 which cannot be a
valid file descriptor. But this won't work with the code above because
-1 is logically true. OTOH, 0 is a valid file descriptor number
(usually used for 'the standard input file descriptor') but logically
false.
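
The file descriptor case can be made concrete with a small sketch. get_fd is a hypothetical helper that always returns 0, standing in for code that legitimately yields descriptor 0.

```perl
use strict;
use warnings;

# Hypothetical helper: 0 is a perfectly valid result, not an error.
sub get_fd { return 0 }

my @log;

# Testing the truth of the eval: the handler runs although nothing died.
my $fd = eval { get_fd() } or do {
    push @log, $@ ? "died: $@" : "no exception, yet the handler ran";
};

# Testing $@ instead tells the two situations apart.
my $fd2 = eval { get_fd() };
push @log, $@ ? "died: $@" : "fd $fd2 obtained without error";

print "$_\n" for @log;
```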
 

Martijn Lievaart

Testing $@ is 'always correct' while the return value of eval might be
'false' for any number of reasons because it is just the return value of
the last thing executed in the scope of the eval.

I think I get your point. You *have* to make sure all exception handlers
are correct (not a biggy) because otherwise you risk screwing up your
evals. Right?

M4
 

Rainer Weikusat

Ben Morrow said:
Which is why you should use Try::Tiny, which fixes this for you.

You mean it screws up the return value of an eval?

Localising $@ in DESTROY would not be sufficient, even if you could rely
on all destructors in all code you use doing so.

It is never possible to rely on the fact that some code isn't
buggy. The solution is to fix the bugs AND NOT to download more
(presumably buggy) code which tries to work around them.

See the BACKGROUND section of Try::Tiny's documentation.

Hic Rhodus, hic salta: What are you precisely referring to? I read
this background section and wrote about the two things I found in
there which were not just hand-waving. So, what did I miss and/or get
wrong? According to your opinion, not according to a random text some
random guy uploaded to a random web server in 2009(!), presumably more
than fifteen years after people started to use exception handling in
Perl, despite your claim that this is impossible without the 2009 guy.
 

Dr.Ruud

If you insist on doing the eval yourself, you should test the truth of
the eval

eval {
...
} or do {
...
};

rather than relying on $@, since there are cases (destructors, for one,
depending on your version of perl) where $@ can be cleared even though
the eval failed. You still lose the error, but at least you know there
was one.

Indeed, testing $@ is a bad practice.

Your code snippet is missing some important features, see:

my $ok;
eval {
    $ok = foo();
    1; # success
}
or do { # exception
    my $eval_error = $@ || 'zombie error';
    ...;
};
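
A runnable variant of the snippet above, as a sketch only. The body of foo is invented for the demonstration (it dies on odd input and returns 0 on even input), and the @report array is likewise an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in for foo(): dies on odd input, returns 0 otherwise.
sub foo {
    my ($n) = @_;
    die "odd input: $n\n" if $n % 2;
    return 0;
}

my @report;
for my $n (2, 3) {
    my ($ok, $error);
    eval {
        $ok = foo($n);
        1;    # success marker, so a false $ok is not mistaken for a failure
    }
    or do {
        $error = $@ || 'zombie error';
        chomp $error;
    };
    push @report, defined $error
        ? "foo($n) failed: $error"
        : "foo($n) returned $ok";
}

print "$_\n" for @report;
```

The trailing 1 is what keeps a legitimate false return value from foo out of the error path; the 'zombie error' fallback covers the case where the eval failed but $@ ended up empty.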
 

Rainer Weikusat

Dr.Ruud said:
Indeed, testing $@ is a bad practice.

Your code snippet is missing some important features, see:

my $ok;
eval {
$ok = foo();
1; # success
}
or do { # exception
my $eval_error = $@ || 'zombie error';
...;
};

But this does nothing but 'test $@' in order to determine if an error
occurred. There's just an additional bit of useless code wrapped
around it which is based on the (unjustified) assumption that the
value returned by the last thing evaluated inside the eval will never
be 0, '' or undef. Essentially, this amounts to abandoning the concept of
using 'exceptions' for out-of-band error signalling and going back to
implicit 'special return value' conventions instead while retaining all
the run-time overhead of the exception mechanism. This can be achieved
in a much clearer way by simply not using exceptions and maybe arguing
in public that anything but using special return-value conventions for
error signalling would be 'bad practice' (because of ...)

OTOH, "that's your opinion" and some people don't agree with that,
cf

http://perl.apache.org/docs/general...eference.html#Exception_Handling_for_mod_perl

Now, where's the 'foaming mouth guy' crying "This is deprecated !
Deprecated !! Deprecated !!! Die, sinner, die !!!!"?

Can't you just accept that your vision for Perl is by no means
universal and that people use Perl in many different ways, no matter
if J. Random Multicolor Loser who wasn't asked for his insignificant
opinion to begin with approves of that?

Go Python, "there's only my way or the highway" guy. That's where you
belong to.
 

Rainer Weikusat

Rainer Weikusat said:
[...]

Localising $@ in DESTROY would not be sufficient, even if you could rely
[...]
See the BACKGROUND section of Try::Tiny's documentation.

Hic Rhodus, hic salta: What are you precisely referring to? I read
this background section and wrote about the two things I found in
there which were not just hand-waving. So, what did I miss and/or get
wrong?

Since (somewhat expectedly) nothing came of that and I've re-read the
Try::Tiny BACKGROUND section meanwhile with more attention to detail,
I now confidently state that the 'Localizing ... would not be
sufficient' statement is wrong. Addressing each of the 'rationales' in
turn:

,----
| Clobbering $@
|
| When you run an eval block and it succeeds, $@ will be
| cleared, potentially clobbering an error that is currently
| being caught.
|
| [...]
|
| $@ must be properly localized before invoking eval in order to avoid
| this issue.
`----

As I already wrote: This is the documented behaviour of eval and it is
the responsibility of the code which is interested in the content of
$@ to store that in some safe place before invoking other code which
might cause the value of $@ to be changed. That's a side effect of the
design decision to use a global variable to store the most recently
thrown exception. Taking this into account, this design decision can
certainly be called 'somewhat unfortunate'; however, this is arguing
about spilt milk: perl behaves in the way it does and application
code written in Perl needs to take this into account. It is in no way
sensible to burden code which is regularly executed with working
around possible errors in the calling code, as suggested by the '$@
must be properly localized' statement.

,----
| Localizing $@ silently masks errors
|
| Inside an eval block die behaves sort of like:
|
| sub die {
| $@ = $_[0];
| return_undef_from_eval();
| }
|
| This means that if you were polite and localized $@ you can't die in
| that scope, or your error will be discarded (printing "Something's
| wrong" instead).
`----

.... not only is the idea to always localize $@ 'just in case' before
executing eval not sensible, it additionally breaks exception
propagation out of the current lexical scope. But since it is only
necessary to work around this problematic side effect when proactively
trying to work around the non-problem described in the previous
paragraph, this is not generally an issue.

,----
| $@ might not be a true value
|
| This code is wrong:
|
| if ( $@ ) {
| ...
| }
|
| because due to the previous caveats it may have been unset.
`----

But there were no such 'previous caveats', just a remark about the
documented behaviour of eval and how that may interact badly with some
calling code written based on the wrong assumption that $@ would not
be a global variable. Actually, $@ can't be 'unset' except as side
effect of code which runs between the time of the original die and the
time the caller looks at $@. Minus the already mentioned 'caller bug'
of not saving $@, this leaves a single possible problem situation,
namely,

,----
| The classic failure mode is:
|
| sub Object::DESTROY {
| eval { ... }
| }
|
| eval {
| my $obj = Object->new;
|
| die "foo";
| };
|
| if ( $@ ) {
|
| }
`----

This is an actual problem because destructors can be executed after a
die and before the caller of the eval ever gets a chance to look at
$@. As I already wrote, because of this, a destructor which doesn't
localize $@ if it already has a value before executing code which
might either eval or die is broken. It is possible that the value in
$@ isn't interesting to the (indirect) caller anymore and that the
destructor just sees it because $@ is a global variable, but there's
no way to distinguish between these two cases.

Lastly,

,----
| The workaround for this is even uglier than the previous ones. Even
| though we can't save the value of $@ from code that doesn't localize,
| we can at least be sure the eval was aborted due to an error:
|
|
| my $failed = not eval {
| ...
|
| return 1;
| };
`----

there's little point in disabling useful features (out-of-band error
signalling, eval return values) in order to work around hypothetical
bugs in destructors. Instead, the buggy destructors need to be fixed.
 

Tim McDaniel

It occurred to me that I could code it as

my $permanent_variable = do {
my $this = ...;
my $that = ... $this ...;
yadda yadda;
... final computation ...;
};

So I have to make sure the code evaluates the desired return value as
the last thing in the block, like

my $result = do {
    if ($i % 2 == 0) { 'even' }
    elsif ($i % 3 == 0) { 'divisible by 3' }
    elsif ($i % 5 == 0) { 'divisible by 5' }
    else { 'just wrong' }
};

Is there a clever way in Perl 5 to metaphorically return early with a
value?

"return", in Perl 5.8 and 5.14, returns from a sub, not a block.

"last" is documented in Perl 5.8.8 as

"last" cannot be used to exit a block which returns a value such
as "eval {}", "sub {}" or "do {}", and should not be used to exit
a grep() or map() operation.

Note that a block by itself is semantically identical to a loop
that executes once. Thus "last" can be used to effect an early
exit out of such a block.

So I can use "last" to end early, if I wrap it in an extra {...}.
But "last" doesn't return a value: if I just do "last", the do{...}
experimentally evaluates to undef.

So far as I can see, I can exit a do{...} early AND return a value
only by coding it myself using an extra {...}, like

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

#! /usr/bin/perl
use strict;
use warnings;

for my $i (4, 9, 25, -1) {
    my $result = do {
        my $return;
        {
            if ($i % 2 == 0) { $return = 'even'; last };
            if ($i % 3 == 0) { $return = 'divisible by 3' }
            elsif ($i % 5 == 0) { $return = 'divisible by 5' }
            else { $return = 'just wrong' }
        }
        $return;
    };
    print $i, (defined $result ? " defined $result" : ' undef '), "\n";
}
exit 0;

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

(Of course that's contrived; the original if-elsif-else structure is
a much more concise way of expressing it.)

Is there a clever way in Perl 5 to metaphorically return early with a
value?
 

Dr.Ruud

So I have to make sure the code evaluates the desired return value as
the last thing in the block, like

my $result = do {
if ($i % 2 == 0) { 'even' }
elsif ($i % 3 == 0) { 'divisible by 3' }
elsif ($i % 5 == 0) { 'divisible by 5' }
else { 'just wrong' }
};

Is there a clever way in Perl 5 to metaphorically return early with a
value?

That if/elsif/else of yours already does that. Remember that
perl-the-binary compiles to opcodes. So the 'elsif' is only done if the
'if' didn't do.

perl -Mstrict -wle '
my $i = $ARGV[0];
my $result = !$i ? "0"
: !($i % 2) ? "2-fold"
: !($i % 3) ? "3-fold"
: !($i % 5) ? "5-fold"
: "bah";
print $result;
' -- -21
3-fold
 

Rainer Weikusat

Ben Morrow said:
Quoth (e-mail address removed):
[...]
my $result = do {
if ($i % 2 == 0) { 'even' }
elsif ($i % 3 == 0) { 'divisible by 3' }
elsif ($i % 5 == 0) { 'divisible by 5' }
else { 'just wrong' }
};

Is there a clever way in Perl 5 to metaphorically return early with a
value?
[...]

The other thing that works, and it is in fact documented though I had no
idea until I just looked, is to return from an eval {}:

my $result = eval {
    $_ % 2 == 0 and return "even";
    $_ % 3 == 0 and return "divisible by three";
    return "just wrong";
};

I'm not sure it's got much to recommend it over

my $result = sub {
    $_ % 2 == 0 and return "even";
    return "odd";
}->();

though,

The first is using a language construct according to its intended
purpose. The second is abusing a language construct in order to
emulate the first 'somehow'. That alone should be sufficient to avoid
it. In addition to that, it needs more text because the mock
subroutine created for this purpose also needs to be invoked and -
depending on whether the compiler special-cases this so that people
can indulge their passion for the bizarre[*] - it is probably also
less efficient.

[*] 'Xah Lee' should like it, however, since it is quite
'mathy' ...
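
For comparison, both early-return variants from the quoted text can be run side by side. This is a sketch: the wrapper subs classify_with_eval and classify_with_sub are names invented here, and the test values are arbitrary.

```perl
use strict;
use warnings;

# Early return via eval BLOCK: "return" leaves the eval, not the outer sub.
sub classify_with_eval {
    my ($n) = @_;
    my $result = eval {
        $n % 2 == 0 and return 'even';
        $n % 3 == 0 and return 'divisible by 3';
        return 'just wrong';
    };
    return $result;
}

# Early return via an anonymous sub that is called immediately.
sub classify_with_sub {
    my ($n) = @_;
    return sub {
        $n % 2 == 0 and return 'even';
        $n % 3 == 0 and return 'divisible by 3';
        return 'just wrong';
    }->();
}

for my $n (4, 9, 25) {
    printf "%2d: %s / %s\n", $n, classify_with_eval($n), classify_with_sub($n);
}
```

One difference worth keeping in mind: the eval variant also traps any exception thrown inside the block and resets $@, while the anonymous sub lets exceptions propagate.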
 
