Perl 'gotchas'

  • Thread starter it_says_BALLS_on_your_forehead

it_says_BALLS_on_your_forehead

has anyone been burned by using:

for ( @list ) {
...blah blah
}


rather than

for my $elem ( @list ) {
...blah blah
}


....? it's probably better to use the latter loop, since it's more
self-documenting, and there's an explicit 'lexicalization' of the list
element to be used in the loop, but many times I like to take advantage
of the shortcuts that come with the $_ variable. regex matching, s///,
print, (others?).

i believe for the vast majority of use cases, it doesn't matter which
loop is employed ( other than the slightly decreased readability of the
first style ), but i was wondering if people could relate experiences
where use of the first style of for loops has bitten them. i have the
same question about:

while ( <$fh> ) {
..blah
}

and

while ( my $line = <$fh> ) {
..blah
}
 
Anno Siegel

it_says_BALLS_on_your_forehead said:
has anyone been burned by using:

for ( @list ) {
...blah blah
}


rather than

for my $elem ( @list ) {
...blah blah
}


...? it's probably better to use the latter loop, since it's more
self-documenting, and there's an explicit 'lexicalization' of the list
element to be used in the loop, but many times I like to take advantage
of the shortcuts that come with the $_ variable. regex matching, s///,
print, (others?).

A lot of Perl functions take $_ as the default argument. Damian advises
against using that default in _PBP_, mostly because a considerable minority
of functions *don't* take $_ and it is hard to tell which is which.

At least in informal programming, I still use it with defined(),
lc() and family, chr(), ord(), split() without arguments and a few
others.
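
As a small illustration of the defaults mentioned above, the following sketch calls lc, split and ord with no argument so that each falls back to $_. The sample strings and the @report array are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @report;

# $_ is aliased to each string in turn; the strings are only read here.
for ('Hello World', 'Perl  Gotchas  Thread') {
    my $lower  = lc;       # same as lc($_)
    my @fields = split;    # same as split ' ', $_
    my $first  = ord;      # same as ord($_): code of the first character
    push @report,
        "$lower: " . scalar(@fields) . " fields, first char code $first";
}

print "$_\n" for @report;
```

Functions that do not default to $_ need the argument spelled out, which is the inconsistency the advice in _PBP_ is aimed at.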

it_says_BALLS_on_your_forehead said:
i believe for the vast majority of use cases, it doesn't matter which
loop is employed ( other than the slightly decreased readability of the
first style ), but i was wondering if people could relate experiences
where use of the first style of for loops has bitten them. i have the
same question about:

while ( <$fh> ) {
..blah
}

and

while ( my $line = <$fh> ) {
..blah
}

Bitten how? I can't remember having trouble with either of these
constructs. It is mostly the complexity of the loop body (sheer
size and number of *other* variables used) that decides whether
I name the loop variable or use $_.

Anno
 
Paul Lalli

it_says_BALLS_on_your_forehead said:
has anyone been burned by using:

for ( @list ) {
...blah blah
}


rather than

for my $elem ( @list ) {
...blah blah
}


...? it's probably better to use the latter loop, since it's more
self-documenting, and there's an explicit 'lexicalization' of the list
element to be used in the loop, but many times I like to take advantage
of the shortcuts that come with the $_ variable. regex matching, s///,
print, (others?).

i believe for the vast majority of use cases, it doesn't matter which
loop is employed ( other than the slightly decreased readability of the
first style ), but i was wondering if people could relate experiences
where use of the first style of for loops has bitten them.

There is one semi-common "gotcha" in using the implicit $_ to iterate
over an array. If you call any function in that loop which happens to
change $_, you're changing the array. Observe:
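#!/usr/bin/perl
use strict;
use warnings;

my @foo = (1..10);

# $_ is aliased to each element of @foo in turn
for (@foo) {
    do_stuff() if $_ == 3;
}

print "Foo now: @foo\n";

sub do_stuff {
    # while (<DATA>) assigns each line to the global $_ without
    # localizing it, so it overwrites the element the caller aliased
    while (<DATA>) {
        chomp;
        last if $_ < 0;
    }
}

__DATA__
3
4
9
-10
5
1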

Output:
Foo now: 1 2 -10 4 5 6 7 8 9 10

To be safe, you have to explicitly localize $_ in any function that
changes $_ that is called from a loop that implicitly uses $_.
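
A minimal sketch of that fix, assuming a plain (untied) array: the same program with a `local $_;` at the top of do_stuff. An in-memory filehandle ($fh, reading from the made-up string $input) stands in for DATA so the snippet is self-contained. Elsewhere in this thread it is pointed out that local($_) has its own pitfalls when $_ is aliased to an element of a tied aggregate, so treat this as an illustration rather than a universal cure.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @foo = (1..10);

# In-memory input standing in for the DATA handle (assumption for this sketch).
my $input = "3\n4\n9\n-10\n5\n1\n";
open my $fh, '<', \$input or die "Cannot open in-memory handle: $!";

for (@foo) {
    do_stuff() if $_ == 3;
}

print "Foo now: @foo\n";

sub do_stuff {
    local $_;    # this sub gets its own $_ until it returns
    while (<$fh>) {
        chomp;
        last if $_ < 0;
    }
}
```

With the localization in place, the array comes out unchanged: Foo now: 1 2 3 4 5 6 7 8 9 10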

Paul Lalli
 
Anno Siegel

Paul Lalli said:
There is one semi-common "gotcha" in using the implicit $_ to iterate
over an array. If you call any function in that loop which happens to
change $_, you're changing the array. Observe:

But that isn't specific to using $_. Any loop variable will have that
property.

Anno
 
Paul Lalli

Anno said:
But that isn't specific to using $_. Any loop variable will have that
property.

Yes, but $_ is unique in that if it is changed in any *function that
your loop calls*, then the array is modified. If you modify $_ (or
whatever the loop variable is) within your loop itself, then you
probably meant to change the array. The gotcha comes when your loop
calls a separate function which just so happens to set $_ to some other
value. This does not happen with an explicit loop variable:
#!/usr/bin/perl
use strict;
use warnings;

my @foo = (1..10);
my $elem;

for $elem (@foo) {
    do_stuff() if $elem == 3;
}

print "Foo: @foo\n";

sub do_stuff {
    while ($elem = <DATA>) {
        chomp $elem;
        last if $elem < 0;
    }
}

__DATA__
4
54
-999
32

Output:
Foo: 1 2 3 4 5 6 7 8 9 10
 
it_says_BALLS_on_your_forehead

Paul said:
Yes, but $_ is unique in that if it is changed in any *function that
your loop calls*, then the array is modified. If you modify $_ (or
whatever the loop variable is) within your loop itself, then you
probably meant to change the array. The gotcha comes when your loop
calls a separate function which just so happens to set $_ to some other
value. This does not happen with an explicit loop variable:

that's worth a sticky :).
 
Brian McCauley

it_says_BALLS_on_your_forehead said:
has anyone been burned by using:

for ( @list ) {
...blah blah
}


rather than

for my $elem ( @list ) {
...blah blah
}

That's not a problem for me.

it_says_BALLS_on_your_forehead said:
i have the same question about:

while ( <$fh> ) {
..blah
}

and

while ( my $line = <$fh> ) {
..blah
}

Here the problem is much more serious because while() doesn't implicitly
localize the changes to $_.

Worse still, the obvious "fix" for this, local($_), can have even more
scarily subtle action-at-a-distance effects (when $_ is aliased to an
element of a tied aggregate; for details see my contribution to
numerous previous threads discussing the same issue).
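
To make the difference concrete, here is a small sketch (not from the original post) that sets $_ at top level, runs a foreach over it and then a while (<$fh>) loop. The in-memory handle and the variable names are assumptions made for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# In-memory input so the example needs no external file.
my $input = "one\ntwo\n";
open my $fh, '<', \$input or die "Cannot open in-memory handle: $!";

$_ = 'precious';

for (1 .. 3) { }                 # foreach localizes $_ for the loop
my $after_for = $_;              # the old value is restored afterwards

while (<$fh>) { chomp }          # while assigns to the global $_ directly
my $after_while = defined $_ ? $_ : 'undef';

print "after for:   $after_for\n";
print "after while: $after_while\n";
```

The foreach leaves 'precious' intact, while the while loop leaves $_ undefined, because the final read at end of file assigns undef to it.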
 
Dr.Ruud

Bifco wrote:
next if(/pattern/);


Even cleaner:

next if /pattern/;


See what Deparse makes of it:

$ perl -MO=Deparse,-x7 -e 'next if /pattern/'

/pattern/ and next;
 
xhoster

Anno Siegel said:
But that isn't specific to using $_. Any loop variable will have that
property.

Except that non $_ loop variables are generally going to be lexical, and
thus other functions can't get their hands on them unless you specifically
pass it to them.
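
A short sketch of this point, using a made-up sub named clobber that assigns to the global $_: because the loop variable is a lexical, the sub has no way to reach the array element.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @foo = (1 .. 5);

# $elem is lexical to the loop; code outside the loop body cannot see it.
for my $elem (@foo) {
    clobber() if $elem == 3;
}

print "Foo: @foo\n";

# Hypothetical badly behaved sub: it overwrites the global $_.
sub clobber {
    $_ = -1;
}
```

The array is printed unchanged. Only the top-level $_ is affected, and nothing in this program depends on it.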

Xho
 
xhoster

it_says_BALLS_on_your_forehead said:
has anyone been burned by using:

for ( @list ) {
...blah blah
}

rather than

for my $elem ( @list ) {
...blah blah
}

...?

Sure.

it_says_BALLS_on_your_forehead said:
it's probably better to use the latter loop, since it's more
self-documenting, and there's an explicit 'lexicalization' of the list
element to be used in the loop, but many times I like to take advantage
of the shortcuts that come with the $_ variable. regex matching, s///,
print, (others?).

A lot of others, notably chomp.

it_says_BALLS_on_your_forehead said:
i believe for the vast majority of use cases, it doesn't matter which
loop is employed ( other than the slightly decreased readability of the
first style ), but i was wondering if people could relate experiences
where use of the first style of for loops has bitten them.

As others have mentioned, if you call code which fails to localize $_ then
they can change your $_. So I tend not to use the implicit $_ in loops
where I call external code.

Also, I've often had to go back and lexicalize the foreach iterator when
I've added a map or grep in the loop, in which I need to access both the
map/grep implicit $_ and the formerly implicit $_ of the foreach.
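
The map/grep situation can be sketched as follows. The word and letter lists are invented for the example; the point is only that a named outer iterator stays visible inside the grep block, where $_ means something else.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @words   = qw(apple banana cherry);
my @letters = qw(a e);
my @lines;

for my $letter (@letters) {
    # Inside the grep block $_ is the current word; $letter is still
    # the current element of the outer loop.
    my @hits = grep { index($_, $letter) >= 0 } @words;
    push @lines, "$letter: @hits";
}

print "$_\n" for @lines;
```

Had the outer loop used the implicit $_, the grep block would have hidden it, and the letter would first have to be copied into a named variable anyway.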

Generally, if I still need to access the iterator variable more than 4
lines into the loop, I use an explicit iterator.

it_says_BALLS_on_your_forehead said:
i have the
same question about:

while ( <$fh> ) {
..blah
}

This is OK:

while (<$fh>) {
chomp;
/(\D+)0*(\d+)/ or die;
my ($foo,$bar)=($1,$2);
### 25 more lines, none of which need the current contents of $_
}

However, if those 25 later lines still want access to $_, then I would use
the explicit form, like below:
while ( my $line = <$fh> ) {
..blah
}

Xho
 
Anno Siegel

xhoster said:
Except that non $_ loop variables are generally going to be lexical, and
thus other functions can't get their hands on them unless you specifically
pass it to them.

Yes, my remark tended to gloss over that, didn't it.

Again, the accessibility of $_ is that of any other package variable,
but it is much more likely to be changed by accident.

It is probably best not to assign to $_ directly, except on top level.
In a general-purpose sub, there is usually a way to alias $_ to what
you want it to be, sometimes a one-shot for loop. The one occasion
where that doesn't help is when you need a *copy*, for instance a
changeable copy of a literal. Then aliasing won't do, you want an
actual assignment.
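
One way to read the "one-shot for loop" idea is sketched below. The sub name trimmed and the sample strings are made up for the example. The argument is copied into a lexical first, and the loop then aliases $_ to that copy, so the s/// shortcuts work without assigning to $_ directly.

```perl
#!/usr/bin/perl
use strict;
use warnings;

sub trimmed {
    my ($text) = @_;      # a real copy, so the caller's data is untouched
    for ($text) {         # one-shot loop: $_ is an alias for $text here
        s/^\s+//;         # strip leading whitespace
        s/\s+$//;         # strip trailing whitespace
    }
    return $text;
}

$_ = "caller's value";
my $clean = trimmed("  padded  ");

print "[$clean]\n";
print "$_\n";
```

Because foreach localizes its aliasing, the caller's $_ still holds its old value after the call.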

Anno
 
Tad McClellan

Bifco said:
I tend to get burned mostly by the way perl has an "array" context
and a scalar context.


Stand by for more burnment if you think that "array context"
is the same as "list context". :)

I'll assume that you really meant "list context" there.

Bifco said:
Have to pause and think when you read
something like that.


You read something like that many times a day, and deal with it
just fine without really thinking about it.

But you do that in a natural language, it is unexpected in
a programming language.

You just need to get used to doing it in a programming
language, err, context. :)

See:

Bifco said:
though I generally say

while( $_ = $fh->getline() ) { .... }

just to be absolutely positively sure it's a scalar context.


You can be absolutely positively sure that all booleans are
in a scalar context.

That makes sense if you think about it for a bit.

You have to somehow get to a single value, so that you can know
whether it is one of the 4 "false" values or not.
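
The claim about boolean tests can be checked with wantarray. In this sketch the sub probe and the variable names are invented for the example. The sub records the context it was called in, and the last two assignments show the same array giving different results in scalar and list context.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $context;

# Records the calling context: wantarray is true in list context,
# false but defined in scalar context, and undef in void context.
sub probe {
    $context = wantarray ? 'list' : defined(wantarray) ? 'scalar' : 'void';
    return 1;
}

my @results;

my @list = probe();
push @results, "assignment to array: $context";

if (probe()) {
    push @results, "if condition: $context";
}

my @items   = (4, 5, 6);
my $count   = @items;    # scalar context: number of elements
my ($first) = @items;    # list context: first element
push @results, "count=$count first=$first";

print "$_\n" for @results;
```

The if condition reports scalar context, which is why the readline in a while condition never needs extra help to be evaluated as a scalar.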
 
Tad McClellan

it_says_BALLS_on_your_forehead said:
has anyone been burned by using:

for ( @list ) {
...blah blah
}

rather than

for my $elem ( @list ) {
...blah blah
}

i have the
same question about:

while ( <$fh> ) {
..blah
}


It isn't really the same question because one of those local()izes
the variable and one of them doesn't.

While you can get bitten by either one, it is a lot easier to
get bitten by while() than by foreach().
 
