Uri Guttman replied:
> and it is not commonly used as well. so why did you
> even bother to mention it?
Since you asked, I'll tell you:
Have you ever tried teaching Perl to someone who just didn't have a
good grasp of regular expressions? To someone who sort of understood
them, but couldn't get his mind around the fact that the regular
expression "[fee|fie|foe]" is identical to "[feio|]"? Or that, no
matter how many times you tell him (and get him to agree) that "*"
means "match zero or more occurrences of", he still thinks it's wrong
that ".*" can successfully match an empty string?
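Both points can be checked directly. The following is a minimal sketch, not part of the original discussion; the names $verbose_class, $minimal_class, classes_agree and dot_star_matches are made up for illustration.

```perl
use strict;
use warnings;

# Inside [...] the '|' is a literal pipe and repeated letters are redundant,
# so both classes match exactly one character from the set f, e, i, o, |.
my $verbose_class = qr/\A[fee|fie|foe]\z/;
my $minimal_class = qr/\A[feio|]\z/;

# Returns 1 if both classes give the same verdict on $text, 0 otherwise.
sub classes_agree {
    my ($text) = @_;
    my $first  = $text =~ $verbose_class ? 1 : 0;
    my $second = $text =~ $minimal_class ? 1 : 0;
    return $first == $second ? 1 : 0;
}

# Returns 1 if /.*/ matches $text. "Zero or more" includes zero,
# so this succeeds even when $text is the empty string.
sub dot_star_matches {
    my ($text) = @_;
    return $text =~ /.*/ ? 1 : 0;
}

for my $text ('f', 'e', 'i', 'o', '|', 'x', 'fee') {
    printf "%-6s classes agree: %d\n", "'$text'", classes_agree($text);
}
print "empty string matches .*: ", dot_star_matches(''), "\n";
```

Note that 'fee' is rejected by both classes here: a character class matches a single character, so the anchored pattern cannot match a three-letter word.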
I've dealt with people like this. And I've found that, in many
cases, instead of spending five or ten minutes explaining to them why
the use of $`, $&, and $' is a bad idea and making them jump through
hoops to avoid their use (hoops for them, not for me), it's best
to teach them the simpler solution first and let them learn that. The
milliseconds of run time they lose by running their script with the
"poorer" solution are more than made up for by the time they save: time
otherwise lost writing code in a way they don't understand very well,
finding and correcting the bugs that may result, and sitting through
an explanation of how to "do it right."
I personally think that the line:
$string = $' if "abc=xyz" =~ m/=/;
is easier to read and understand than the line:
$string = $1 if "abc=xyz" =~ m/=(.*)$/;
I have a feeling that you might disagree. Whatever the case, I've
found that, even when a beginning Perl programmer understands each of
the symbols in the pattern-match "m/=(.*)$/", he can still have a
difficult time putting them all together to deduce the purpose of the
entire match. In those cases, I've found that it's good to start out
simple (like "m/=/") and then expand to a more complex explanation if
necessary.
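For readers who want to compare the two forms side by side, here is a small sketch. The helper names after_equals_postmatch and after_equals_capture are made up for illustration. It also shows one case where the two lines are not strictly interchangeable: when the string ends in a newline, $' keeps the newline, while (.*)$ stops before it.

```perl
use strict;
use warnings;

# Text after the first '=' using the postmatch variable $'.
sub after_equals_postmatch {
    my ($text) = @_;
    return undef unless $text =~ m/=/;
    my $rest = $';    # copy the value before another match can change it
    return $rest;
}

# Text after the first '=' using a capture group.
sub after_equals_capture {
    my ($text) = @_;
    return undef unless $text =~ m/=(.*)$/;
    return $1;
}

for my $text ("abc=xyz", "a=b=c", "no equals sign") {
    my $post    = after_equals_postmatch($text);
    my $capture = after_equals_capture($text);
    printf "%-16s postmatch=%-8s capture=%s\n",
        $text,
        defined $post    ? "'$post'"    : 'undef',
        defined $capture ? "'$capture'" : 'undef';
}

# With a trailing newline the results differ: '.' does not match "\n"
# and '$' can match just before a final newline.
my $with_newline = "abc=xyz\n";
print "lengths with trailing newline: ",
    length(after_equals_postmatch($with_newline)), " vs ",
    length(after_equals_capture($with_newline)), "\n";
```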
As for the performance penalty of using $`, $&, and $', I believe
that it's not as bad as most people think. The penalty does not grow
as O(N^2). I tend to think (but I could be wrong) that it's closer to
O(N), which isn't really all that bad in the grand scheme of things.
You may save milliseconds running your script if you remove all
instances of $`, $&, and $', but that's not going to remove any
bottleneck that's taking too much processor time. In fact, I'd be
surprised if you saved more than ten seconds by running a script (with
$`, $&, and $' removed) repeatedly for one whole day straight. Nor am
I aware of anyone who has ever removed $`, $&, and $' from his scripts
and found that the scripts ran noticeably faster. (If there's evidence
to the contrary, I'd be interested in knowing about it.)
And I've wondered: just how much IS the performance penalty,
anyway? I decided to perform a Benchmark test:
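#!/usr/bin/perl
use strict;
use warnings;
use Benchmark;    # exports timethese() by default
# Number of times each snippet is run.
my $count = 1e7;
# The snippets are plain strings; Benchmark evals them itself.
my $bad = q!$string = $' if "abc=xyz" =~ m/=/!;
my $good = q!$string = $1 if "abc=xyz" =~ m/=(.*)$/!;
# Both snippets are timed inside this one perl process.
timethese($count, {bad => $bad, good => $good});
__END__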
The results surprised me. I got the following output:
Benchmark: timing 10000000 iterations of bad, good...
bad: 18 wallclock secs (16.42 usr + -0.03 sys = 16.39 CPU) @ 610090.90/s
good: 22 wallclock secs (20.73 usr + 0.00 sys = 20.73 CPU) @ 482299.60/s
Apparently, the "bad" code (with $') ran faster than the "good" code!
This seems strange, considering that use of $' is supposed to incur a
performance penalty, not a speedup. Thinking about this, I come to
the conclusion that the penalty falls mainly on the regular
expressions that don't use $`, $&, or $': they spend extra processor
time setting up these variables even though they don't need them.
In other words, if every single regular expression used $', there would
be no real performance penalty at all.
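If that reasoning holds, a test that runs both snippets in one process cannot separate them, because the "good" regular expression would be paying for the $' used by its neighbour. One way to check is to time each snippet in its own perl process. The following is only a sketch under that assumption; time_in_fresh_process is a made-up helper, the timings include interpreter start-up, and the size of the effect may well depend on the perl version.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);
use File::Temp qw(tempfile);

# Writes a tiny script that runs $snippet $iterations times, executes it
# with a fresh perl interpreter, and returns the elapsed wall-clock seconds.
sub time_in_fresh_process {
    my ($snippet, $iterations) = @_;
    $iterations = int $iterations;
    my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
    print {$fh} "my \$string; for (1 .. $iterations) { $snippet; }\n";
    close $fh or die "close failed: $!";
    my $start = time;
    system($^X, $path) == 0 or die "child perl failed: $?";
    return time - $start;
}

# The snippets are only strings here, so this parent script itself
# never compiles a use of $'.
my %snippets = (
    bad  => q{$string = $' if "abc=xyz" =~ m/=/},
    good => q{$string = $1 if "abc=xyz" =~ m/=(.*)$/},
);

my $iterations = 1_000_000;
for my $name (sort keys %snippets) {
    printf "%-4s %.2f s\n", $name,
        time_in_fresh_process($snippets{$name}, $iterations);
}
```

Timing an empty snippet the same way gives a baseline for the start-up cost, which can be subtracted from both figures.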
If I'm right in that reasoning, then that would actually make the
regular expression that uses $' the preferred choice for one-line Perl
scripts.
So it looks like there is a reason to learn $' after all. And even
if your script runs slightly slower as a result of using it, I doubt
you'll ever notice the difference.
> nor did you tell the OP where to learn about regexes or whatever.
It's often difficult to tell the tone of a response in a plain-text
message, but it seems like you're irritated at me for some reason, Uri.
I don't know whether you are irritated at me personally or at just my
post, but if I said something that offended you, then I apologize.
It's just that when I read a message like yours, where I'm first
criticized for putting in too much information and then criticized for
not putting in enough, I get the impression that it's not my post you
are upset with, but me personally.
When I post messages to Usenet, I usually post messages that I think
will be helpful to a user, depending on what level of experience I
think he's at. Obviously, if I think he's a beginner, the information
I post will probably not be helpful to an advanced programmer. Of
course, I could be wrong about what is helpful and what is not, but
that's the beauty of Usenet -- everyone is free to post what they want.
Again, Uri, I'm sorry if I offended you in this post or in an
earlier post.
-- Jean-Luc