newbie dealings with OS "grep -P"

Pedro Graca

My system grep does not support the -P option (perl regexp pattern).
So I thought it would be a good idea to make a perl script for that
(after all, only Perl parses perl, right?)

my first working attempt is grepp1 at the bottom


And then I realized this wouldn't allow me to pipe things to grepp1,
so out comes the second attempt (grepp2)


By now I'm using grepp2 very happily until I needed a case sensitive
match :(
After a tough time, this came out:
$ cat grepp3
#!/usr/bin/perl -nsw
# use strict; ## BEGIN does not like strict

# grepp [-I] pattern [file1 [file2 [...]]]

BEGIN {
    $expr = shift; ## need error checking!!!
    $regex = qr/$expr/i;
    { no warnings; ## -I might not have been specified
        if ($I) { $regex = qr/$expr/; }
    } # end no warnings
    $\ = "\n"; ## nice little trick
}

chomp;
print if /$regex/;


Now I'm working on grepp4, trying to add options from the 'real' grep.
+ -c == only count matches
+ -H == print the file name
+ -n == print the line number
+ -V == print version and exit

but I don't like having the "use strict;" out.

Is it better to manually shift all parameters and verify if they're
options or the pattern or filenames?

Did I do something wrong with the "perl -s", "use strict;" and "BEGIN"
block?




=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
$ cat grepp1
#!/usr/bin/perl -w
use strict;

# grepp pattern file

my $pt = shift; ## need error checking!!!!
my $fn = shift; ## need error checking!!!!

open (my $FILE, '<', $fn) || die $!;
while (my $line = <$FILE>) {
    chomp $line;
    print "$line\n" if $line =~ /$pt/i;
}


$ cat grepp2
#!/usr/bin/perl -w
use strict;

# grepp pattern [file1 [file2 [...]]]

my $pt = shift; ## need error checking!!!!

$\ = "\n"; ## nice little trick :)
while (<>) { ## explicit loop, so no -n on the #! line
    chomp;
    print if /$pt/i;
}
 
Bob Walton

Pedro said:
> My system grep does not support the -P option (perl regexp pattern).
> So I thought it would be a good idea to make a perl script for that
> (after all, only Perl parses perl, right?)
>
> my first working attempt is grepp1 at the bottom
>
> And then I realized this wouldn't allow me to pipe things to grepp1,
> so out comes the second attempt (grepp2)

....

You might check out what the folks at the Unix Reconstruction Project
did with grep:

http://www.perl.com/language/ppt/src/grep/index.html
 
Ben Morrow

Pedro Graca said:
> #!/usr/bin/perl -nsw
> # use strict; ## BEGIN does not like strict

There is no problem with BEGIN and strictures. You do have to make
sure all variables are declared before you use them: if you are going
to use them both inside and outside the BEGIN block then the
declaration must come before the BEGIN block. So here you would need:

> # grepp [-I] pattern [file1 [file2 [...]]]

my ($expr, $regex);

> BEGIN {
>     $expr = shift; ## need error checking!!!
>     $regex = qr/$expr/i;
>     { no warnings; ## -I might not have been specified

Doesn't matter... undef is perfectly false and won't give a warning
here. $expr might, though, if you don't specify an expression.

>         if ($I) { $regex = qr/$expr/; }
>     } # end no warnings
>     $\ = "\n"; ## nice little trick

Another nice trick is -l, which does the chomp as well.

> }
>
> chomp;
> print if /$regex/;
>
> Now I'm working on grepp4, trying to add options from the 'real' grep.
> + -c == only count matches
> + -H == print the file name
> + -n == print the line number
> + -V == print version and exit
>
> but I don't like having the "use strict;" out.

That is a good instinct. Cultivate it... :)

> Is it better to manually shift all parameters and verify if they're
> options or the pattern or filenames?

I would suggest that it would be better to use one of the Getopt
modules from CPAN, probably Getopt::Std.
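
As a rough illustration of the Getopt::Std suggestion, the sketch below
parses the grepp4 flags listed in the original post (plus -I) and then
takes the pattern and the file names from whatever is left.
Getopt::Std ships with the standard Perl distribution, so it should not
need a separate install. The helper name parse_grepp_args and the usage
text are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Std;

# Hypothetical helper (not from the thread): parse a grepp-style
# argument list and return the options, the pattern and the files.
sub parse_grepp_args {
    local @ARGV = @_;            # getopts() works on @ARGV
    my %opt;
    # Boolean flags only: -I -c -H -n -V
    getopts('IcHnV', \%opt)
        or die "usage: grepp [-IcHnV] pattern [file ...]\n";
    my $pattern = shift @ARGV;
    die "grepp: no pattern given\n" unless defined $pattern;
    return (\%opt, $pattern, [@ARGV]);
}

my ($opt, $pattern, $files) = parse_grepp_args(qw(-I -n foo a.txt b.txt));
print "case-sensitive: ", ($opt->{I} ? "yes" : "no"), "\n";
print "line numbers:   ", ($opt->{n} ? "yes" : "no"), "\n";
print "pattern:        $pattern\n";
print "files:          @$files\n";
```

Because the options end up in a lexical hash, nothing has to be
declared with 'our', and an unknown switch is rejected instead of
creating a stray global.
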
> Did I do something wrong with the "perl -s", "use strict;" and "BEGIN"
> block?

If you use -s, then the variables it creates are globals, so you must
declare them with 'our' rather than with 'my'.

Ben
 
Anno Siegel

Pedro Graca said:
> My system grep does not support the -P option (perl regexp pattern).
> So I thought it would be a good idea to make a perl script for that
> (after all, only Perl parses perl, right?)
>
> my first working attempt is grepp1 at the bottom
>
> And then I realized this wouldn't allow me to pipe things to grepp1,
> so out comes the second attempt (grepp2)
>
> By now I'm using grepp2 very happily until I needed a case sensitive
> match :(
> After a tough time, this came out:
> $ cat grepp3
> #!/usr/bin/perl -nsw
> # use strict; ## BEGIN does not like strict

BEGIN and strict have no problems with each other. Just declare your
variables, as strict requires.

> # grepp [-I] pattern [file1 [file2 [...]]]
>
> BEGIN {
>     $expr = shift; ## need error checking!!!
>     $regex = qr/$expr/i;
>     { no warnings; ## -I might not have been specified
>         if ($I) { $regex = qr/$expr/; }
>     } # end no warnings
>     $\ = "\n"; ## nice little trick

....but only needed if you insist on chomp()ing the input.

> }
>
> chomp;
> print if /$regex/;
>
> Now I'm working on grepp4, trying to add options from the 'real' grep.
> + -c == only count matches
> + -H == print the file name
> + -n == print the line number
> + -V == print version and exit
>
> but I don't like having the "use strict;" out.
>
> Is it better to manually shift all parameters and verify if they're
> options or the pattern or filenames?
>
> Did I do something wrong with the "perl -s", "use strict;" and "BEGIN"
> block?

Well, you need to declare your variables with strict. A first attempt
would look like this (I'm leaving the -I flag out, that's not the problem).
Note that this code does *not* work:

#!/usr/local/bin/perl -nsw

my $regex; # we need that outside the BEGIN block
BEGIN {
    my $expr = shift;
    $regex = qr/$expr/;
}

print if /$regex/;

The problem is that -n constructs a loop around the whole script,
which means that the run-time effect of "my $regex" happens each
time around the loop. But the run-time effect of my() is to clear
the variable, so $regex is not set when it is needed.
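
The effect can be reproduced without -n by writing the implied loop out
by hand. This is only a sketch: the two-element list stands in for the
lines -n would read, and @seen is a name invented here to record what
each pass through the loop finds in $regex.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @seen;    # per pass: 1 if $regex holds a value, 0 if it is undef

# Stand-in for the loop that -n wraps around the whole script.
for my $line ("first\n", "second\n") {
    my $regex;                       # executed again on every pass
    BEGIN { $regex = qr/first/ }     # runs once, at compile time
    push @seen, defined $regex ? 1 : 0;
}

print "pass $_: ", ($seen[$_ - 1] ? "set" : "undef"), "\n" for 1 .. @seen;
```

By the second pass the value assigned in BEGIN is gone, which is the
failure described above.
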

The easiest way out is to use a package variable for $regex. So a
working variant of your script would be:

#!/usr/local/bin/perl -nsw
use strict; ## BEGIN does not like strict

# grepp [-I] pattern [file1 [file2 [...]]]

our $regex;
BEGIN {
    my $expr = shift; ## need error checking!!!
    $regex = qr/$expr/i or die "invalid pattern: /$expr/"; # like this? :)
    if (our $I) {
        $regex = qr/$expr/;
    }
}

print if /$regex/;
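
One caveat about the "or die" above: qr// itself dies at run time when
the pattern does not compile, and a compiled regex object is always
true, so the "or die" branch should never be reached. A possible way to
get a friendlier message is to wrap the compilation in eval. The helper
name compile_pattern and the message texts are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: compile a user-supplied pattern, case-insensitive
# unless $case_sensitive is true (the thread's -I switch).
sub compile_pattern {
    my ($expr, $case_sensitive) = @_;
    die "grepp: no pattern given\n" unless defined $expr;
    # eval traps the exception qr// throws for a malformed pattern
    my $regex = eval { $case_sensitive ? qr/$expr/ : qr/$expr/i };
    die "grepp: invalid pattern /$expr/: $@" unless defined $regex;
    return $regex;
}

my $regex = compile_pattern('^per+l');
print "matches\n" if "Perl rocks" =~ $regex;

eval { compile_pattern('(') };       # unbalanced parenthesis
print "rejected: $@" if $@;
```
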

A better solution would be to write the loop yourself and keep
"my $regex" out of it. BEGIN is just a trick to keep some code
out of the loop, but it doesn't work very well in this case.
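
A minimal sketch of that approach, with the loop written out so that
"my $regex" is set once before it. The sub name grep_handle and the
in-memory sample text are assumptions for the example; a real script
would loop over <> or over the files named on the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: print every line of $in that matches $regex to
# $out, and return the number of matching lines.
sub grep_handle {
    my ($regex, $in, $out) = @_;
    my $count = 0;
    while (my $line = <$in>) {
        next unless $line =~ $regex;
        print {$out} $line;
        $count++;
    }
    return $count;
}

my $regex = qr/^a/i;                 # compiled once, outside the loop
my $text  = "Apple\nbanana\napricot\n";
open my $in, '<', \$text or die "cannot open in-memory file: $!";
my $matches = grep_handle($regex, $in, \*STDOUT);
print "$matches matching lines\n";
```

The returned count is the kind of value a -c option could print instead
of the lines themselves.
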

You may want to take a look at the _Perl Cookbook_, which devotes
a whole chapter to grep-like programs. (The first edition did,
I guess the second one still does.)

[more grep's snipped]

Anno
 
Pedro Graca

Ben said:
> Pedro Graca said:
>> #!/usr/bin/perl -nsw
>> # use strict; ## BEGIN does not like strict
>
> There is no problem with BEGIN and strictures. You do have to make
> sure all variables are declared before you use them: if you are going
> to use them both inside and outside the BEGIN block then the
> declaration must come before the BEGIN block. So here you would need:
>
>> # grepp [-I] pattern [file1 [file2 [...]]]
>
> my ($expr, $regex);

Does not work. Have to make them 'our':
our ($expr, $regex, $I);

> Doesn't matter... undef is perfectly false and won't give a warning
> here. $expr might, though, if you don't specify an expression.

Problem was with $I. When I tried
    grepp -I pattern file
there would be no warning, but with
    grepp pattern file
I'd get
    Name "main::I" used only once: possible typo at ...
[ grepp symlinks to the latest greppn ]

Now, with $I declared with 'our' everything is ok :)

.... well, if I try grepp -X it complains "main::X" used only once

hopefully I'll have that dealt with in grepp4 (or maybe grepp57)

> Another nice trick is -l, which does the chomp as well.

two more lines off of my code.

> That is a good instinct. Cultivate it... :)

It's back on, right where it should be!

> I would suggest that it would be better to use one of the Getopt
> modules from CPAN, probably Getopt::Std.

Thank you. That module does not get installed by default, but dealing
with command-line options is such a daunting task I do not feel like
reinventing the wheel for that.

> If you use -s, then the variables it creates are globals, so you must
> declare them with 'our' rather than with 'my'.

Ah! that's it!


Thank you very much for your answers.
 
