Question on grep and reading from file

googler

Inside my Perl script, I had to check if a particular pattern appears
in a certain file or not (only a yes/no answer). I did it as below:
@matching_lines = grep { /$srchpat/ } <MYFILE>;
print "Pattern found\n" if ($#matching_lines != -1);

I was wondering if there is a more efficient way to do this. Is it
possible to use the Unix "grep" command to do this inside my script?
If so, how? Will that be more efficient (faster)?

I have another question. Is there a way to read a particular line in a
file when I know the line number (without using a loop and reading
each line at a time)? I guess the below code would work.
@lines = <MYFILE>;
$myline = $lines[$linenum-1];
But this will read the entire file into the array @lines and can take
up a lot of memory if the file is huge. Is there a more efficient
solution?

Thanks.
 
Mirco Wahab

googler said:
Inside my Perl script, I had to check if a particular pattern appears
in a certain file or not (only a yes/no answer). I did it as below:
@matching_lines = grep { /$srchpat/ } <MYFILE>;
print "Pattern found\n" if ($#matching_lines != -1);

I was wondering if there is a more efficient way to do this. Is it
possible to use the Unix "grep" command to do this inside my script?
If so, how? Will that be more efficient (faster)?

In almost all cases, a sequential approach will be *much*
faster on *large* files (>= 100MB), like

<pseudo>
...
my @matching_lines;
while ( <MYFILE> ) {
    push @matching_lines, $_
        if /$srchpat/;
}
print "Pattern found\n"
    if scalar @matching_lines;
...
I have another question. Is there a way to read a particular line in a
file when I know the line number (without using a loop and reading
each line at a time)? I guess the below code would work.
@lines = <MYFILE>;
$myline = $lines[$linenum-1];
But this will read the entire file into the array @lines and can take
up a lot of memory if the file is huge. Is there a more efficient
solution?

No, not really. Besides the 'tie' approach (which is sometimes
too slow), you can always read large files quickly 'record by record'
(e.g. line by line) and check the line number via "$." ...
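
A minimal sketch of the record-by-record idea, assuming 1-based line numbers. The helper name read_line_number is made up for illustration. It stops reading as soon as the wanted line is reached, so only one line is held in memory at a time:

```perl
use strict;
use warnings;

# Hypothetical helper: return line $linenum (1-based) of $filename,
# or undef if the file has fewer lines than that.
sub read_line_number {
    my ($filename, $linenum) = @_;
    open my $fh, '<', $filename or die "Cannot open $filename: $!";
    my $found;
    while (my $line = <$fh>) {
        # $. holds the number of the line just read from $fh
        if ($. == $linenum) {
            $found = $line;
            last;    # stop here; the rest of the file is not needed
        }
    }
    close $fh;
    return $found;
}
```

This still reads everything up to the requested line, so the cost grows with the line number, but memory use stays flat.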

Regards

M.
 
xhoster

googler said:
Inside my Perl script, I had to check if a particular pattern appears
in a certain file or not (only a yes/no answer). I did it as below:
@matching_lines = grep { /$srchpat/ } <MYFILE>;

This reads the entire file, even if the match is in the first line.
(Potentially worse, it reads the entire file into memory at once, as perl
is currently implemented.)
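
For a plain yes/no answer, a loop that stops at the first match avoids both problems. A rough sketch, where the helper name file_matches is invented for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: true if any line of $filename matches $srchpat.
sub file_matches {
    my ($filename, $srchpat) = @_;
    open my $fh, '<', $filename or die "Cannot open $filename: $!";
    my $found = 0;
    while (my $line = <$fh>) {
        if ($line =~ /$srchpat/) {
            $found = 1;
            last;    # the first match already answers the question
        }
    }
    close $fh;
    return $found;
}
```

It could then be used as: print "Pattern found\n" if file_matches($filename, $srchpat);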
print "Pattern found\n" if ($#matching_lines != -1);

I was wondering if there is a more efficient way to do this. Is it
possible to use the Unix "grep" command to do this inside my script?

Sure. There are many ways. The simplest, if $srchpat and $filename don't
require protecting from shell interpretation, and $srchpat either doesn't
have special characters or only has ones that mean the same thing between
Perl and grep, would be something like this:

my $result=`grep -l $srchpat $filename`;
If so, how? Will that be more efficient (faster)?

It would have more overhead, but will probably run faster once it gets
running (provided $srchpat is fairly simple).
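
If the pattern or filename might contain characters the shell would interpret, one option is the list form of system(), which does not go through the shell. A sketch under these assumptions: a POSIX grep is on the PATH, and the pattern is written in grep's regex syntax rather than Perl's. The helper name grep_matches is invented for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: ask the external grep whether $srchpat occurs in $filename.
# The list form of system() bypasses the shell, so no quoting is needed.
sub grep_matches {
    my ($filename, $srchpat) = @_;
    # -q: print nothing, only set the exit status
    # -e: the next argument is the pattern; --: end of options
    system('grep', '-q', '-e', $srchpat, '--', $filename);
    die "Could not run grep: $!" if $? == -1;
    die "grep was killed by a signal" if $? & 127;
    my $exit = $? >> 8;
    die "grep failed with exit status $exit" if $exit > 1;
    return $exit == 0;    # 0 = match found, 1 = no match
}
```

With -q, grep is allowed to exit at the first match, so it does not need to scan the rest of the file.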
I have another question. Is there a way to read a particular line in a
file when I know the line number (without using a loop and reading
each line at a time)? I guess the below code would work.
@lines = <MYFILE>;
$myline = $lines[$linenum-1];
But this will read the entire file into the array @lines and can take
up a lot of memory if the file is huge. Is there a more efficient
solution?

Unless you know how long each line is, or have otherwise pre-computed some
kind of index into the file, you need to read the entire file at least up
to the desired line and count newlines, either implicitly or explicitly.
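
A pre-computed index could look roughly like the sketch below: one pass records the byte offset where each line starts, and later lookups seek straight to it. The helper names build_line_index and line_via_index are invented for illustration, and the index is only valid as long as the file does not change:

```perl
use strict;
use warnings;

# Hypothetical helper: one pass over the file, recording the byte offset
# at which each line starts. $offsets->[0] is the start of line 1.
sub build_line_index {
    my ($filename) = @_;
    open my $fh, '<', $filename or die "Cannot open $filename: $!";
    my @offsets = (0);
    while (my $line = <$fh>) {
        push @offsets, tell($fh);    # position just after the line read
    }
    close $fh;
    pop @offsets;    # the last entry is end-of-file, not the start of a line
    return \@offsets;
}

# Hypothetical helper: fetch line $linenum (1-based) using the index.
sub line_via_index {
    my ($filename, $offsets, $linenum) = @_;
    return if $linenum < 1 or $linenum > @$offsets;
    open my $fh, '<', $filename or die "Cannot open $filename: $!";
    seek($fh, $offsets->[$linenum - 1], 0) or die "seek failed: $!";   # 0 = SEEK_SET
    my $line = <$fh>;
    close $fh;
    return $line;
}
```

Building the index still costs one full read, so this only pays off when many lines are fetched from the same file.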

Xho
 
googler

I have another question. Is there a way to read a particular line in a
file when I know the line number (without using a loop and reading
each line at a time)? I guess the below code would work.
@lines = <MYFILE>;
$myline = $lines[$linenum-1];
But this will read the entire file into the array @lines and can take
up a lot of memory if the file is huge. Is there a more efficient
solution?

Unless you know how long each line is, or have otherwise pre-computed some
kind of index into the file, you need to read the entire file at least up
to the desired line and count newlines, either implicitly or explicitly.

OK, say I know how long each line is. How can it help in reading the
n-th line from the file directly? Can you please explain. Thanks.
 
Peter Makholm

googler said:
OK, say I know how long each line is. How can it help in reading the
n-th line from the file directly? Can you please explain. Thanks.

If you know that each line, including newline, is $x bytes long, you
can read line $n by doing something like:

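use Fcntl qw(:seek);

# '<' opens for reading ('>' would truncate the file)
open FH, '<', $filename or die "Cannot open $filename: $!";
# lines are counted from 0 here; use ($n-1)*$x for 1-based line numbers
seek FH, $n*$x, SEEK_SET;
$_ = <FH>;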

Note that the length is in bytes, not characters. So doing this on a
UTF-8 encoded file (or any other variable-length encoding) will not
work as expected.

//Makholm
 
xhoster

googler said:
I have another question. Is there a way to read a particular line in
a file when I know the line number (without using a loop and reading
each line at a time)? I guess the below code would work.
@lines = <MYFILE>;
$myline = $lines[$linenum-1];
But this will read the entire file into the array @lines and can take
up a lot of memory if the file is huge. Is there a more efficient
solution?

Unless you know how long each line is, or have otherwise pre-computed
some kind of index into the file, you need to read the entire file at
least up to the desired line and count newlines, either implicitly or
explicitly.

OK, say I know how long each line is. How can it help in reading the
n-th line from the file directly? Can you please explain. Thanks.

Compute where in the file the desired line starts, and use "seek" to
jump to it. See perldoc -f seek.

Xho
 
