extract parts of file - newbie

jason

Hello. I'm new to Perl and trying to figure out if there is a better way to do the
following (in ActiveState Perl under Windows 2000):

I have this DOS text file with about 20,000 lines. In the simple
example below I can extract lines that contain a particular string.

$db = "work.txt";
open (FILE,"$db");
@LINES=<FILE>;
close(FILE);
$SIZE=@LINES;
print $SIZE,"\n";
for ($i=0;$i<=$SIZE;$i++)
{
$_=$LINES[$i];
if (/motion/i)
{print "$_";}
}


How can I extract:

1. 5 lines before and after the string
2. Columns positions 5-15 (for all selected)
3. Limit selection to rows 5000-7000
4. The last 5 lines of the entire file

Many Thanks for any help or information!!
 
Gunnar Hjalmarsson

: In the simple example below I can extract lines that contain a
: particular string.
:
: $db = "work.txt";
: open (FILE,"$db");
: @LINES=<FILE>;
: close(FILE);
: $SIZE=@LINES;
: print $SIZE,"\n";
: for ($i=0;$i<=$SIZE;$i++)
: {
: $_=$LINES[$i];
: if (/motion/i)
: {print "$_";}
: }
:
: How can I extract:
:
: 1. 5 lines before and after the string
: 2. Columns positions 5-15 (for all selected)
: 3. Limit selection to rows 5000-7000
: 4. The last 5 lines of the entire file

By using your imagination and possibly learning a little more Perl.

What have you tried so far? What difficulties did you encounter that
you weren't able to solve with the help of the documentation and the FAQ?
 
John W. Krahn

: Hello. I'm new to Perl and trying to figure out if there is a better way to do the
: following (in ActiveState Perl under Windows 2000):
:
: I have this DOS text file with about 20,000 lines. In the simple
: example below I can extract lines that contain a particular string.
:
: $db = "work.txt";
: open (FILE,"$db");
: @LINES=<FILE>;
: close(FILE);
: $SIZE=@LINES;
: print $SIZE,"\n";
: for ($i=0;$i<=$SIZE;$i++)
: {
: $_=$LINES[$i];
: if (/motion/i)
: {print "$_";}
: }

A more Perl-ish version of that would be:

use warnings;
use strict;

my $db = 'work.txt';
open FILE, $db or die "Cannot open $db: $!";
my @lines = <FILE>;
close FILE;
print @lines . "\n";
for ( @lines )
{
print if /motion/i;
}

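: How can I extract:
:
: 1. 5 lines before and after the string

for my $i ( 0 .. $#lines )
{
    next unless $lines[$i] =~ /motion/i;    # test the current line, not $_
    my $first = $i < 5 ? 0 : $i - 5;        # do not wrap around at the start
    my $last  = $i + 5 > $#lines ? $#lines : $i + 5;    # or run past the end
    print @lines[ $first .. $last ];
}

: 2. Columns positions 5-15 (for all selected)

for ( @lines )
{
    print substr $_, 4, 11 if /motion/i;
}

: 3. Limit selection to rows 5000-7000

for ( @lines[ 4999 .. 6999 ] )
{
    print if /motion/i;
}

: 4. The last 5 lines of the entire file

for ( @lines[ $#lines - 4 .. $#lines ] )    # five elements, not six
{
    print if /motion/i;
}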



John
 
David K. Wall

: I have this DOS text file with about 20,000 lines.
[snip]

: How can I extract:
:
: 1. 5 lines before and after the string

Search the Google usenet archives; you'll find a number of solutions. In
particular, see the thread starting with a post by Tom Christiansen,
message ID (e-mail address removed). (I searched for the words "before
after lines match" (but not as a phrase), and TC's thread was the first
match. Lots of other hits, too.)

: 2. Columns positions 5-15 (for all selected)

perldoc -f substr

: 3. Limit selection to rows 5000-7000

Check out $. in perlvar and read the section on range operators in perlop.
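
Those two pointers might be combined roughly as follows. This is only a minimal sketch, not code from the thread: the helper name match_in_rows and the in-memory sample data are made up for illustration. It compares $. against the row limits explicitly; perlop describes the shorthand, where a scalar-context range with constant operands such as 5000 .. 7000 is compared against $. for you.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines whose line number ($.) lies in
# [$first, $last] and that match $pattern. Reads one line at a time.
sub match_in_rows {
    my ($fh, $first, $last, $pattern) = @_;
    my @hits;
    while ( my $line = <$fh> ) {
        last if $. > $last;     # past the range, nothing more to find
        next if $. < $first;    # not in the range yet
        push @hits, $line if $line =~ $pattern;
    }
    return @hits;
}

# Demo on an in-memory file so the sketch runs without work.txt.
my $text = "alpha\nMotion one\nbeta\nmotion two\ngamma\nmotion three\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my @hits = match_in_rows( $fh, 3, 5, qr/motion/i );
close $fh;
print @hits;    # only "motion two" lies in rows 3-5
```

For the real file, the same call would take a handle opened on work.txt and the limits 5000 and 7000.
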

: 4. The last 5 lines of the entire file

Left as an exercise... :)
 
Tad McClellan

: New to Perl


We can tell that from the code. :)

: open (FILE,"$db");


You should not quote a lone variable.

You should always, yes *always*, check the return value from open():

open(FILE, $db) or die "could not open '$db' $!";

: @LINES=<FILE>;
: close(FILE);
: $SIZE=@LINES;
: print $SIZE,"\n";
: for ($i=0;$i<=$SIZE;$i++)
: {
: $_=$LINES[$i];


Phew!

Don't read it ALL into memory only to process it line-by-line,
just read and process a line at a time.

If you do that, you can replace that whole chunk of code with just this:

: if (/motion/i)
: {print "$_";}
         ^  ^
         ^  ^ more useless quotes, remove them
: }


: How can I extract:
:
: 1. 5 lines before and after the string


Oh. _Now_ you might want them all in an array. :)

foreach my $index ( $i - 5 .. $i + 5 ) {
print $LINES[$index];
}

Or you could use an "array slice" (see perldata.pod):

print @LINES[ $i - 5 .. $i + 5 ];


What do you want to do if the matched line is in the first or
last 5 lines? ...
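
One possible answer to that question is to clamp the slice to the array bounds. The sketch below is an illustration only, not code from the thread: the helper name context and the sample data are made up. Without clamping, a negative start index silently counts from the end of the array, and an end index past the last element yields undef entries.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines around index $i, clamped to the
# array bounds so a match near either end of the file neither wraps
# around (negative indices) nor runs past the last element.
sub context {
    my ($lines, $i, $span) = @_;
    my $first = $i - $span < 0          ? 0          : $i - $span;
    my $last  = $i + $span > $#{$lines} ? $#{$lines} : $i + $span;
    return @{$lines}[ $first .. $last ];
}

# Eight sample lines; a span of 2 keeps the demo output short.
my @LINES = map { sprintf "row %d\n", $_ } 1 .. 8;

print context( \@LINES, 0, 2 );    # match on the first line: rows 1-3
print context( \@LINES, 7, 2 );    # match on the last line: rows 6-8
```

For the question as asked, the call would be context( \@LINES, $i, 5 ).
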

You could still process line-by-line if you maintained a 5-line buffer
of the previous lines.

: 2. Columns positions 5-15 (for all selected)


print substr($_, 4, 11), "\n" if /motion/i;

: 3. Limit selection to rows 5000-7000


my @selected = @LINES[ 4999 .. 6999 ];    # rows 5000-7000 are indices 4999-6999

: 4. The last 5 lines of the entire file


print @LINES[ $#LINES-4 .. $#LINES ];
 
Jay Tilton

(e-mail address removed) wrote:

: Hello. I'm new to Perl and trying to figure out if there is a better way to do the
: following (in ActiveState Perl under Windows 2000):
:
: I have this DOS text file with about 20,000 lines. In the simple
: example below I can extract lines that contain a particular string.
:
: $db = "work.txt";
: open (FILE,"$db");
: @LINES=<FILE>;
: close(FILE);
: $SIZE=@LINES;
: print $SIZE,"\n";
: for ($i=0;$i<=$SIZE;$i++)
: {
: $_=$LINES[$i];
: if (/motion/i)
: {print "$_";}
: }

You've committed several novice mistakes there.
1. Using package variables instead of lexicals.
2. Quoting "$vars"
3. Not checking the return from open() for success.
4. Slurping an entire file to perform line-by-line processing.
5. Iterating across an array's indices instead of iterating across its
elements.

Sequentially processing the records in a file is such a common task that
you should learn the Perlish way of doing it.

my $db = "work.txt";
open (FILE, '<', $db) or die "Cannot open '$db' for read:$!";
while(<FILE>) {
print if /motion/i;
}

: How can I extract:
:
: 1. 5 lines before and after the string

Store the previous five lines in an array. When your program recognizes
the desired record, have it output the contents of this buffer and note
that it should output the next five records.

: 2. Columns positions 5-15 (for all selected)

The substr() function will do that. See perlfunc.
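
A small sketch of the column arithmetic, since substr() counts offsets from 0 while the column positions in the question count from 1. The helper name columns and the sample string are made up for illustration and are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: extract 1-based, inclusive column positions.
# Columns 5-15 become offset 4 (5 - 1) and length 11 (15 - 5 + 1).
sub columns {
    my ($line, $from, $to) = @_;
    chomp $line;
    return '' if length($line) < $from;    # line too short to reach $from
    return substr( $line, $from - 1, $to - $from + 1 );
}

print columns( "0123456789abcdefghij\n", 5, 15 ), "\n";    # prints 456789abcde
```

Returning an empty string for short lines is a design choice made here to avoid the "substr outside of string" warning; whether short lines should be skipped instead depends on the data.
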

: 3. Limit selection to rows 5000-7000

The '..' range operator is imbued with special juju for that purpose.
See perlop.

: 4. The last 5 lines of the entire file

The buffer implemented for requirement 1 can be made to handle that as
well.

Altogether, the program might go like:

#!perl
use warnings;
use strict;
my $db = 'work.txt';
open my $fh, '<', $db or die "Cannot open '$db' for read: $!";
my (@w, $n);
while (<$fh>) {
    chomp;
    # requirement 2: columns 5-15 are offset 4, length 11
    push @w, ( length($_) > 4 ? substr($_, 4, 11) : '' ) . "\n";
    $n = 6
        if 5000 .. 7000 # requirement 3
        and /motion/i;
    if ($n) {
        print @w;
        @w = ();
        $n--;
    }
    else {
        splice @w, 0, -5; # limit window to five previous records
    }
}
print @w; # requirement 4
 
