data manipulation

Bob

I have a log file that I want to manipulate further and extract information
from.

(The actual data is output from the qmail-qread command.)

The format of the file is as follows:

6 Oct 2003 14:01:12 GMT #23456 12345 <[email protected]>
  remote (e-mail address removed)
  done (e-mail address removed)
6 Oct .....

The format is always a date line followed by one or more info lines that are
indented with either \t or " ".

I want to grab each of these "chunks", run a regex against it, and output the
chunk when I get a match.

So, using the data I have, I want to read in a block (3 lines in this case,
but it can be 2 to ???? lines) and then output the whole block if it matches
a regex.

If anyone can recommend some place to start reading, it would be appreciated.
 
Gunnar Hjalmarsson

Bob said:
If anyone can recommend some place to start reading It would be
appreciated.

http://learn.perl.org/

I'm quite sure that you were able to figure that out yourself, though,
and that you actually wanted somebody to write some code, without
having given it a try yourself first. :( I'd be surprised if there
weren't better ways, but this is one possible approach:

#!/usr/bin/perl
use strict;
use warnings;

my ($chunk, @chunks);
open FH, 'logfile' or die $!;
while (<FH>) {
    # an unindented line starts a new chunk
    if (/^\S/) {
        push @chunks, $chunk if $chunk;
        $chunk = '';
    }
    $chunk .= $_;
}
close FH;
push @chunks, $chunk if defined $chunk;    # last chunk; skipped if the file was empty

# print chunks that include the domain fake.com
print grep { /\@fake\.com/ } @chunks;
 
Bob

Thanks for the assistance, I was able to use most of what you offered with
what I already had.

I am new to Perl, and after reading, there are only 2 lines I don't clearly
understand

-> push @chunks, $chunk if $chunk;
-> $chunk = '';

After reading the push function description from "learning Perl" I am
failing to understand exactly what is happening here.


B
 
Tad McClellan

Bob said:


[ Snip a whole bunch of lines not needed for context. ]

there are only 2 lines I don't clearly
understand

-> push @chunks, $chunk if $chunk;


Or, if you like/understand it better written this way:

if ( $chunk ) {
    push @chunks, $chunk;
}

-> $chunk = '';

After reading the push function description from "learning Perl" I am
failing to understand exactly what is happening here.


If $chunk is a true value, then the value of $chunk gets tacked onto
the end of the @chunks array (i.e. @chunks gets one element larger).

The second line clears out the accumulator for the next go-round.
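
To watch the accumulator at work, here is a small trace sketch. It is not from the posted script: the sample lines are made up, and the printf is only there to show the state after each line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample lines; an unindented line starts a new chunk.
my @lines = ("head one\n", "  detail a\n", "head two\n", "  detail b\n");

my ($chunk, @chunks);
for my $line (@lines) {
    if ($line =~ /^\S/) {
        # $chunk is still undef at the very first header, so nothing is pushed
        push @chunks, $chunk if $chunk;
        $chunk = '';    # reset the accumulator for the next chunk
    }
    $chunk .= $line;
    printf "stored chunks: %d, accumulator holds %d chars\n",
        scalar @chunks, length $chunk;
}
push @chunks, $chunk if defined $chunk;    # flush the last chunk
print "total chunks: ", scalar @chunks, "\n";
```

The first header line pushes nothing because the accumulator is still empty; every later header line pushes the chunk that was just completed.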
 
Gunnar Hjalmarsson

Bob said:
-> push @chunks, $chunk if $chunk;
-> $chunk = '';

After reading the push function description from "learning Perl" I
am failing to understand exactly what is happening here.

It adds an element to the array @chunks with what's been stored in
$chunk from previous iterations, and then it empties $chunk.

The line

push @chunks, $chunk if $chunk;

can also be written

if ($chunk) {
    push @chunks, $chunk;
}

My suggestion means that the whole log file ends up in memory in the
array @chunks. I thought that made the code easier to understand, but
it should be noted that if the log file is really big, it's not a good
approach. In that case, you'd better handle each $chunk inside the loop
as soon as it is complete, and refrain from storing the whole file in
an array.
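
One possible shape for that, as a rough sketch only: the helper name each_chunk, the callback interface, and the in-memory sample data are all assumptions made for illustration, not something from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read chunks from $fh and call $callback on each complete chunk, so that
# only one chunk is held in memory at a time.
sub each_chunk {
    my ($fh, $callback) = @_;
    my $chunk;
    while (my $line = <$fh>) {
        if ($line =~ /^\S/ and defined $chunk) {
            $callback->($chunk);    # the previous chunk is complete
            $chunk = '';
        }
        $chunk .= $line;
    }
    $callback->($chunk) if defined $chunk;    # last chunk in the file
    return;
}

# Demo on an in-memory filehandle with made-up data and a made-up domain.
my $sample = "6 Oct 2003 14:01:12 GMT #1 100 <a\@fake.com>\n"
           . "  remote b\@example.org\n"
           . "6 Oct 2003 14:02:00 GMT #2 200 <c\@example.org>\n"
           . "  done d\@example.org\n";
open my $fh, '<', \$sample or die "cannot open in-memory sample: $!";
my @matched;
each_chunk($fh, sub { push @matched, $_[0] if $_[0] =~ /\@fake\.com/ });
close $fh;
print @matched;
```

For a real log, the callback could print the matching chunk directly instead of collecting it, which keeps memory use flat however large the file is.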

HTH
 
Bob

Thanks for the explanation. I already understood the implication of the
whole file being in memory. In this case it is not a problem: the log is
generally only 3-5Meg and never over 50Meg.

For those interested, I have included the full script below

Regards,

B

#!/usr/bin/perl
#
# NAME qmail-qreadto
# purpose - look thru the qmail-qread logs for particular messages
#
use strict;
use warnings;

if (@ARGV == "0" ) {
    print "\n\tqmail-qreadto \{ email to search for\} \n";
    print "\t\texample\: qmail-qreadto me\@example.net \n\n";
    exit 0
}

my ($Log, @Logs);
my $file1 = "/var/log/qmail-qread";
my $file2 = "/var/log/qmail-qread1";

if (-e $file2 ) {
    open (MyFILE, $file2);
} else {
    open (MyFILE, $file1);
}

while (<MyFILE>) {
    if (/^\d{1,2}/) {
        push @Logs, $Log if $Log;
        $Log = '';
    }
    $Log .= $_;
}

close MyFILE;

push @Logs, $Log;

print grep {/@ARGV/} @Logs;

exit 0;
 
Gunnar Hjalmarsson

Bob said:
Thanks for the explanation, And I already understood the
implication of the whole file being in memory. In this case, it is
not a problem, the log is generally only 3-5Meg but never over
50Meg.

I'm glad to be able to help, Bob. My apologies for being presumptuous in
my first reply - you were obviously ready to put more effort into it
than I first thought. ;-)

Bob said:
For those interested, I have included the full script below

Looks good to me.
 
Tad McClellan

[ snip yet another full-quote. Please learn to quote followups properly ]

if (@ARGV == "0" ) {


Why use a numeric operator and then force stringification?

If you want to test a number, use a number and a numeric operator:

if (@ARGV == 0 ) {

If you want to test a string, use a string and a string operator

if (@ARGV eq "0" ) {

print "\n\tqmail-qreadto \{ email to search for\} \n";
                         ^                     ^
                         ^                     ^

A useless use of backslashes.

print "\t\texample\: qmail-qreadto me\@example.net \n\n";
exit 0


You should exit with a *non* zero value when an error occurs.

open (MyFILE, $file2);


You should always, yes *always*, check the return value from open():

open(MyFILE, $file2) or die "could not open '$file2' $!";
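
Putting the last two points together, here is a hedged sketch. The run wrapper and the status values 1 and 2 are assumptions chosen for illustration; it returns the status instead of calling exit so that the demo lines can show both failure cases.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical wrapper: returns the exit status instead of calling exit,
# so the logic can be exercised without ending the program.
sub run {
    my ($logfile, @terms) = @_;
    unless (@terms) {
        print STDERR "usage: qmail-qreadto { email to search for }\n";
        return 1;    # non-zero: the script was called incorrectly
    }
    my $fh;
    unless (open $fh, '<', $logfile) {
        print STDERR "could not open '$logfile': $!\n";
        return 2;    # non-zero: the log could not be read
    }
    close $fh;
    return 0;        # zero only when everything worked
}

# A real script would finish with something like: exit run($path, @ARGV);
print "no search term: status ", run('/nonexistent/qmail-qread'), "\n";
print "unreadable log: status ",
    run('/nonexistent/qmail-qread', 'me@example.net'), "\n";
```

A calling shell script or cron job can then tell success from failure by checking the exit status alone.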
 
nobull

Bob said:
Ok, what I am trying to test for is the existence of an argument, - if
@ARGV is NULL .... do something..... what I use works, but is there a
better way of testing for a NULL than a numerical comparison?

Yep, simple boolean context.

unless (@ARGV) {

Bob said:
suggestions.... In this case, where the array is empty, a numeric test
returns "0", so the test made sense.

No, I cannot see any reason why it ever makes sense to use a string
literal in a numeric context. (Of course the compiler optimises the
string-to-number conversions out).

Bob said:
In this case, I know that 1 of the 2 files will always exist, so I did not
"or die" here, but for the sake of consistency I will add it.

No it's not just for consistency. It's a good programming habit so
that if something sometime in the future breaks you get a useful
error.
 
John W. Krahn

Bob said:
Thanks for the explanation, And I already understood the implication of the
whole file being in memory. In this case, it is not a problem, the log is
generally only 3-5Meg but never over 50Meg.

For those interested, I have included the full script below

#!/usr/bin/perl
#
# NAME qmail-qreadto
# purpose - look thru the qmail-qread logs for particular messages
#
use strict;
use warnings;

if (@ARGV == "0" ) {

That is usually written as:

unless ( @ARGV ) {

    print "\n\tqmail-qreadto \{ email to search for\} \n";
    print "\t\texample\: qmail-qreadto me\@example.net \n\n";
    exit 0
}

my ($Log, @Logs);
my $file1 = "/var/log/qmail-qread";
my $file2 = "/var/log/qmail-qread1";

if (-e $file2 ) {
    open (MyFILE, $file2);
} else {
    open (MyFILE, $file1);
}

What happens if the file is unlinked or moved between the time you stat
the file and the time you open the file? What happens if neither $file1
nor $file2 exists? This is better:

open MyFILE, $file2
    or open MyFILE, $file1
    or die "Cannot open $file2 or $file1: $!";

while (<MyFILE>) {
    if (/^\d{1,2}/) {
        push @Logs, $Log if $Log;
        $Log = '';
    }
    $Log .= $_;
}

close MyFILE;

push @Logs, $Log;

print grep {/@ARGV/} @Logs;

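The match operator m// interpolates its contents just like a double
quoted string, so @ARGV becomes its elements joined with spaces, and if
you have more than one element in @ARGV this will not work. You probably
want the following, where \Q also keeps characters such as "." in the
search term from acting as regex metacharacters: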

print grep /\Q$ARGV[0]/, @Logs;

Or you could test the contents of $Log before you push it into @Logs:

push @Logs, $Log if $Log =~ /\Q$ARGV[0]/;
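
To see what \Q changes, here is a small sketch. The helper matching_chunks and the two sample chunks are made up for illustration; the second chunk differs from the search term only where the term has a dot.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the chunks that contain $term as a literal substring.
sub matching_chunks {
    my ($term, @chunks) = @_;
    return grep { /\Q$term\E/ } @chunks;
}

# Made-up chunks and addresses, for illustration only.
my @logs = (
    "6 Oct 2003 14:01:12 GMT #1 100 <x\@example.net>\n  remote me\@example.net\n",
    "6 Oct 2003 14:02:00 GMT #2 200 <y\@example.net>\n  remote me\@exampleXnet\n",
);
my $term = 'me@example.net';

my @loose  = grep { /$term/ } @logs;           # '.' matches any character
my @strict = matching_chunks($term, @logs);    # '.' matches only a dot

printf "without \\Q: %d chunk(s), with \\Q: %d chunk(s)\n",
    scalar @loose, scalar @strict;
```

Without \Q the unescaped dot lets the second chunk through as well; with \Q only the chunk containing the exact address is kept.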



John
 
