data manipulation

Bob · Oct 9, 2003

I have a log file I want to manipulate further, and be able to extract info
from.

(the actual output data is from qmail-qread command )

the format of the file is as below

6 Oct 2003 14:01:12 GMT #23456 12345 <[email protected]>
        remote (e-mail address removed)
        done (e-mail address removed)
6 Oct .....

the format is always a date line, followed by 1 or more info lines, that are
either \t or " " indented.

I want to grab each of these "chunks" and then run a regex and output them
when I get a match.

So, using the data I have, I want to read in the chunk (3 lines in this case,
but it can be 2 to ???? lines) and then output the whole block if I match a
regex.

If anyone can recommend some place to start reading, it would be appreciated.

Gunnar Hjalmarsson · Oct 9, 2003

http://learn.perl.org/

I'm quite sure that you were able to figure that out yourself, though,
and that you actually wanted somebody to write some code, without
having given it a try yourself first. I'd be surprised if there
weren't better ways, but this is one possible approach:

#!/usr/bin/perl
use strict;
use warnings;

my ($chunk, @chunks);
open FH, 'logfile' or die $!;
while (<FH>) {
    if (/^\S/) {
        push @chunks, $chunk if $chunk;
        $chunk = '';
    }
    $chunk .= $_;
}
close FH;
push @chunks, $chunk;

# print chunks that include the domain fake.com
print grep { /\@fake\.com/ } @chunks;

Bob · Oct 10, 2003

Thanks for the assistance, I was able to use most of what you offered with
what I already had.

I am new to Perl, and after reading, there are only 2 lines I don't clearly
understand

-> push @chunks, $chunk if $chunk;
-> $chunk = '';

After reading the push function description from "Learning Perl" I am
failing to understand exactly what is happening here.

B

Tad McClellan · Oct 10, 2003

Bob said:

[ Snip a whole bunch of lines not needed for context. ]

there are only 2 lines I don't clearly
understand

-> push @chunks, $chunk if $chunk;

Or, if you like/understand it better written this way:

if ( $chunk ) {
    push @chunks, $chunk;
}

-> $chunk = '';

After reading the push function description from "Learning Perl" I am
failing to understand exactly what is happening here.

If $chunk is a true value, then the value of $chunk gets tacked onto
the end of the @chunks array (i.e. @chunks gets one element larger).

The second line clears out the accumulator for the next go-round.
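
To make the accumulator idea concrete, here is a minimal sketch, not a definitive implementation. The helper name read_chunks and the sample data are made up for illustration, and an in-memory filehandle stands in for the real log file. It tests the accumulator with length rather than plain truth, so a chunk is never skipped or pushed while empty.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: group lines into chunks, where a chunk starts at
# any line that does not begin with whitespace.
sub read_chunks {
    my ($fh) = @_;
    my ($chunk, @chunks) = ('');
    while (my $line = <$fh>) {
        if ($line =~ /^\S/) {
            # A new header line: save the finished chunk, then clear
            # the accumulator for the next go-round.
            push @chunks, $chunk if length $chunk;
            $chunk = '';
        }
        $chunk .= $line;
    }
    # The last chunk has no following header line to trigger the push.
    push @chunks, $chunk if length $chunk;
    return @chunks;
}

# Made-up sample data, read through an in-memory filehandle.
my $sample = "6 Oct 2003 14:01:12 GMT #1 100 <a\@example.com>\n"
           . "\tremote b\@fake.com\n"
           . "6 Oct 2003 14:02:00 GMT #2 200 <c\@example.com>\n"
           . "  done d\@example.org\n";
open my $sample_fh, '<', \$sample or die "cannot open in-memory file: $!";
my @sample_chunks = read_chunks($sample_fh);
close $sample_fh;

print scalar(@sample_chunks), " chunks\n";
print grep { /\@fake\.com/ } @sample_chunks;
```

With this sample, two chunks are collected and only the first one (the one mentioning fake.com) is printed.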

Gunnar Hjalmarsson · Oct 10, 2003

Bob said:
-> push @chunks, $chunk if $chunk;
-> $chunk = '';

After reading the push function description from "Learning Perl" I
am failing to understand exactly what is happening here.

It adds an element to the array @chunks with what's been stored in
$chunk from previous iterations, and then it empties $chunk.

The line

push @chunks, $chunk if $chunk;

can also be written

if ($chunk) {
    push @chunks, $chunk;
}

My suggestion means that the whole log file ends up in memory in the
array @chunks. I thought that made the code easier to understand, but
it should be noted that if the log file is really big, it's not a good
approach. In that case, you'd better do whatever you want to do with
each $chunk inside the loop, as soon as that chunk is complete, and
refrain from storing the whole file in an array.
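
The in-the-loop variant could look roughly like the sketch below. It is only one way to do it: the helper name print_matching_chunks, its arguments, and the sample data are assumptions made for illustration. Only one chunk is held in memory at a time.

```perl
use strict;
use warnings;

# Hypothetical helper: copy every chunk that matches $re from $in to $out,
# holding only the current chunk in memory. Returns the number printed.
sub print_matching_chunks {
    my ($in, $out, $re) = @_;
    my $chunk = '';
    my $count = 0;
    while (my $line = <$in>) {
        if ($line =~ /^\S/ and length $chunk) {
            # The previous chunk is complete: test it, then start over.
            if ($chunk =~ $re) { print {$out} $chunk; $count++ }
            $chunk = '';
        }
        $chunk .= $line;
    }
    # The final chunk is not followed by a header line, so test it here.
    if (length $chunk and $chunk =~ $re) { print {$out} $chunk; $count++ }
    return $count;
}

# Made-up data in an in-memory filehandle; a real script would open the log.
my $demo_data = "6 Oct 2003 14:01:12 GMT #1 100 <a\@example.com>\n"
              . "\tremote b\@fake.com\n"
              . "6 Oct 2003 14:02:00 GMT #2 200 <c\@example.com>\n"
              . "\tdone d\@example.org\n";
open my $demo_in, '<', \$demo_data or die "cannot open in-memory file: $!";
my $demo_count = print_matching_chunks($demo_in, \*STDOUT, qr/\@fake\.com/);
close $demo_in;
```

The trade-off is that the chunks are gone once printed, so anything that needs them twice (sorting, counting per domain) has to happen inside the loop as well.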

HTH

Bob · Oct 10, 2003

Thanks for the explanation, and I already understood the implication of the
whole file being in memory. In this case, it is not a problem, the log is
generally only 3-5Meg but never over 50Meg.

For those interested, I have included the full script below

Regards,

B

#!/usr/bin/perl
#
# NAME qmail-qreadto
# purpose - look thru the qmail-qread logs for particular messages
#
use strict;
use warnings;

if (@ARGV == "0" ) {
    print "\n\tqmail-qreadto \{ email to search for\} \n";
    print "\t\texample\: qmail-qreadto me\@example.net \n\n";
    exit 0
}

my ($Log, @Logs);
my $file1 = "/var/log/qmail-qread";
my $file2 = "/var/log/qmail-qread1";

if (-e $file2 ) {
    open (MyFILE, $file2);
} else {
    open (MyFILE, $file1);
}

while (<MyFILE>) {
    if (/^\d{1,2}/) {
        push @Logs, $Log if $Log;
        $Log = '';
    }
    $Log .= $_;
}

close MyFILE;

push @Logs, $Log;

print grep {/@ARGV/} @Logs;

exit 0;

Gunnar Hjalmarsson · Oct 10, 2003

Bob said:
Thanks for the explanation, and I already understood the
implication of the whole file being in memory. In this case, it is
not a problem, the log is generally only 3-5Meg but never over
50Meg.

I'm glad to be able to help, Bob. My apologies for being presumptuous in
my first reply - you were obviously ready to put more effort into it
than I first thought. ;-)

For those interested, I have included the full script below

Looks good to me.

Tad McClellan · Oct 10, 2003

[ snip yet another full-quote. Please learn to quote followups properly ]

if (@ARGV == "0" ) {

Why use a numeric operator and then force stringification?

If you want to test a number, use a number and a numeric operator:

if (@ARGV == 0 ) {

If you want to test a string, use a string and a string operator

if (@ARGV eq "0" ) {

print "\n\tqmail-qreadto \{ email to search for\} \n";
                         ^                     ^
                         ^                     ^

A useless use of backslashes.

print "\t\texample\: qmail-qreadto me\@example.net \n\n";
exit 0

You should exit with a *non* zero value when an error occurs.

open (MyFILE, $file2);

You should always, yes *always*, check the return value from open():

open(MyFILE, $file2) or die "could not open '$file2' $!";
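
As a small sketch of the open() check, under stated assumptions: the helper name open_log and the path are made up, and the helper returns the error instead of dying so that the example can run anywhere. A real script would more likely just die, or print the message and exit with a non-zero status.

```perl
use strict;
use warnings;

# Hypothetical helper: open a file for reading and report why it failed.
# Returns (handle, undef) on success and (undef, message) on failure.
sub open_log {
    my ($path) = @_;
    open my $fh, '<', $path
        or return (undef, "could not open '$path': $!");
    return ($fh, undef);
}

# Made-up path that is assumed not to exist on this machine.
my ($log_fh, $log_err) = open_log('/no/such/dir/qmail-qread');
if (defined $log_err) {
    warn "$log_err\n";    # diagnostics go to STDERR
    # A real script would stop here with a non-zero status, e.g. exit 1;
}
```

Including both the file name and $! in the message means the error says which file failed and why.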

nobull · Oct 10, 2003

Bob said:
Ok, what I am trying to test for is the existence of an argument, - if
@ARGV is NULL .... do something..... what I use works, but is there a
better way of testing for a NULL than a numerical comparison?

Yep, simple boolean context.

unless (@ARGV) {

suggestions.... In this case, where the array is empty, a numeric test
returns "0", so the test made sense.

No, I cannot see any reason why it would ever make sense to use a string
literal in a numeric context. (Of course the compiler optimises the
string-to-number conversions out).

In this case, I know that 1 of the 2 files will always exist, so I did not
"or die" here, but for the sake of consistency I will add it.

No, it's not just for consistency. It's a good programming habit so
that if something sometime in the future breaks you get a useful
error.
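
A quick sketch of the boolean-context test, for illustration only. The helper name no_args is made up, and a plain array stands in for @ARGV so the example does not depend on how the script is invoked.

```perl
use strict;
use warnings;

# Hypothetical helper: true when it is called with no arguments.
sub no_args {
    my @args = @_;
    return !@args;    # an array in boolean context is true when non-empty
}

for my $case ([], ['me@example.net'], ['a', 'b']) {
    my @args = @$case;    # stand-in for @ARGV
    printf "%d argument(s): %s\n", scalar(@args),
        no_args(@args) ? 'print usage' : 'carry on';
}
```

This prints "print usage" for the empty list and "carry on" for the other two cases. In numeric or boolean context an array yields its element count, which is why both `@ARGV == 0` and `unless (@ARGV)` work.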

John W. Krahn · Oct 10, 2003

Bob said:
Thanks for the explanation, and I already understood the implication of the
whole file being in memory. In this case, it is not a problem, the log is
generally only 3-5Meg but never over 50Meg.

For those interested, I have included the full script below

#!/usr/bin/perl
#
# NAME qmail-qreadto
# purpose - look thru the qmail-qread logs for particular messages
#
use strict;
use warnings;

if (@ARGV == "0" ) {

That is usually written as:

unless ( @ARGV ) {

    print "\n\tqmail-qreadto \{ email to search for\} \n";
    print "\t\texample\: qmail-qreadto me\@example.net \n\n";
    exit 0
}

my ($Log, @Logs);
my $file1 = "/var/log/qmail-qread";
my $file2 = "/var/log/qmail-qread1";

if (-e $file2 ) {
    open (MyFILE, $file2);
} else {
    open (MyFILE, $file1);
}

What happens if the file is unlinked or moved between the time you stat
the file and the time you open the file? What happens if neither $file1
nor $file2 exists? This is better:

open MyFILE, $file2
    or open MyFILE, $file1
    or die "Cannot open $file2 or $file1: $!";
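
The same "try to open, fall back, then die" idea can be sketched with lexical filehandles and three-argument open. This is only an illustration: the helper name open_first is made up, the missing path is assumed not to exist, and a temporary file from the core File::Temp module stands in for the real log.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return a handle for the first file that opens,
# plus its name; die with the last error if none of them can be opened.
sub open_first {
    my @candidates = @_;
    for my $path (@candidates) {
        open my $fh, '<', $path or next;    # no separate -e test, so no race
        return ($fh, $path);
    }
    die "Cannot open any of: @candidates: $!\n";
}

# A temporary file plays the role of the log that does exist.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "6 Oct 2003 14:01:12 GMT #1 100 <a\@example.com>\n";
close $tmp_fh;

my ($first_fh, $first_name) = open_first('/no/such/dir/qmail-qread1', $tmp_name);
print "opened $first_name\n";
close $first_fh;
```

Because the open itself is the test, the file cannot disappear between checking for it and opening it.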

while (<MyFILE>) {
    if (/^\d{1,2}/) {
        push @Logs, $Log if $Log;
        $Log = '';
    }
    $Log .= $_;
}

close MyFILE;

push @Logs, $Log;

print grep {/@ARGV/} @Logs;

The match operator m// interpolates its contents just like a double
quoted string, so if you have more than one element in @ARGV they are
joined with spaces into a single pattern and this will not work. You
probably want:

print grep /\Q$ARGV[0]/, @Logs;

Or you could test the contents of $Log before you push it into @Logs:

push @Logs, $Log if $Log =~ /\Q$ARGV[0]/;
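
Both effects can be seen in a small sketch. The addresses and variable names below are made up, and @args stands in for @ARGV.

```perl
use strict;
use warnings;

my @args  = ('me@example.net', 'you@example.net');   # stand-in for @ARGV
my $chunk = "  remote me\@example.net\n";            # made-up log chunk

# An array inside a pattern is joined with spaces, just like "@args".
my $joined = "@args";
print "pattern built from the array: $joined\n";

my $array_match = $chunk =~ /@args/        ? 'match' : 'no match';
my $first_match = $chunk =~ /\Q$args[0]/   ? 'match' : 'no match';
print "array pattern: $array_match, first element: $first_match\n";

# Without \Q the dot matches any character and "+" is a quantifier.
my $address  = 'a.b+c@example.net';
my $unquoted = 'aXbbc@example.net' =~ /$address/   ? 'match' : 'no match';
my $quoted   = 'aXbbc@example.net' =~ /\Q$address/ ? 'match' : 'no match';
print "unquoted: $unquoted, quoted: $quoted\n";
```

The array pattern does not match the chunk while the first element does, and the unquoted address matches a string it should not, which is what \Q guards against.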

exit 0;

John
