A little Direction Please

Andy · May 13, 2008

Greets

Q: I am trying to learn how to define some variables

The basis of this script is to scrub log files for FTP logins and
separate the successful logins

Then create an array (I hope that's the right terminology) to separate it

I hardcoded the log file for now; I am still looking for a way for it to
scrub *.log files on a server

but... hey, step by step, right?

Fields: date time c-ip cs-username cs-method cs-uri-stem sc-status sc-bytes cs-host
2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /0598_Andy/qff0598.zip 226 0 -

This field 226 0 - is a successful login

My plan is to scrub the logs, export to file.

sort fields into variable.

I hope in the end to get

1. log of successful logins
2. log of last successful login (I think I am going to try date
comparison from most recent to last.)
3. be able to parse the fields and get data.

I know that there are those of you who are advanced, I would
appreciate any directions or help.

Again I am trying to put this together this is what I have so far.

#!/usr/bin/perl
use strict;
use warnings;

open(INPUT, '<', "ex080120.log")or die("Could not open log file.");
open(OUTPUT, '>',"ftpacct.log")or die("Could not open log file.");
my $extractedLine;
while (<INPUT>) {
my $line = $_;
if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {
print OUTPUT "$1\n";
}
}
close(INPUT);
close(OUTPUT);
exit;

Ben Morrow · May 13, 2008

Andy said:
Greets

Q: I am trying to learn how to define some variables

The basis of this script is to scrub log files for FTP logins and
separate the successful logins

Then create an array (I hope that's the right terminology) to separate it

I hardcoded the log file for now; I am still looking for a way for it to
scrub *.log files on a server

but... hey, step by step, right?

Fields: date time c-ip cs-username cs-method cs-uri-stem sc-status sc-bytes cs-host
2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /0598_Andy/qff0598.zip 226 0 -

What are these fields separated by? A single space? Can the fields ever
contain spaces? How are they quoted in that case? What about newlines?

This field 226 0 - is a successful login

My plan is to scrub the logs, export to file.

sort fields into variable.

I hope in the end to get

1. log of successful logins
2. log of last successful login (I think I am going to try date
comparison from most recent to last.)
3. be able to parse the fields and get data.

I know that there are those of you who are advanced, I would
appreciate any directions or help.

Again I am trying to put this together this is what I have so far.

#!/usr/bin/perl
use strict;
use warnings;

open(INPUT, '<', "ex080120.log")or die("Could not open log file.");
open(OUTPUT, '>',"ftpacct.log")or die("Could not open log file.");

3-arg open: good.
Checking the return value: good.
It's better to keep filehandles in variables than use the old-fashioned
global handles, though; and if the open fails you should say what
failed, and why:

open(my $INPUT, '<', "ex080120.log")
    or die("can't read ex080120.log: $!");
open(my $OUTPUT, '>', "ftpacct.log")
    or die("can't write ftpacct.log: $!");

my $extractedLine;
while (<INPUT>) {
my $line = $_;

This is silly. If you want the line in $line, put it there in the first
place:

while (my $line = <$INPUT>) {

if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {
print OUTPUT "$1\n";
}

I would recommend splitting the line into a hash first, and then
selecting lines based on that. Something like

my @fields = qw/
date time c_ip
cs_username cs_method cs_uri_stem
sc_status sc_bytes cs_host
/;

while (my $line = <$INPUT>) {

# Here I assume fields are delimited by a single space, and
# spaces and newlines *never* appear in a field (not even inside
# quotes). If this isn't true, you probably want to use the
# Text::CSV_XS module, which can parse all sorts of
# <foo>-delimited files.

my %record;
@record{@fields} = split / /, $line;

$record{sc_status} == 226
and $record{sc_bytes} == 0
and $record{cs_host} eq '-'
or next;

print $OUTPUT $line;
}

Once you've understood that bit of code it should be straightforward to
change it to do something more sophisticated. To keep track of the last
login for any given user, you need a hash %lastlogin, keyed by username,
that lives outside the loop.
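
A minimal sketch of that %lastlogin idea, under stated assumptions: the three sample lines are made up for illustration, the field names come from the "Fields:" header, and each line is chomped before splitting so that the last field is a bare hyphen. Dates in YYYY-MM-DD form and 24-hour times compare correctly as plain strings, so no date module is needed here.

```perl
use strict;
use warnings;

# Field names taken from the "Fields:" header of the log.
my @fields = qw/
    date time c_ip
    cs_username cs_method cs_uri_stem
    sc_status sc_bytes cs_host
/;

# Made-up sample records standing in for lines read from the log file.
my @lines = (
    "2008-01-20 00:00:02 x.x.x.x 0598_Andy [1]sent /0598_Andy/a.zip 226 0 -\n",
    "2008-01-20 00:05:10 x.x.x.x 0601_Bob [2]sent /0601_Bob/b.zip 550 0 -\n",
    "2008-01-21 09:30:00 x.x.x.x 0598_Andy [3]sent /0598_Andy/c.zip 226 0 -\n",
);

my %lastlogin;    # lives outside the loop, keyed by username

for my $line (@lines) {
    chomp(my $copy = $line);    # drop the newline so cs_host is just '-'

    my %record;
    @record{@fields} = split / /, $copy;

    next unless $record{sc_status} == 226
            and $record{sc_bytes}  == 0
            and $record{cs_host}   eq '-';

    # Keep the newest timestamp seen so far for this user.
    my $user  = $record{cs_username};
    my $stamp = "$record{date} $record{time}";
    $lastlogin{$user} = $stamp
        if !exists $lastlogin{$user} or $stamp gt $lastlogin{$user};
}

print "$_ last seen $lastlogin{$_}\n" for sort keys %lastlogin;
```

Because the comparison keeps whichever timestamp is later, this does not depend on the log being in date order.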

}
close(INPUT);
close(OUTPUT);

An advantage of keeping filehandles in variables is that they are closed
for you when the variable goes out of scope. An advantage of real
operating systems (Win32 counts, here) is that they close filehandles
for you when the process exits, in any case.

That said, there is value in explicitly closing a filehandle opened for
writing, *and checking the return value*. If any of the writes to that
filehandle failed (disk full, for instance) the error will be returned
by close. (Of course, if you want to catch errors sooner than that, you
can check the return value of print instead.)
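
A small sketch of both checks, assuming a temporary file from File::Temp in place of ftpacct.log so that it can be run anywhere; the log line written is a made-up sample.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A temporary file stands in for ftpacct.log (assumption for the sketch).
my ($OUTPUT, $outfile) = tempfile(UNLINK => 1);

# print returns false on failure, so an error can be caught as it happens.
print {$OUTPUT} "2008-01-20 00:00:02 x.x.x.x 0598_Andy [1]sent /0598_Andy/a.zip 226 0 -\n"
    or die "can't write to $outfile: $!";

# Buffered output is flushed at close, so a full disk may only show up here.
close($OUTPUT)
    or die "can't close $outfile: $!";

my $size = -s $outfile;
print "wrote $size bytes to $outfile\n";
```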

exit;

There's no need to explicitly exit from a Perl program. Falling off the
end is the usual way to finish.

Ben

Jürgen Exner · May 13, 2008

Andy said:
Q: I am trying to learn how to define some variables

To define a variable in Perl typically you use the assignment operator
'='.

The basis of this script is to scrub log files for FTP logins and
separate the successful logins

Then create an array (I hope that's the right terminology) to separate it

I hardcoded the log file for now; I am still looking for a way for it to
scrub *.log files on a server

but... hey, step by step, right?

Fields: date time c-ip cs-username cs-method cs-uri-stem sc-status sc-bytes cs-host
2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /0598_Andy/qff0598.zip 226 0 -

This field 226 0 - is a successful login

My plan is to scrub the logs, export to file.

sort fields into variable.

I hope in the end to get

1. log of successful logins
2. log of last successful login (I think I am going to try date
comparison from most recent to last.)
3. be able to parse the fields and get data.

I know that there are those of you who are advanced, I would
appreciate any directions or help.

Again I am trying to put this together this is what I have so far.

#!/usr/bin/perl
use strict;
use warnings;

open(INPUT, '<', "ex080120.log")or die("Could not open log file.");
open(OUTPUT, '>',"ftpacct.log")or die("Could not open log file.");

You might want to add the reason why the open() call failed and the file
name for which it failed.

my $extractedLine;

Why declare a variable that you never use again?

while (<INPUT>) {
my $line = $_;
if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {

I know for some people it is difficult to just trust the default
argument. But I would write this as
while (<INPUT>) {
if (m/^(.+226\s+0\s+-\s+.*)$/) {
or

print OUTPUT "$1\n";
}
}
close(INPUT);
close(OUTPUT);

You may want to check the success of the close() call, too, in
particular for a file handle you wrote to.

jue

RedGrittyBrick · May 13, 2008

Andy said:
Greets

Q: I am trying to learn how to define some variables

The basis of this script is to scrub log files for FTP logins and
separate the successful logins

Then create an array (I hope that's the right terminology) to separate it

I hardcoded the log file for now; I am still looking for a way for it to
scrub *.log files on a server

but... hey, step by step, right?

Fields: date time c-ip cs-username cs-method cs-uri-stem sc-status sc-bytes cs-host
2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /0598_Andy/qff0598.zip 226 0 -

This field 226 0 - is a successful login

My plan is to scrub the logs, export to file.

sort fields into variable.

perldoc -f split

I hope in the end to get

1. log of successful logins

grep "226 0 - *$" ex*.log > ftpacct.log

perl -n -e 'print if /226 0 - *$/' ex*.log > ftpacct.log
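
For a shell that does not expand ex*.log itself, the same filter can be written as a script that does its own globbing. This is only a sketch: the scratch directory and the two sample log files are made up so that it runs anywhere, and on a real server $dir would be the log directory.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory with two made-up log files (assumption for the sketch).
my $dir = tempdir(CLEANUP => 1);
my %sample = (
    'ex080120.log' => "2008-01-20 00:00:02 x.x.x.x 0598_Andy [1]sent /0598_Andy/a.zip 226 0 -\n",
    'ex080121.log' => "2008-01-21 00:00:02 x.x.x.x 0601_Bob [2]sent /0601_Bob/b.zip 550 0 -\n",
);
for my $name (keys %sample) {
    open(my $fh, '>', "$dir/$name") or die "can't write $dir/$name: $!";
    print {$fh} $sample{$name};
    close($fh) or die "can't close $dir/$name: $!";
}

# Read every ex*.log in the directory and keep the lines ending in "226 0 -".
my @matches;
for my $file (sort glob("$dir/ex*.log")) {
    open(my $in, '<', $file) or die "can't read $file: $!";
    while (my $line = <$in>) {
        push @matches, $line if $line =~ /226 0 - *$/;
    }
    close($in);
}

print @matches;
```

Sorting the file names keeps the output in date order as long as the names follow the exYYMMDD.log pattern.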

2.log of last successful login ( I think I am going to try date
comparison from most recent to last.)

Logfiles are generally in date order, you just need the last record.

tail -n 1 successful-logins.log > last-successful-login.log

3 be able to parse the fields and get data.

I know that there are those of you who are advanced, I would
appreciate any directions or help.

Again I am trying to put this together this is what I have so far.

#!/usr/bin/perl
use strict;
use warnings;
Good!

open(INPUT, '<', "ex080120.log")or die("Could not open log file.");

Best practise is to ...
- Use lexical filehandles
- Include filename in message
- Include the failure reason in the message

my $filename = 'ex080120.log';
open(my $input, '<', $filename)
or die("Could not open '$filename' because $!");

open(OUTPUT, '>',"ftpacct.log")or die("Could not open log file.");

see above

my $extractedLine;

Not used? Remove it.

while (<INPUT>) {
my $line = $_;

It's sometimes easier to work with $_ than assign it to another
variable. It would simplify your later code.

if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {

Matching ^.+ is wasteful.
You don't need to capture the whole line using ().

print OUTPUT "$1\n";

Unless you chomp your input you'll output an extra blank line.

Putting all the above together

if (/226\s+0\s+-\s*$/) {
print OUTPUT;

OR

print OUTPUT if /226\s+0\s+-\s*$/;

Though I'd use lexical filehandles, as I wrote earlier.

print $output if /226\s+0\s+-\s*$/;

However to achieve your other aim, use your original construction and add
$last_login = $line;
my ($date, $time, ... $hyphen) = split;
...

}
}
close(INPUT);
close(OUTPUT);

print "last successful login is $last_login";

exit;

Untested, caveat emptor.

Andy · May 13, 2008

WOW!

Guys, you opened my eyes... I knew there were many ways to do this;
it is just confusing figuring out which one to use.
I have of course googled for file manipulation and sorting. I guess
it just takes experience to figure out which is best.

Thanks for the responses. All I have to do now is figure out how to take
what you have advised and get it to work.

I think I can safely say "progress in motion"... umm, slowly.

I will try your suggestions and see what happens.

-Thank you again

GREATLY APPRECIATED

Jürgen Exner · May 13, 2008

RedGrittyBrick said:
Matching ^.+ is wasteful.
You don't need to capture the whole line using ().

Unless you chomp your input you'll output an extra blank line.

My first thought, too. However, because of the rather 'interesting' way
he is printing the captured group instead of just the plain line, he is
losing the newline in the pattern match. Therefore he has to add it
back explicitly.

print OUTPUT if /226\s+0\s+-\s*$/;

Much nicer, of course.

jue

John W. Krahn · May 13, 2008

Ben said:
I would recommend splitting the line into a hash first, and then
selecting lines based on that. Something like

my @fields = qw/
date time c_ip
cs_username cs_method cs_uri_stem
sc_status sc_bytes cs_host
/;

while (my $line = <$INPUT>) {

# Here I assume fields are delimited by a single space, and
# spaces and newlines *never* appear in a field (not even inside
# quotes). If this isn't true, you probably want to use the
# Text::CSV_XS module, which can parse all sorts of
# <foo>-delimited files.

my %record;
@record{@fields} = split / /, $line;

$record{sc_status} == 226
and $record{sc_bytes} == 0
and $record{cs_host} eq '-'

Because you are using "split / /, $line" $record{cs_host} will probably
contain "-\n" instead of '-'.

or next;

print $OUTPUT $line;
}

John

John W. Krahn · May 13, 2008

Jürgen Exner said:
My first thought, too. However, because of the rather 'interesting' way
he is printing the captured group instead of just the plain line, he is
losing the newline in the pattern match. Therefore he has to add it
back explicitly.

The \s+ at the end is greedy and will match everything at the end
including the newline unless there is a non-whitespace character after
it that .* will match.

John

Jürgen Exner · May 13, 2008

John W. Krahn said:
The \s+ at the end is greedy and will match everything at the end
including the newline unless there is a non-whitespace character after
it that .* will match.

You are right. I was looking at the trailing .* only and didn't dissect
the RE beyond that.
This RE certainly has some interesting side effects.
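
Two of those side effects can be checked with a short test. This is a sketch using the sample line from the original post (joined onto one line); the variable names are only illustrative.

```perl
use strict;
use warnings;

# The pattern from the original script.
my $re = qr/^(.+226\s+0\s+-\s+.*)$/;

my $with_newline = "2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /0598_Andy/qff0598.zip 226 0 -\n";
(my $without_newline = $with_newline) =~ s/\n\z//;

# Side effect 1: the final \s+ swallows the newline, so $1 already ends
# in "\n" and printing "$1\n" produces an extra blank line.
my ($captured) = $with_newline =~ $re;
my $ends_in_newline = defined $captured && $captured =~ /\n\z/ ? 1 : 0;

# Side effect 2: a last line with no trailing newline has nothing for the
# final \s+ to match, so the pattern fails on it altogether.
my $bare_line_matches = $without_newline =~ $re ? 1 : 0;

print "capture ends in newline: $ends_in_newline\n";
print "line without newline matches: $bare_line_matches\n";
```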

jue

Ben Morrow · May 13, 2008

John W. Krahn said:
Because you are using "split / /, $line" $record{cs_host} will probably
contain "-\n" instead of '-'.

Good point. I'm too used to -l
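
For anyone following along without -l (which, combined with -n or -p, chomps each input line automatically), here is a sketch of the difference. It uses the sample line from the original post joined onto one line; the hash names %raw and %record and the array @awk are only illustrative.

```perl
use strict;
use warnings;

my @fields = qw/date time c_ip cs_username cs_method cs_uri_stem
                sc_status sc_bytes cs_host/;

my $line = "2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /0598_Andy/qff0598.zip 226 0 -\n";

# Without chomp, the last field keeps the newline: "-\n".
my %raw;
@raw{@fields} = split / /, $line;

# With chomp, it is the bare hyphen the comparison expects.
chomp(my $clean = $line);
my %record;
@record{@fields} = split / /, $clean;

# split ' ' (a single-space string, not a regex) splits on runs of
# whitespace and so also discards the trailing newline.
my @awk = split ' ', $line;

printf "unchomped cs_host eq '-': %s\n", ($raw{cs_host}    eq '-' ? 'yes' : 'no');
printf "chomped cs_host eq '-':   %s\n", ($record{cs_host} eq '-' ? 'yes' : 'no');
printf "split ' ' last field:     %s\n", ($awk[-1]         eq '-' ? 'yes' : 'no');
```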

Ben

Tad J McClellan · May 14, 2008

Andy said:
Subject: A little Direction Please

Please put the subject of your article in the Subject of your article.

while (<INPUT>) {
my $line = $_;

If you want the line in $line, then put it in $line rather than
put it somewhere else, only to then copy it to $line:

while ( my $line = <INPUT> ) {
