regexp

pop

Hi folks,

I solved a query I asked you folks some time back. Here is a small bit
that's confusing me again.
Here is the log line that I am trying to extract information from.

log line:

M 2006-05-03 10:20 +0000 mahesh 1.4 testfile.c com/avs/cpar == /test/com/avs/cpar

Here is the regexp
elsif ($line =~ /(^[M]) \s* ([^+]+) + [0]+ \s* ([\w]+)\s* ([^+]+) \s*
([^\w]+)\s* ([\w]+) \s* == \s* ([^. ]+) / )
{
print
"<tr><td>$6</td><td>$4</td><td>$3</td><td>$2</td><td>$1</td></tr>";


The above gives me nearly all the variables that I want, but it goes
wrong in one place: the $4 location. It gives
$4 = 1.4 client.c com/avs
$6 = cpar
I want
$4 = 1.4
$5 = com/avs/cpar
$6 = /test/com/avs/cpar

Can someone take a look and tell me where I am going wrong?

cheers,
pop.
 
Josef Moellers

For one, I do not see where the "client.c" should come from.
Also, the code you post does not really compile.
Please post a small, complete program that demonstrates your problem.

I also have a problem with the blanks in your regex. As you post it, it
won't match the line given, as it would need two blanks after the
initial M, a blank between the + and the 0 (of 0000) ...
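
A minimal sketch of the point about blanks, assuming the log line sits on a single line and using a shortened pattern rather than the full one from the post. Without the /x modifier every blank in a pattern has to match a blank in the input; with /x, unescaped whitespace in the pattern is ignored.

```perl
use strict;
use warnings;

# Sample line from the post, assumed to be one line in the real log.
my $line = 'M 2006-05-03 10:20 +0000 mahesh 1.4 testfile.c com/avs/cpar == /test/com/avs/cpar';

# Without /x the pattern reads: "M", blank, \s*, blank, four digits.
# That needs at least two blanks after the M, so it does not match here.
my $literal = ($line =~ /^M \s* \d{4}/) ? 1 : 0;

# With /x the blanks in the pattern are only layout, so it reads
# "M", \s*, four digits, and matches.
my $extended = ($line =~ /^M \s* \d{4}/x) ? 1 : 0;

print "without /x: $literal\n";
print "with /x:    $extended\n";
```

So a pattern laid out with blanks for readability either needs /x or needs the blanks removed.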
 
pop

Hi

Here is the source code, and sorry about the blank spaces. I am pretty
new to this and still on the learning ladder.

Source code :
@buf = @_;
for ( $i=0; ($i <= $#buf); $i++ ) {
$line = $buf[$i];
if ($line=~/^[N][o]\s*[r][e][c][o][r][d]\s*[e][l][e][c][t][e][d]/) {
print "<tr><td> Bad Query </td></tr>";
}
elsif ($line =~ /(^[M])\s*([^+]+)[+][0]+\s*([\w]+)\s*([^+]+)\s*([^\w]+)\s*([\w]+)\s*==\s*([^. ]+)/ ) {
print "<tr><td>$6</td><td>$4</td><td>$3</td><td>$2</td><td>$1</td></tr>";
}
}
}

The output I am trying to match with the regex is

M 2005-08-07 18:13 +0000 avenda 1.4 client.c com/cavs/mabas == /art/cvsbase/com/cavs/mabas
M 2005-08-07 17:33 +0000 joe 1.2 client.c com/cavs/mabas == /art/cvsbase/com/cavs/mabas
M 2005-08-07 17:37 +0000 joe 1.3 client.c com/cavs/mabas == /art/cvsbase/com/cavs/mabas

Sorry, I gave a bad example previously. The client.c comes from this
output.

thanks
pop.
 
David Squire

pop said:
Hi

Here is the source code and , sorry about the blank spaces. I am pretty
new to this and still on the learning ladder.

Source code :
@buf = @_;
for ( $i=0; ($i <= $#buf); $i++ ) {
$line = @buf[$i];

foreach my $line (@buf) {
if
($line=~/^[N][o]\s*[r][e][c][o][r][d]\s*[e][l][e][c][t][e][d]/) {


# No point in single-character character classes, and I bet you require
# at least one space between words.
if ($line =~ /^No\s+records\s+selected/) {
print "<tr><td> Bad Query </td><tr>;
}
elsif ($line =~
/(^[M])\s*([^+]+)[+][0]+\s*([\w]+)\s*([^+]+)\s*([^\w]+)\s*([\w]+)
\s*==\s*([^. ]+)/ ){
print
<tr><td>$6</td><td>$4</td><td>$3</td><td>$2</td><td>$1</td></tr>";
}
}
}

What is the desired output? There is little point in showing us the
input without the corresponding desired output.

DS
 
rajeev

Hey mahesh:

I think you are making the regex more complicated than it needs to be.
Please see the sample below:

====


if ($line =~ /^M.*?\+\d+ ([^\W]+) ([\d.]+) ([\w\._]+) (.*?) == (.*)/)
{
print "4 : $2 \n";
print "5 : $4 \n";
print "6 : $5 \n";
}

====
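
For reference, a self-contained sketch that runs the pattern above against one of the sample lines. It assumes the log line is a single line with single blanks between fields; the variable names are made up for illustration.

```perl
use strict;
use warnings;

# One of the sample lines from the thread, assumed to be on one line.
my $line = 'M 2005-08-07 18:13 +0000 avenda 1.4 client.c com/cavs/mabas == /art/cvsbase/com/cavs/mabas';

# Hypothetical names for the five captured fields.
my ($user, $rev, $file, $dir, $repo);

if ($line =~ /^M.*?\+\d+ ([^\W]+) ([\d.]+) ([\w\._]+) (.*?) == (.*)/) {
    ($user, $rev, $file, $dir, $repo) = ($1, $2, $3, $4, $5);
    print "user: $user\n";
    print "rev:  $rev\n";
    print "file: $file\n";
    print "dir:  $dir\n";
    print "repo: $repo\n";
}
```

Because the pattern uses literal single blanks, it would need \s+ in their place if the real log pads its columns.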
 
David Squire

rajeev said:
Hey mahesh:

I think you are making regex more complicated. Please see the sample
below:

====


if ($line =~ /^M.*?\+\d+ ([^\W]+) ([\d.]+) ([\w\._]+) (.*?) == (.*)/)
{
print "4 : $2 \n";
print "5 : $4 \n";
print "6 : $5 \n";
}

Please don't top-post. See the posting guidelines for this group.

DS
====
 
pop

Hi folks,

Thanks for the assist. I can extract the fields by splitting the line
on the space delimiter
(knocks his head with the keyboard). I can do that instead of going the
regex way.

It's much simpler.
Need to upgrade my brain.

cheers
pop.
 
Josef Moellers

David has already commented on various issues such as the
single-character character classes (hmm, sounds funny ...)

I find long regexes very confusing, maybe you could do better by just
checking for the initial 'M', then split()-ting the line on multiple
blanks and checking the various fields, e.g.

elsif ($line =~ /^M/) {
my @f = split(/\s+/, $line);
# Maybe add some more checks here, e.g. $f[3] eq '+0000'
# then use whatever you need from @f
}
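
A runnable sketch of the split() approach above, assuming the log line is on one line and always has the ten blank-separated fields seen in the samples. The field positions in the comment are read off the sample data, not taken from any CVS documentation.

```perl
use strict;
use warnings;

# One of the sample lines from the thread, assumed to be on one line.
my $line = 'M 2005-08-07 18:13 +0000 avenda 1.4 client.c com/cavs/mabas == /art/cvsbase/com/cavs/mabas';

my @f;
if ($line =~ /^M/) {
    @f = split /\s+/, $line;

    # Assumed layout: 0 type, 1 date, 2 time, 3 timezone, 4 user,
    # 5 revision, 6 file, 7 directory, 8 "==", 9 repository path.
    if (@f == 10 && $f[3] eq '+0000' && $f[8] eq '==') {
        print "<tr><td>$f[9]</td><td>$f[7]</td><td>$f[5]</td><td>$f[4]</td></tr>\n";
    }
}
```

The checks on the field count and on the fixed fields stand in for the validation the regex would otherwise do.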
 
Josef Moellers

pop said:
Hi folks,

thanks for the assist. I can use substring to extract with space
delimiter.
(knocks his head with the keyboard). I can do that instead of going the
regex way.

Please include some context when replying.

Thank you.
 
GU

pop said:
/(^[M])\s*([^+]+)[+][0]+\s*([\w]+)\s*([^+]+)\s*([^\w]+)\s*([\w]+)
\s*==\s*([^. ]+)/ ){ ....
The output I am trying to regex is

M 2005-08-07 18:13 +0000 avenda 1.4 client.c com/cavs/mabas ==
/art/cvsbase/com/cavs/mabas
M 2005-08-07 17:33 +0000 joe 1.2 client.c com/cavs/mabas ==
/art/cvsbase/com/cavs/mabas
M 2005-08-07 17:37 +0000 joe 1.3 client.c com/cavs/mabas ==
/art/cvsbase/com/cavs/mabas

Maybe you could use ([\d\.]+) to match $4; in your example $4 is
always a number.
I would have used more self-explanatory code, like

@a=();
push @a, 'M'; # start
push @a, '[\d-]{10}\s[\d:]{5}'; # date time
push @a, '\+\d+'; # timezone
push @a, '\w+'; # user
push @a, '[\d\.]+'; # version
push @a, '[\w\.]+'; # program name
push @a, '[\w\d/]+'; # short
push @a, '=='; # delimiter
push @a, '[\w\d/]+'; # link
$pattern=join '\s+', map {"($_)"} @a;
print $pattern,"\n";
if (m#$pattern#) { print "$_ $$_\n" foreach 1..9; };

Gerhard

P.S.
Output of that script:
$ perl -n test.pl
M 2005-08-07 18:13 +0000 avenda 1.4 client.c com/cavs/mabas == /art/cvsbase/com/cavs/mabas
(M)\s+([\d-]{10}\s[\d:]{5})\s+(\+\d+)\s+(\w+)\s+([\d\.]+)\s+([\w\.]+)\s+([\w\d/]+)\s+(==)\s+([\w\d/]+)
1 M
2 2005-08-07 18:13
3 +0000
4 avenda
5 1.4
6 client.c
7 com/cavs/mabas
8 ==
9 /art/cvsbase/com/cavs/mabas
M 2005-08-07 17:33 +0000 joe 1.2 client.c com/cavs/mabas == /art/cvsbase/com/cavs/mabas
(M)\s+([\d-]{10}\s[\d:]{5})\s+(\+\d+)\s+(\w+)\s+([\d\.]+)\s+([\w\.]+)\s+([\w\d/]+)\s+(==)\s+([\w\d/]+)
1 M
2 2005-08-07 17:33
3 +0000
4 joe
5 1.2
6 client.c
7 com/cavs/mabas
8 ==
9 /art/cvsbase/com/cavs/mabas
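
To make the difference concrete, here is a small comparison sketch, assuming the log line is on one line. It runs the original pattern from the question next to a field-by-field pattern in the style of the script above (with fewer capture groups). In the original, the fourth group ([^+]+) is greedy and accepts blanks, so it runs on until the rest of the pattern can still match.

```perl
use strict;
use warnings;

# One of the sample lines from the thread, assumed to be on one line.
my $line = 'M 2005-08-07 18:13 +0000 avenda 1.4 client.c com/cavs/mabas == /art/cvsbase/com/cavs/mabas';

# Original pattern: group 4 is "anything but a plus sign", so it
# swallows the revision, the file name and most of the directory.
my @greedy = $line =~ /(^[M])\s*([^+]+)[+][0]+\s*([\w]+)\s*([^+]+)\s*([^\w]+)\s*([\w]+)\s*==\s*([^. ]+)/;

# Field-by-field pattern: each group only accepts the characters
# that field can contain, and fields are separated by \s+.
my @tight = $line =~ m{^(M)\s+([\d-]{10}\s[\d:]{5})\s+\+\d+\s+(\w+)\s+([\d.]+)\s+([\w.]+)\s+([\w/]+)\s+==\s+([\w/]+)};

print "original \$4: $greedy[3]\n";
print "original \$6: $greedy[5]\n";
print "revision:    $tight[3]\n";
print "directory:   $tight[5]\n";
print "repository:  $tight[6]\n";
```

The original pattern reports "1.4 client.c com/cavs" and "mabas", which is the behaviour described in the question; the tighter pattern yields the revision, directory and repository path separately.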
 
