regexp

pop · May 9, 2006

Hi folks,

I solved the query I asked you folks about some time back. Here is another
small bit that is confusing me.
Here is the log line that I am trying to take info out of:

log line:

M 2006-05-03 10:20 +0000 mahesh 1.4 testfile.c com/avs/cpar == /test/com/avs/cpar

Here is the regexp
elsif ($line =~ /(^[M]) \s* ([^+]+) + [0]+ \s* ([\w]+)\s* ([^+]+) \s*
([^\w]+)\s* ([\w]+) \s* == \s* ([^. ]+) / )
{
print
"<tr><td>$6</td><td>$4</td><td>$3</td><td>$2</td><td>$1</td></tr>";

The above gives me nearly all the variables that I want, but it goes
wrong in one place, namely $4. It gives
$4 = 1.4 client.c com/avs
$6 = cpar
I want
$4 = 1.4
$5 = com/avs/cpar
$6 = /test/com/avs/cpar

Can someone take a look and tell me where I am going wrong?

cheers,
pop.

Josef Moellers · May 9, 2006

For one, I do not see where the "client.c" should come from.
Also, the code you post does not really compile.
Please post a small, complete program that demonstrates your problem.

I also have a problem with the blanks in your regex. As you post it, it
won't match the line given, as it would need two blanks after the
initial M, a blank between the + and the 0 (of 0000) ...
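
A minimal sketch of the point about blanks, using a shortened copy of the log line (the shortened line and the variable names are assumptions made for illustration). Without the /x modifier, each blank in a pattern has to match a blank in the input; with /x, blanks in the pattern are ignored.

```perl
use strict;
use warnings;

# Shortened sample line, for illustration only.
my $line = 'M 2006-05-03 10:20 +0000 mahesh';

# Without /x the two blanks around \s* are literal, so the pattern
# needs at least two blanks between "M" and the first digit.
my $plain = ($line =~ /^M \s* \d/) ? 'match' : 'no match';

# With /x the blanks in the pattern are ignored; only \s* consumes whitespace.
my $extended = ($line =~ /^M \s* \d/x) ? 'match' : 'no match';

print "without /x: $plain\n";
print "with /x:    $extended\n";
```

So a pattern laid out with blanks for readability would need /x, or the blanks have to be removed.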

pop · May 9, 2006

Hi

Here is the source code, and sorry about the blank spaces. I am pretty
new to this and still on the learning ladder.

Source code :
@buf = @_;
for ( $i=0; ($i <= $#buf); $i++ ) {
$line = @buf[$i];
if ($line=~/^[N][o]\s*[r][e][c][o][r][d]\s*[e][l][e][c][t][e][d]/) {
print "<tr><td> Bad Query </td></tr>";
}
elsif ($line =~ /(^[M])\s*([^+]+)[+][0]+\s*([\w]+)\s*([^+]+)\s*([^\w]+)\s*([\w]+)\s*==\s*([^. ]+)/ ){
print "<tr><td>$6</td><td>$4</td><td>$3</td><td>$2</td><td>$1</td></tr>";
}
}
}

The output I am trying to regex is

M 2005-08-07 18:13 +0000 avenda 1.4 client.c com/cavs/mabas == /art/cvsbase/com/cavs/mabas
M 2005-08-07 17:33 +0000 joe 1.2 client.c com/cavs/mabas == /art/cvsbase/com/cavs/mabas
M 2005-08-07 17:37 +0000 joe 1.3 client.c com/cavs/mabas == /art/cvsbase/com/cavs/mabas

Sorry, I gave a bad example previously. The client.c is contained here
in this output.

thanks
pop.

David Squire · May 9, 2006

pop said:
@buf = @_;
for ( $i=0; ($i <= $#buf); $i++ ) {
$line = @buf[$i];

foreach my $line (@buf) {

if
($line=~/^[N][o]\s*[r][e][c][o][r][d]\s*[e][l][e][c][t][e][d]/) {

if ($line =~ /^No\s+records\s+selected/) { # No point in single-character
# character classes, and I bet you require at least one space between words.

What is the desired output? There is little point showing us the
input without the corresponding desired output.

DS

rajeev · May 9, 2006

Hey mahesh:

I think you are making the regex more complicated than it needs to be.
Please see the sample below:

====

if ($line =~ /^M.*?\+\d+ ([^\W]+) ([\d.]+) ([\w\._]+) (.*?) == (.*)/)
{
print "4 : $2 \n";
print "5 : $4 \n";
print "6 : $5 \n";
}

====
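
A sketch that runs both patterns against one of the posted log lines, joined onto a single line (an assumption about the real input). It suggests why $4 came out too long: the fourth group, [^+]+, is greedy and also matches blanks, so it only gives back as much as the rest of the pattern needs. The array names are made up for this example.

```perl
use strict;
use warnings;

my $line = 'M 2005-08-07 18:13 +0000 avenda 1.4 client.c com/cavs/mabas == /art/cvsbase/com/cavs/mabas';

# Original pattern: group 4 ([^+]+) swallows everything up to the last "/"
# before "==", because blanks and dots are not excluded from the class.
my @orig = $line =~ /(^[M])\s*([^+]+)[+][0]+\s*([\w]+)\s*([^+]+)\s*([^\w]+)\s*([\w]+)\s*==\s*([^. ]+)/;
print "original \$4: $orig[3]\n";

# Simpler pattern from the sample above: one group per blank-separated field.
my @simple = $line =~ /^M.*?\+\d+ ([^\W]+) ([\d.]+) ([\w\._]+) (.*?) == (.*)/;
print "version: $simple[1]\n";
print "short:   $simple[3]\n";
print "link:    $simple[4]\n";
```

With this input the original pattern yields "1.4 client.c com/cavs" for $4, while the simpler one yields "1.4", "com/cavs/mabas" and "/art/cvsbase/com/cavs/mabas".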

David Squire · May 9, 2006

Please don't top-post. See the posting guidelines for this group.

DS

pop · May 9, 2006

Hi folks,

thanks for the assist. I can use substring to extract with space
delimiter.
(knocks his head with the keyboard). I can do that instead of going the
regex way.

It's much simpler.
Need to upgrade my brain.

cheers
pop.

Josef Moellers · May 10, 2006

David has already commented on various issues such as the single-character
character classes (hmm, sounds funny ...).

I find long regexes very confusing. Maybe you could do better by just
checking for the initial 'M', then split()-ting the line on runs of
blanks and checking the various fields, e.g.

elsif ($line =~ /^M/) {
my @f = split(/\s+/, $line);
# Maybe add some more checks here, e.g. $f[3] eq '+0000'
# then use whatever you need from @f
}
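
A fuller sketch of the split() idea, assuming each record sits on one line with exactly ten blank-separated fields. The helper name fields_of, the field checks, and the choice of table columns are all assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the blank-separated fields of an "M" line,
# or an empty list if the line does not have the expected shape.
sub fields_of {
    my ($line) = @_;
    return unless $line =~ /^M/;
    my @f = split /\s+/, $line;
    # Assumed layout: 0 M, 1 date, 2 time, 3 timezone, 4 user,
    # 5 version, 6 file, 7 short path, 8 "==", 9 full path
    return unless @f == 10 && $f[3] eq '+0000' && $f[8] eq '==';
    return @f;
}

my $line = 'M 2005-08-07 18:13 +0000 avenda 1.4 client.c com/cavs/mabas == /art/cvsbase/com/cavs/mabas';

if (my @f = fields_of($line)) {
    # Column order here is arbitrary; pick whichever fields the table needs.
    print "<tr><td>$f[9]</td><td>$f[7]</td><td>$f[5]</td><td>$f[4]</td></tr>\n";
}
```

Checking the field count and the "==" marker guards against lines that start with M but have a different layout.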

Josef Moellers · May 10, 2006

pop said:
Hi folks,

thanks for the assist. I can use substring to extract with space
delimiter.
(knocks his head with the keyboard). I can do that instead of going the
regex way.

Please include some context when replying.

Thank you.

GU · May 10, 2006

pop said:
/(^[M])\s*([^+]+)[+][0]+\s*([\w]+)\s*([^+]+)\s*([^\w]+)\s*([\w]+)
\s*==\s*([^. ]+)/ ){ ....
The output I am trying to regex is

M 2005-08-07 18:13 +0000 avenda 1.4 client.c com/cavs/mabas == /art/cvsbase/com/cavs/mabas
M 2005-08-07 17:33 +0000 joe 1.2 client.c com/cavs/mabas == /art/cvsbase/com/cavs/mabas
M 2005-08-07 17:37 +0000 joe 1.3 client.c com/cavs/mabas == /art/cvsbase/com/cavs/mabas

Maybe you could use ([\d\.]+) to match $4; in your example $4 is always
a number.
I would have used more self-explanatory code, like:

@a=();
push @a, 'M'; # start
push @a, '[\d-]{10}\s[\d:]{5}'; # date time
push @a, '\+\d+'; # timezone
push @a, '\w+'; # user
push @a, '[\d\.]+'; # version
push @a, '[\w\.]+'; # program name
push @a, '[\w\d/]+'; # short
push @a, '=='; # delimiter
push @a, '[\w\d/]+'; # link
$pattern=join '\s+', map {"($_)"} @a;
print $pattern,"\n";
if (m#$pattern#) { print "$_ $$_\n" foreach 1..9; };

Gerhard

p.s.
Output of that script:
$ perl -n test.pl
M 2005-08-07 18:13 +0000 avenda 1.4 client.c com/cavs/mabas == /art/cvsbase/com/cavs/mabas
(M)\s+([\d-]{10}\s[\d:]{5})\s+(\+\d+)\s+(\w+)\s+([\d\.]+)\s+([\w\.]+)\s+([\w\d/]+)\s+(==)\s+([\w\d/]+)
1 M
2 2005-08-07 18:13
3 +0000
4 avenda
5 1.4
6 client.c
7 com/cavs/mabas
8 ==
9 /art/cvsbase/com/cavs/mabas
M 2005-08-07 17:33 +0000 joe 1.2 client.c com/cavs/mabas == /art/cvsbase/com/cavs/mabas
(M)\s+([\d-]{10}\s[\d:]{5})\s+(\+\d+)\s+(\w+)\s+([\d\.]+)\s+([\w\.]+)\s+([\w\d/]+)\s+(==)\s+([\w\d/]+)
1 M
2 2005-08-07 17:33
3 +0000
4 joe
5 1.2
6 client.c
7 com/cavs/mabas
8 ==
9 /art/cvsbase/com/cavs/mabas
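
As a variation on the commented pattern above (not what was posted), the same field layout could be written as one regex with the /x modifier and named captures, which need Perl 5.10 or later. The capture names and the %field hash are assumptions chosen for this sketch.

```perl
use strict;
use warnings;

# /x lets the comments sit inside the pattern; blanks in the pattern are ignored.
my $pattern = qr{
    ^ M                                   \s+   # start marker
    (?<when>    [\d-]{10} \s [\d:]{5} )   \s+   # date and time
    (?<tz>      \+ \d+ )                  \s+   # timezone
    (?<user>    \w+ )                     \s+   # user
    (?<version> [\d.]+ )                  \s+   # version
    (?<file>    [\w.]+ )                  \s+   # program name
    (?<short>   [\w/]+ )                  \s+   # short path
    ==                                    \s+   # delimiter
    (?<link>    [\w/]+ )                        # link
}x;

my $line = 'M 2005-08-07 18:13 +0000 avenda 1.4 client.c com/cavs/mabas == /art/cvsbase/com/cavs/mabas';

# Copy the named captures out of %+ before another match overwrites them.
my %field;
%field = %+ if $line =~ $pattern;

printf "%s = %s\n", $_, $field{$_}
    for grep { exists $field{$_} } qw(when tz user version file short link);
```

Named captures avoid counting parentheses when a field is added or removed.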
