regex problem

maheshpop1

Hi folks,

A little regex problem, and I need a little help with the solution. I tried
and might have missed something here.

STRING:
O 2005-06-14 14:43 +0000 pop commain/com/testdata =All/com/testdata= /home/user/temp/All/testdata/

Here is the regex
if ($line =~ /(^[O])[\s]*([^+]+)[+][0]+[\s]*([\w]+)[\s]*([^=]+)[\s]*=([^=]+)[\s]*=[\s]*([^ ]+)/ ))

I am trying to capture the data in the parentheses; however, it doesn't
seem to work.
So, can anyone evaluate this? Perhaps a fresh look might
catch something that I missed.

cheers
POP.
 
Jürgen Exner

STRING:
O 2005-06-14 14:43 +0000 pop commain/com/testdata =All/com/testdata= /home/user/temp/All/testdata/ [...]
I am trying to capture the data in the parentheses.

I don't see any parentheses in your sample data.

jue
 
maheshpop1

Oh sorry,

Here is the sample data with the parentheses.

STRING:

(O) (2005-06-14) 14:43 +0000 (pop) (commain/com/testdata) =(All/com/testdata)= (/home/user/temp/All/testdata/)

Here is the regex
if ($line =~ /(^[O])[\s]*([^+]+)[+][0]+[\s]*([\w]+)[\s]*([^=]+)[\s]*=([^=]+)[\s]*=[\s]*([^ ]+)/ ))

cheers
POP
 
Paul Lalli

Hi folks,

A little regex problem, need a lil help with the solution. I tried and
might have missed out something here.

STRING:
O 2005-06-14 14:43 +0000 pop commain/com/testdata =All/com/testdata= /home/user/temp/All/testdata/

Here is the regex
if ($line =~ /(^[O])[\s]*([^+]+)[+][0]+[\s]*([\w]+)[\s]*([^=]+)[\s]*=([^=]+)[\s]*=[\s]*([^ ]+)/ ))


That regular expression is insanely unreadable. WHY on earth is every
single token inside a character class? I see exactly four of those [ ]
that are actually needed: the ones that start with a ^. Get rid of
every other [ ] in that regular expression, and then reformat it using
the /x modifier and some prudent use of whitespace. If you haven't
figured out your problem after that, post the modified version.

Paul Lalli
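
For readers following along, here is a minimal sketch of what that cleanup might look like. It assumes the sample record sits on a single line, and that only the date (not the time) is wanted, as the parenthesized sample in this thread suggests. The field labels in the comments are guesses for illustration, not something the original poster confirmed.

```perl
use strict;
use warnings;

# Sample record from the thread, assumed to be one physical line.
my $line = 'O 2005-06-14 14:43 +0000 pop commain/com/testdata '
         . '=All/com/testdata= /home/user/temp/All/testdata/';

# With /x, whitespace and comments inside the pattern are ignored,
# so each piece of the record can sit on its own line.
my @fields = $line =~ m{
    ^ (O)          \s+   # leading type letter
    (\S+)          \s+   # date
    \S+            \s+   # time, matched but not captured
    \+0+           \s+   # literal plus followed by zeros, not captured
    (\w+)          \s+   # third wanted field
    (\S+)          \s+   # fourth wanted field
    = ([^=]+) =    \s+   # text between the two equals signs
    (\S+)                # final field
}x;

print join('|', @fields), "\n";
```

Only the negated class [^=] survives; everything else is a plain \s, \S, \w or an escaped literal.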
 
niall.macpherson

Here is the regex
if ($line =~ /(^[O])[\s]*([^+]+)[+][0]+[\s]*([\w]+)[\s]*([^=]+)[\s]*=([^=]+)[\s]*=[\s]*([^ ]+)/ ))

I don't understand why you have all the open / close square brackets.

To match any number of spaces you just need \s* , not [\s]*

It's not clear exactly what you want, since you have posted only one
possible line of data. If you just want to extract everything between
parentheses and you don't know how many sets of parentheses there are,
the following should work (provided there are no nested parentheses):

#--------------------------------------------------------------------------------
use strict;
use warnings;
use Data::Dumper;

# Sample line with the wanted fields marked by parentheses
my $teststr = '(O) (2005-06-14) 14:43 +0000 (pop) (commain/com/testdata)';
$teststr .= ' =(All/com/testdata)= ';
$teststr .= ' (/home/user/temp/All/testdata/)';

# Collect the text inside each (non-nested) pair of parentheses
my @results = ();
while ($teststr =~ /\((.*?)\)/g)
{
    push @results, $1;
}
print Dumper @results;

#-----------------------------------------------

C:\develop\NiallPerlScripts>clpm16.pl
$VAR1 = 'O';
$VAR2 = '2005-06-14';
$VAR3 = 'pop';
$VAR4 = 'commain/com/testdata';
$VAR5 = 'All/com/testdata';
$VAR6 = '/home/user/temp/All/testdata/';


If you want a regex that matches, though, you will have to define more
specifically what the data looks like.

Hope this helps
 
maheshpop1

Hi

It's actually multiple lines of data like the ones below, and I need to
extract the fields which I manually highlighted in parentheses in the
first string as an example.

STRING ex:
(O) (2005-06-14) 14:43 +0000 (pop) (commain/com/testdata) =(All/com/testdata)= (/home/user/temp/All/testdata/)
M 2005-06-14 14:43 +0000 pop commain/com/testdata =All/com/testdata= /home/user/temp/All/testdata/
E 2005-06-14 14:43 +0000 pop commain/com/testdata =All/com/testdata= /home/user/temp/All/testdata/

Thanks, Niall, for the tip. I reformatted the regex:

if($line=~(^(O|E|M) \s* ([^+]+) +0+ \s* (\w+) \s* ([^=]+) \s*= ([^=]+) \s*=\s* ([^ ]+)/ ))

Could someone kindly point me to a good regex tutorial with samples?
This actually reads a text file full of lines like the above and extracts
the relevant data (the data in parentheses in the first example line).

Thanks for the assistance, guys.
pop
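
A possible sketch of the file-reading loop described above, offered under a few assumptions: each record is one line, the type letter is O, M or E, and only the date is wanted from the timestamp. Note that the pattern as posted has no opening slash and no /x flag, and its bare + before the zeros is a quantifier rather than a literal plus sign; the version below escapes it. The in-memory filehandle and the variable names are placeholders standing in for the real text file.

```perl
use strict;
use warnings;

# Three sample records from the thread, one per line (assumed layout).
my $text = <<'END';
O 2005-06-14 14:43 +0000 pop commain/com/testdata =All/com/testdata= /home/user/temp/All/testdata/
M 2005-06-14 14:43 +0000 pop commain/com/testdata =All/com/testdata= /home/user/temp/All/testdata/
E 2005-06-14 14:43 +0000 pop commain/com/testdata =All/com/testdata= /home/user/temp/All/testdata/
END

# Hypothetical stand-in for the real input file: read from the string above.
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

# Compile once; /x lets the pattern be spaced out for readability.
my $record_re = qr/^([OEM]) \s+ (\S+) \s+ \S+ \s+ \+0+ \s+ (\w+) \s+ (\S+) \s+ =([^=]+)= \s+ (\S+)/x;

my @records;
while (my $line = <$fh>) {
    chomp $line;
    # In list context the match returns the six captured fields.
    if (my @fields = $line =~ $record_re) {
        push @records, \@fields;
    }
}
close $fh;

print join(' | ', @$_), "\n" for @records;
```

Lines that do not fit the pattern are silently skipped here; whether that is the right behaviour depends on what else the real file contains.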
 
Csaba

(e-mail address removed) wrote in
Oh sorry,

Here is the sample data with the parentheses.

STRING:

(O) (2005-06-14) 14:43 +0000 (pop) (commain/com/testdata) =(All/com/testdata)= (/home/user/temp/All/testdata/)

Here is the regex
if ($line =~ /(^[O])[\s]*([^+]+)[+][0]+[\s]*([\w]+)[\s]*([^=]+)[\s]*=([^=]+)[\s]*=[\s]*([^ ]+)/ ))

cheers
POP

Maybe you should try Text::Balanced, especially extract_bracketed

http://search.cpan.org/~dconway/Text-Balanced-1.97/lib/Text/Balanced.pm
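
As a rough illustration of that suggestion, the sketch below calls extract_bracketed in a loop. It only applies if the parentheses are literally present in the data, which may not be the case here. The shortened sample string and the prefix pattern '[^(]*' (skip everything up to the next opening parenthesis) are choices made for this example.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Shortened sample with literal parentheses (assumed input).
my $text = '(O) (2005-06-14) 14:43 +0000 (pop)';

my @found;
while (length $text) {
    # In list context: extracted substring (brackets included), then the remainder.
    my ($extracted, $remainder) = extract_bracketed($text, '()', '[^(]*');
    last unless defined $extracted && length $extracted;
    push @found, $extracted;
    $text = $remainder;
}

print "$_\n" for @found;
```

Unlike the non-greedy /\((.*?)\)/ approach, this also copes with nested parentheses, at the cost of pulling in a module.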
 
Xicheng Jia

David said:
Buy the book "regular expressions", by Friedl, pub by O'Reilly,
gotten much cheaper from www.bookpool.com.
It is "the" reference!

It should be: "Mastering Regular Expressions", 2nd Edition, by Jeffrey
E. F. Friedl.
Publisher : O'Reilly
Pub Date : July 2002
ISBN : 0-596-00289-0
Pages : 484

This is the "bible" for understanding the engine behind regexes.

Also, there are several very nice papers about using regex in the
following book:
"Computer Science & Perl Programming: Best of TPJ", by Jon Orwant

Xicheng :)
 
Alan J. Flavell

On Mon, 28 May 2006, Xicheng Jia wrote:

[...]
Also, there are several very nice papers about using regex in the
following book: "Computer Science & Perl Programming: Best of TPJ",
by Jon Orwant

In addition to the book recommendations, I'd recommend getting the
PCRE (perl-compatible regular expressions) package, including its
"pcretest" facility, and using it to play around with regexes and
patterns on-line. http://www.pcre.org/
 
DJ Stunks

Csaba said:
(e-mail address removed) wrote in
Oh sorry,

Here is the sample data with the parentheses.

STRING:

(O) (2005-06-14) 14:43 +0000 (pop) (commain/com/testdata) =(All/com/testdata)= (/home/user/temp/All/testdata/)

Here is the regex
if ($line =~ /(^[O])[\s]*([^+]+)[+][0]+[\s]*([\w]+)[\s]*([^=]+)[\s]*=([^=]+)[\s]*=[\s]*([^ ]+)/ ))

cheers
POP

Maybe you should try Text::Balanced, especially extract_bracketed

http://search.cpan.org/~dconway/Text-Balanced-1.97/lib/Text/Balanced.pm

This is really just speculation, but I don't think his real data has
the parentheses in it; I think he just put those in to show what fields
he wanted collected...

-jp
 
