Regex Question

Chip

Hi and thanks for your help.

I'm trying to parse the following:

Customer[Smith,John]Type[New]Source[Phone]Phone[5551212]

Since the brackets are metacharacters in Perl, I thought I
could just escape them with the following:

if (/Customer\[(.+),(.+)\]/) { # parse out values
    $last = $1;
    $first = $2;
}

But this does not work. How should I escape the '[' and ']'
characters?

Thanks
Chip
 
Pedro Graca

Chip said:
Since the brackets are metacharacters in perl I thought I
could just escape them with the following:

if (/Customer\[(.+),(.+)\]/) { # parse out values

Use negated character classes instead:

if (/Customer\[([^,]+),([^\]]+)\]/) { # parse out values
    $last = $1;
    $first = $2;
}

Chip said:
But this does not work. How should I escape the '[' and ']'
characters

Your problem is that the second (.+) is greedy, so it matches
"John]Type[..." up to the last ] in the string.
 
Glenn Jackman

Chip said:
Customer[Smith,John]Type[New]Source[Phone]Phone[5551212]

Since the brackets are metacharacters in perl I thought I
could just escape them with the following:

if (/Customer\[(.+),(.+)\]/) { # parse out values
$last = $1;
$first = $2;
}

But this does not work. How should I escape the '[' and ']'
characters

It does work, just not as you'd like. Your subexpressions are too
greedy.

my $s = 'Customer[Smith,John]Type[New]Source[Phone]Phone[5551212]';
if ($s =~ /Customer\[(.+),(.+)\]/) {print "$1\n$2\n"}

results in:
Smith
John]Type[New]Source[Phone]Phone[5551212

Try
/Customer\[(.+?),(.+?)\]/


You might also try:
my %data = split /[][]/, $s;
my ($last, $first) = split /,/, $data{Customer};
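
For anyone comparing the suggestions so far, here is a minimal sketch that runs the greedy, non-greedy, and negated-class patterns side by side on Chip's string, followed by the split-into-hash approach. The variable names ($g_last, $l_last, $n_last and so on) are made up for this illustration.

```perl
use strict;
use warnings;

my $s = 'Customer[Smith,John]Type[New]Source[Phone]Phone[5551212]';

# Greedy: the second (.+) grabs as much as it can, so it runs to the last ].
my ($g_last, $g_first) = $s =~ /Customer\[(.+),(.+)\]/;

# Non-greedy: each (.+?) stops as soon as the rest of the pattern can match.
my ($l_last, $l_first) = $s =~ /Customer\[(.+?),(.+?)\]/;

# Negated classes: the captures cannot cross a comma or a closing bracket.
my ($n_last, $n_first) = $s =~ /Customer\[([^,]+),([^\]]+)\]/;

print "greedy:     $g_last / $g_first\n";
print "non-greedy: $l_last / $l_first\n";
print "negated:    $n_last / $n_first\n";

# Splitting on either bracket yields alternating keys and values.
my %data = split /[][]/, $s;
print "$_ => $data{$_}\n" for sort keys %data;
```

The negated-class version never has to backtrack over the rest of the line, which is one reason some people prefer it to .+? here.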
 
Chip

Thanks Glenn,

that works perfectly.

Chip
 
John W. Krahn

Chip said:
I'm trying to parse the following :

Customer[Smith,John]Type[New]Source[Phone]Phone[5551212]

Since the brackets are metacharacters in perl I thought I
could just escape them with the following:

if (/Customer\[(.+),(.+)\]/) { # parse out values
$last = $1;
$first = $2;
}

But this does not work. How should I escape the '[' and ']'
characters

You could always use a hash and do something like this:

$ perl -e'use Data::Dumper;
my $string = "Customer[Smith,John]Type[New]Source[Phone]Phone[5551212]";
my %data = map {
    my $x;
    /,/ ? { map { ( $x++ ? "First" : "Last" ), $_ } split /,/, $_, 2 }
        : $_
} split /[\[\]]/, $string;
print Dumper( \%data );
'
$VAR1 = {
          'Customer' => {
                          'Last' => 'Smith',
                          'First' => 'John'
                        },
          'Phone' => '5551212',
          'Source' => 'Phone',
          'Type' => 'New'
        };


:)

John
 
Bill

Glenn Jackman said:
my $s = 'Customer[Smith,John]Type[New]Source[Phone]Phone[5551212]';

You might also try:
my %data = split /[][]/, $s;
my ($last, $first) = split /,/, $data{Customer};

I am a bit surprised by this, since I would have written the regex
bracket expression like this:

/[\]\[]/

And I would like to know why your way does not split on two empty
character classes.
Is this just Perl DWIM?
 
Glenn Jackman

Bill said:
Glenn Jackman said:
my %data = split /[][]/, $s;

I am a bit surprised by this, since I would have written the regex
bracket expression like this:

/[\]\[]/

And would like to know why your way does not split on two empty
character classes?

Because inside a character class, [ is not special, and if ] is the
first character (or the second character, if you negate the class
with ^), then it's a literal ] and not the end of the class.
 
Ben Morrow

Glenn Jackman said:
my $s = 'Customer[Smith,John]Type[New]Source[Phone]Phone[5551212]';

You might also try:
my %data = split /[][]/, $s;
my ($last, $first) = split /,/, $data{Customer};

I am a bit surprised by this, since I would have written the regex
bracket expression like this:

/[\]\[]/

And I would like to know why your way does not split on two empty
character classes.
Is this just Perl DWIM?

Well... if ] is the first thing after a [, it is taken as part of the
class rather than as closing it. It wouldn't work with, say, /[[]]/,
which matches [ followed by ].

Ben
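
The bracket rules described in the two replies above can be checked with a short sketch like the following. The test string 'a[b]c' and the variable names are invented for the example.

```perl
use strict;
use warnings;

my $s = 'a[b]c';

# A ] placed first in the class is literal, and [ is never special inside one,
# so both spellings describe the same two-character class.
my @unescaped = split /[][]/, $s;
my @escaped   = split /[\]\[]/, $s;

# After a leading ^ the ] is still literal: capture a run of non-bracket characters.
my ($run) = $s =~ /([^][]+)/;

# /[[]]/ is the class [[] (only '[') followed by a literal ']'.
my $pair  = '[]' =~ /[[]]/ ? 1 : 0;
my $alone = ']'  =~ /[[]]/ ? 1 : 0;

print join(',', @unescaped), "\n";
print join(',', @escaped), "\n";
print "first run: $run\n";
print "'[]' matches: $pair, ']' alone matches: $alone\n";
```

Escaping both brackets, as in /[\]\[]/, behaves the same and is arguably easier to read at a glance.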
 
max

How do I deal with escaped backslashes in patterns?
I hope there is some light out there...
I want to match (double quote, anything, double quote) followed by an
equals sign, like ("something"=something).

Here are some of the problem patterns:

"=\"=\"=\""="=\"=\"=\"" "\\\\\\\\\\"="\\\\\\\\\\" "==="="==="

This is what I have so far.
I can't seem to require that there is no \ before "= unless there are two \\.
I can handle any one of these patterns, but not all of them!

m/
^" # beginning of line, double quote
(.*? # anything, minimal match
)
(?:(?<!\\) # not preceded by a backslash
"=) # a double quote followed by an equals sign
(.*) # anything after
/x ;
or
m/
^" # beginning of line, double quote
(.*? # anything, minimal match
)
(?: # non-capturing group (no backslash check here)
"=) # a double quote followed by an equals sign
(.*) # anything after
/x ;

Sorry that this is such an unclear mess.
Thanks
Max
 
Brian McCauley

How do I deal with escaped backslash in patterns?

This is covered in the FAQ "How can I split a [character] delimited
string except when inside [character]?"

 
Hobbit HK

If I understood you right, you want to match something like ("X"=X)?
If so, I think you need to use:
m/\("(.*?)"=\1\)/

Tell me if I misunderstood...
 
max

Hobbit HK said:
If I understood you right, you want to match something like ("X"=X)?
So I think you need to use:
m/\("(.*?)"=\1\)/

Tell me if I misunderstood...

Sorry, it was not so clear; I will try again...
I am trying to split a pattern like "anything"=something_else,
with the rule that if anything or something_else contains a " or
a \, that character is escaped with a \.
So (any\thing) would become (any\\thing), and (anyth"ing) would become
(anyth\"ing).
So I think I am trying to match anything (where the " is not escaped),
then "=, and then some more anything.
Or, to put it another way: a "= preceded by an even number of
backslashes (including none) is a real delimiter, while a "= preceded by
an odd number of backslashes is not, because the " is escaped.

Maybe I am looking at this from the wrong angle!
I did have a look at:
"How can I split a [character] delimited
string except when inside [character]?"
It seems to be something like m/"([^"\\]*(\\.[^"\\]*)*)"|([^,]+)/g
m/
"
(
[^"\\]* # any character except " or \, 0 or more times
(\\. [^"\\]*) # a \ plus any one character, then non-" non-\ characters
*) # that group 0 or more times
"
| # or
(
[^,]+) # one or more non-comma characters
/gx;
I don't see how this helps, and the FAQ entry is no better at
explaining it. Sorry.
Max
 
max

After a few sleepless nights,
I think this is what I was looking for!
It checks that there are zero or more pairs of backslashes, with no
other backslash before them, in front of the field delimiter, which in
this case is the double quote, equals sign character pair.

(.*?)((?:[^\\](?:[\\]{2})*)"=)(.*)

This should split any pattern with escaping backslashes at the "=
delimiter. Note that $2 also holds the character before the backslash
run and the run itself, so the first field is $1 plus the start of $2,
and the second field is $3.

I don't know if it is the best way of doing this; let me know if you
have any other ideas for dealing with escape characters.

Thanks anyway
Max
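
One other idea, offered only as a sketch: instead of counting backslashes in front of the delimiter, consume the first field one unit at a time, the way the FAQ pattern does, where a unit is either an ordinary character or a backslash plus whatever follows it. The closing quote is then the first " that is not part of an escape. The helper name split_pair is made up here, and the sketch assumes one "key"=value pair per line with the value kept as raw text.

```perl
use strict;
use warnings;

# The three sample pairs from the original question, one per line.
# A single-quoted heredoc keeps every backslash literal.
my @lines = split /\n/, <<'END';
"=\"=\"=\""="=\"=\"=\""
"\\\\\\\\\\"="\\\\\\\\\\"
"==="="==="
END

# Hypothetical helper: split '"key"=value' at the first unescaped quote.
# Returns an empty list when the line does not have that shape.
sub split_pair {
    my ($line) = @_;
    # Key: any run of (non-quote, non-backslash) characters or backslash escapes.
    return unless $line =~ /^"((?:[^"\\]|\\.)*)"=(.*)$/s;
    return ($1, $2);
}

for my $line (@lines) {
    my ($key, $value) = split_pair($line);
    print "key:   $key\nvalue: $value\n\n";
}
```

For the first sample this gives the key =\"=\"=\" and the value "=\"=\"=\"". Because an escape is always consumed as a pair, an odd run of backslashes before a quote can never end the key.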
 
