Regex matching question

  • Thread starter Alexander Stremitzer

Alexander Stremitzer

I want to be able to break a string into 2 parts. One part should be
everything before "/", the second one everything after "/" or empty if
there is no "/".
Here is some sample input data and what I would expect.
"SKULL"
"SKULL",""
"SKULL /LEFT"
"SKULL","LEFT"
"SKULL /FULL /LEFT"
"SKULL","FULL /LEFT"

I have tried with pattern matching but can't get it to work. Any help is
appreciated.

$full_desc = "SKULL";
$full_desc =~ m/(\w*)\/(\w*)/;
print "string1: $1\n";
print "string2: $2\n";
 

John W. Krahn

Alexander said:
I want to be able to break a string into 2 parts. One part should be
everything before "/", the second one everything after "/" or empty if
there is no "/".
Here is some sample input data and what I would expect.
"SKULL"
"SKULL",""
"SKULL /LEFT"
"SKULL","LEFT"
"SKULL /FULL /LEFT"
"SKULL","FULL /LEFT"

I have tried with pattern matching but can't get it to work. Any help is
appreciated.


$ perl -le'
for ( "SKULL", "SKULL /LEFT", "SKULL /FULL /LEFT" ) {
my ( $one, $two ) = m| (\w+) \s* /? \s* (\b.*\b) |x;
print qq/ "$one", "$two" /;
}
'
"SKULL", ""
"SKULL", "LEFT"
"SKULL", "FULL /LEFT"



John
 

Sam Holden

I want to be able to break a string into 2 parts. One part should be
everything before "/", the second one everything after "/" or empty if
there is no "/".
Here is some sample input data and what I would expect.
"SKULL"
"SKULL",""
"SKULL /LEFT"
"SKULL","LEFT"
"SKULL /FULL /LEFT"
"SKULL","FULL /LEFT"

Your data doesn't match your description. Everything before "/" includes
the space, but your sample data does not.

my ($string1, $string2) = split m|\s*/|, $full_desc, 2;
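A minimal sketch of how that split line behaves on the three sample inputs. The helper name split_desc is hypothetical, and defaulting a missing second part to the empty string is an assumption taken from the expected output in the question:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): split on the first "/" only,
# discarding any whitespace directly in front of it.
sub split_desc {
    my ($full_desc) = @_;
    my ($string1, $string2) = split m|\s*/|, $full_desc, 2;

    # split returns fewer fields when there is no "/" (or the input is empty),
    # so replace undefined parts with the empty string.
    for ($string1, $string2) {
        $_ = '' unless defined $_;
    }
    return ($string1, $string2);
}

for my $desc ("SKULL", "SKULL /LEFT", "SKULL /FULL /LEFT") {
    my ($string1, $string2) = split_desc($desc);
    print qq{"$string1","$string2"\n};
}
```

The limit of 2 is what keeps "FULL /LEFT" in one piece: split stops after the first separator and returns the rest of the string as the last field.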
I have tried with pattern matching but can't get it to work. Any help is
appreciated.

$full_desc = "SKULL";
$full_desc =~ m/(\w*)\/(\w*)/;
print "string1: $1\n";
print "string2: $2\n";

Clearly that regex requires a / character to appear in the string,
violating your problem description. You should also never use $1 and $2
(and the rest) variables unless you know the regex has succeeded.
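One way to follow that advice is to test the match itself and only read $1 and $2 in the branch where it succeeded. This is a sketch, not the only idiom; the sub name match_parts is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the two captures, or an empty list
# when the pattern does not match at all.
sub match_parts {
    my ($full_desc) = @_;
    if ( $full_desc =~ m/(\w*)\/(\w*)/ ) {
        return ($1, $2);    # safe: the match succeeded
    }
    return;                 # no "/" in the string, so no captures to report
}

for my $desc ("SKULL", "SKULL/LEFT") {
    my @parts = match_parts($desc);
    if (@parts) {
        print "string1: $parts[0]\n";
        print "string2: $parts[1]\n";
    }
    else {
        print "no match for '$desc'\n";
    }
}
```

After a failed match, $1 and $2 keep whatever an earlier successful match left in them, which is why printing them unconditionally can show stale data.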

And \w isn't what your description said: \w doesn't match "everything".
It also doesn't match '/', which your sample data says should be
matched.
 

Alexander Stremitzer

Sam said:
Your data doesn't match your description. Everything before "/" includes
the space, but your sample data does not.

my ($string1, $string2) = split m|\s*/|, $full_desc, 2;
You are correct, I overlooked the space. Sorry for my sloppy
description. Your solution does exactly what I wanted. Thanks.

I am not sure what the "split m|\s*/|" does compared to "split |\s*/|".
I was reading the paragraph below at
http://www.perl.com/pub/a/2002/06/04/apo5.html?page=9

But it's vitally important to understand this fundamental change, that
// is no longer a short form of m//, but rather a short form of rx//. If
you want to add modifiers to a //, you have to turn it into an rx//, not
an m//. It's now wrong to call split like this:

split m/.../

(That is, it's wrong unless you actually want the return value of the
pattern match to be used as the literal split delimiter.)
Clearly that regex requires a / character to appear in the string,
violating your problem description. You should also never use $1 and $2
(and the rest) variables unless you know the regex has succeeded.
Appreciate the hint. It caused me to rewrite some of my existing code.
And \w isn't what your description said: \w doesn't match "everything".
It also doesn't match '/', which your sample data says should be
matched.
Again correct. But I did not know how to match against everything except
/. So I used \w as a first attempt.
 

Ben Morrow

Alexander Stremitzer said:
I am not sure what the "split m|\s*/|" does compared to "split |\s*/|".

'split |\s*/|' is a syntax error, 'split m|\s*/|' is not. :)

/.../ is equivalent to m/.../; this is *only* true if the delimiters
are //. See perldoc perlop "Quote and Quote-like Operators".
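To illustrate the delimiter rule, here is a small sketch with the same pattern written both ways. The variable names are made up for the example:

```perl
use strict;
use warnings;

my $full_desc = "SKULL /FULL /LEFT";

# With // as delimiters the leading m is optional, but a literal "/"
# inside the pattern has to be escaped.
my @with_slashes = split /\s*\//, $full_desc, 2;

# With any other delimiter the m is required, and "/" needs no escaping.
my @with_pipes = split m|\s*/|, $full_desc, 2;

print join(",", map { qq{"$_"} } @with_slashes), "\n";
print join(",", map { qq{"$_"} } @with_pipes), "\n";
```

Both calls return the same two fields; the choice of delimiter only affects readability.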
I was reading the paragraph below at
http://www.perl.com/pub/a/2002/06/04/apo5.html?page=9

But it's vitally important to understand this fundamental change, that
// is no longer a short form of m//, but rather a short form of rx//. If
you want to add modifiers to a //, you have to turn it into an rx//, not
an m//. It's now wrong to call split like this:

split m/.../

(That is, it's wrong unless you actually want the return value of the
pattern match to be used as the literal split delimiter.)

Whoa there... this is about Perl6. Don't go reading that if you don't
understand Perl5 yet :). Perl6 is the next version of the Perl
language; a usable version of it is unlikely to be released for
several years.

What this is saying is that what in Perl5 (i.e. current Perl) is
written

split m|\s*/|

will in Perl6 be written

split rx|\s*/|

But this does not need to concern you :).
Again correct. But I did not know how to match against everything except
/. So I used \w as a first attempt.

The general answer to this is [^/] (a character class matching
everything except '/'); but this is not usually the answer you are
looking for. It is nearly always better to use .*? somewhere instead.
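As a rough comparison of the two approaches, the sketch below applies a negated character class and a non-greedy .*? to one of the sample strings. The anchors and variable names are assumptions added for the example:

```perl
use strict;
use warnings;

my $full_desc = "SKULL /FULL /LEFT";

# Negated character class: the first capture cannot contain "/".
my ($head1, $tail1) = $full_desc =~ m{^([^/]*)/(.*)$};

# Non-greedy quantifier: the first capture stops at the earliest "/"
# that lets the rest of the pattern match.
my ($head2, $tail2) = $full_desc =~ m{^(.*?)/(.*)$};

print qq{"$head1","$tail1"\n};
print qq{"$head2","$tail2"\n};
```

On this input both patterns capture the same text, including the space before the first "/"; neither matches a string that has no "/" at all, so the list assignment should be checked before the variables are used on arbitrary input.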

Ben
 
