split on a string that should be interpreted as a regex

Klaus

Hello everybody,

I need the wisdom of the perl community.

I have a perl program that uses the split function as follows:

my ($tag, $value) = split(":\s*", $_, 2);

I want to split on ':', followed by optional whitespace.

When I run the program, I get a warning

Unrecognized escape \s passed through at test.pl line 228.

and the split is performed on the string ":\s", and NOT on the regular
expression /:\s*/.

I was thinking that perl (at least perl 5.8) interprets the first
parameter of the split function always as a regular expression, even
if it is a simple string, i.e. if I say:

split(":\s*", $_, 2)

then perl interprets this automatically and without warning into

split(/:\s*/, $_, 2)

Is my thinking correct? Under perl 5.8? Under perl 5.10?

Unfortunately I have no perl 5.8 available to test
split(":\s*", $_, 2); there.

I am using perl -v
This is perl, v5.10.1 built for MSWin32-x64-multi-thread
 
Klaus

I was thinking that perl (at least perl 5.8) interprets the first
parameter of the split function always as a regular expression, even
if it is a simple string, i.e. if I say:

split(":\s*", $_, 2)

then perl interprets this automatically and without warning into

split(/:\s*/, $_, 2)

[responding to my own post]

I think I've got egg on my face:

perl (that's all of perl 5.8, 5.10 and 5.12) in fact always interprets
the first parameter of the split function as a regular expression,
even if it is a string. The reason I got confused is that a
double-quoted string, of course, undergoes the usual backslash escapes
before it is passed to the split function.

In my case, the correct split would be either with double-backslashes:

split(":\\s*", $_, 2)

or I use single quotes:

split(':\s*', $_, 2)
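
For anyone reading along, here is a minimal sketch comparing the two quoted forms above with a regex literal. The sample line and the variable names are made up for illustration; the point is only that all three spellings hand the same pattern to the regex engine.

```perl
use strict;
use warnings;

# Hypothetical sample input, not taken from the original program.
my $line = "tag:   some value";

# Single quotes: the backslash reaches the regex engine untouched.
my @from_single  = split(':\s*',  $line, 2);

# Double quotes with a doubled backslash: "\\" collapses to one "\".
my @from_doubled = split(":\\s*", $line, 2);

# Regex literal: no string quoting layer involved at all.
my @from_regex   = split(/:\s*/,  $line, 2);

for my $fields (\@from_single, \@from_doubled, \@from_regex) {
    print join(' | ', @$fields), "\n";   # each line prints: tag | some value
}
```

The regex literal is the only one of the three where the pattern is not first parsed as a string.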
 
Klaus

No, the *correct* split would use slashes:
  split(/:\s*/, $_, 2);

Agreed, using slashes is the way to use split, and this is how I have
coded it now.

Thanks.
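
A short sketch of the slash form together with the limit argument, assuming a header-style line whose value itself contains colons (the sample data is invented). It shows why the third argument of 2 matters when only the first ':' should separate tag and value.

```perl
use strict;
use warnings;

# Hypothetical input: the value part contains further colons.
my $line = "Date: Mon, 12:30:00";

# With a limit of 2, split stops after the first separator.
my @limited   = split(/:\s*/, $line, 2);

# Without a limit, every ':' splits the line.
my @unlimited = split(/:\s*/, $line);

print scalar(@limited),   " fields: ", join(' | ', @limited),   "\n";
print scalar(@unlimited), " fields: ", join(' | ', @unlimited), "\n";
# 2 fields: Date | Mon, 12:30:00
# 4 fields: Date | Mon, 12 | 30 | 00
```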
 
Xho Jingleheimerschmidt

Klaus said:
Hello everybody,

I need the wisdom of the perl community.

I have a perl program that uses the split function as follows:

my ($tag, $value) = split(":\s*", $_, 2);

I want to split on ':', followed by optional whitespace.

When I run the program, I get a warning

Unrecognized escape \s passed through at test.pl line 228.

That warning seems bizarre to me. It is the literal s, and not the
escape, that got "passed through".

Xho
 
Alan Curry

No it's not. First the string undergoes qq expansion: this sees "\s",
doesn't recognize it, and passes it through unchanged (with a warning).
The result, ':\s*', is then passed to the regex compiler by split, which
quite happily interprets \s as 'space'.

Really? Not for me:

$ perl -wle 'print length "\s"'
Unrecognized escape \s passed through at -e line 1.
1
$ perl -wle 'print for split "\s", "Mississippi"'
Unrecognized escape \s passed through at -e line 1.
Mi

i

ippi
$
The fact this is so confusing is a very good reason to avoid
double-quoting the split pattern in the first place :).

That part is true.
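
To tie this back to the original ':' case, here is a rough sketch of what the double-quoted pattern ends up doing. The input "tag:success" is a made-up example chosen so that the value starts with an "s"; the `no warnings 'misc'` line is only there to quiet the "Unrecognized escape" warning for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical input whose value begins with the letter "s".
my $line = "tag:success";

my $pattern;
{
    no warnings 'misc';   # quiet "Unrecognized escape \s passed through"
    $pattern = ":\s*";    # double-quoted: the backslash is dropped, leaving ":s*"
}

print "pattern: ", $pattern, "\n";   # pattern: :s*

# split now sees /:s*/, i.e. a colon followed by any number of literal "s".
my ($tag, $value) = split($pattern, $line, 2);
print "tag: ", $tag, "\n";           # tag: tag
print "value: ", $value, "\n";       # value: uccess
```

So the leading "s" of the value is swallowed as part of the separator, which matches the Mississippi output above.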
 
