RegEx, how to separate word from digits?

kirknew2Reg

I have a column that contains "suite 111" and "suite222". I need a $
variable containing the word part and another $ variable containing the
digit part. I have tried variations on this syntax:
(\w*)(\d*)(.*)
(\w*)(\s?)(\d*)(.*)
But nothing I have tried separates the word from the digits when there
is no space. How do I get 'suite222' to break into separate
variables?
 
smallpond

How about: /([[:alpha:]]*)\s*(\d*)/
 
Florian Kaufmann

Similar to smallpond's answer, but this enforces that the word part and
the digit part are each at least one character long. Otherwise, things
like "3", "x", or even the empty string "" also match.

my ($word,$digit) = /([[:alpha:]]+)\s*(\d+)/;
# do something with $word and $digit
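
To see what the + quantifiers buy, here is a small sketch that runs both patterns over a few degenerate inputs. The helper names split_strict and split_loose and the sample strings are made up for illustration; they are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (word, digits), or an empty list on no match.
sub split_strict {
    my ($text) = @_;
    return $text =~ /([[:alpha:]]+)\s*(\d+)/;
}

# Same pattern with * instead of +: every part is optional,
# so it succeeds on any input, including the empty string.
sub split_loose {
    my ($text) = @_;
    return $text =~ /([[:alpha:]]*)\s*(\d*)/;
}

for my $text ("suite 111", "suite222", "3", "x", "") {
    my @strict = split_strict($text);
    my @loose  = split_loose($text);
    printf "%-12s strict: %-12s loose: %s\n",
        "'$text'",
        (@strict ? join("|", @strict) : "no match"),
        (@loose  ? join("|", @loose)  : "no match");
}
```

In list context a successful match returns its captures, so an empty list from split_strict is a reliable "no match" signal, whereas split_loose can hand back empty strings for either part.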
 
kirknew2Reg

Thanks for the help. The solution worked.
 
Greg Bacon

Another handy trick is the double-negative:

$ cat try
#! /usr/bin/perl

for ("suite 111", "suite222") {
    if (/^([^\W\d]+)\s*(\d+)$/) {
        print "$_: $1 - $2\n";
    }
    else {
        print "$_: no match\n";
    }
}

$ ./try
suite 111: suite - 111
suite222: suite - 222

It's easy to forget that \w matches both alphabetic characters
and numeric characters. (Don't forget about the poor underscore!)
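
That overlap is why the original attempt fails on 'suite222'. A minimal sketch, using the first pattern from the question and one letters-only alternative (the variable names are just for illustration):

```perl
use strict;
use warnings;

# The original attempt: \w* is greedy and \w also matches digits,
# so the first group takes the whole of "suite222".
my @greedy = "suite222" =~ /(\w*)(\d*)(.*)/;
print "greedy \\w*:   [", join("|", @greedy), "]\n";   # [suite222||]

# Restricting the first group to letters leaves the digits for group two.
my @letters = "suite222" =~ /([[:alpha:]]*)\s*(\d*)/;
print "[[:alpha:]]*: [", join("|", @letters), "]\n";   # [suite|222]
```

Because (\d*) and (.*) are both allowed to match nothing, the regex engine never has a reason to backtrack and give any digits back.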

Written out longhand, the pattern [^\W\d] is

NOT [ (NOT a word character) OR (a digit) ]

Apply DeMorgan's theorem to see that this is equivalent to "a
word character that isn't a digit."
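
The equivalence can be checked by brute force. This sketch only covers the ASCII range (an assumption made to keep it short) and compares the double-negative class against the spelled-out condition for each character:

```perl
use strict;
use warnings;

# Compare [^\W\d] with "word character AND NOT digit" for every
# ASCII character, counting any disagreement.
my $mismatches = 0;
my @matched;
for my $code (0 .. 127) {
    my $ch = chr $code;
    my $double_negative = $ch =~ /\A[^\W\d]\z/ ? 1 : 0;
    my $spelled_out     = ($ch =~ /\A\w\z/ && $ch !~ /\A\d\z/) ? 1 : 0;
    $mismatches++ if $double_negative != $spelled_out;
    push @matched, $ch if $double_negative;
}
print "mismatches: $mismatches\n";
print "matched: ", join("", @matched), "\n";
```

The matched set is the ASCII letters plus the underscore, so a value like "suite_1" would put "suite_" in the word part with this class.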

Maybe our English teachers didn't know so much after all!

Hope this helps,
Greg
 
Dr.Ruud

smallpond wrote:

How about: /([[:alpha:]]*)\s*(\d*)/

To keep up the POSIX-style:

/([[:alpha:]]+)[[:blank:]]*([[:digit:]]+)/

And [[:blank:]] matches fewer characters than \s.

And [[:alpha:]] can of course be obscured as [^\W\d_].
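
A short sketch of where the two whitespace classes differ. The sample characters and the "suite\n222" string are chosen for illustration only:

```perl
use strict;
use warnings;

# Each entry is [ label, character ].
my @samples = (
    [ "space",   " "  ],
    [ "tab",     "\t" ],
    [ "newline", "\n" ],
    [ "return",  "\r" ],
);

for my $sample (@samples) {
    my ($name, $ch) = @$sample;
    my $is_space = $ch =~ /\A\s\z/          ? "yes" : "no";
    my $is_blank = $ch =~ /\A[[:blank:]]\z/ ? "yes" : "no";
    printf "%-8s \\s: %-3s  [[:blank:]]: %s\n", $name, $is_space, $is_blank;
}

# With a line break between the parts, only the \s* version matches.
my $two_lines = "suite\n222";
print "\\s* version matches\n"
    if $two_lines =~ /([[:alpha:]]+)\s*(\d+)/;
print "[[:blank:]]* version does not match\n"
    unless $two_lines =~ /([[:alpha:]]+)[[:blank:]]*([[:digit:]]+)/;
```

For a single column value this rarely matters, but [[:blank:]] is the safer choice if the word and the digits must sit on the same line.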
 
Ted Zlatanov

Just make the \w match non-greedy. The last test case below is
questionable, but the OP didn't specify what to do in that case.

There's also the "match against the reversed string and reverse the
matches" approach :)

Ted

for ("suite 111", "suite222", " 111", "222")
{
    if (/^(\w+?)\s*(\d+)$/) # match \w characters conservatively
    {
        print "$_: $1 - $2\n";
    }
    else
    {
        print "$_: no match\n";
    }
}

-->
suite 111: suite - 111
suite222: suite - 222
 111: no match
222: 2 - 22
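
The reversed-string approach mentioned above was not spelled out in the post. One possible reading of it, with a hypothetical helper name split_reversed, is sketched here: reverse the input, let a greedy \d+ claim the trailing digits first, then reverse each capture back.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating the reverse-match idea.
# Returns (word, digits), or an empty list on no match.
sub split_reversed {
    my ($text) = @_;
    my $backwards = scalar reverse $text;

    # On the reversed string the digits come first, so a greedy \d+
    # takes them before \w+ gets a chance to.
    my ($digits, $word) = $backwards =~ /^(\d+)\s*(\w+)$/
        or return;

    return (scalar reverse($word), scalar reverse($digits));
}

for my $text ("suite 111", "suite222", " 111", "222") {
    my @parts = split_reversed($text);
    print "$text: ", (@parts ? "$parts[0] - $parts[1]" : "no match"), "\n";
}
```

On these four inputs it should agree with the non-greedy version, including the questionable "222" case, where backtracking still has to leave one character for the word part.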
 
