Regular expression matching issue

Faraz.ya

Hi,
I am trying to find a regular expression for 5 consecutive digits, like
a zipcode, anywhere in a string. For example, the regular expression
should match the following because they contain 5 consecutive digits
somewhere:

aaa, bbbb, 46546
34344
aaaaaaaaa 67674
ajksdhajks 67675 asdhjkasd

but it shouldn't match the following:

asdjkas hdjasdhsajkd
7674367346
33
asdjkas 343

I tried using the following, but it didn't work:
myString.matches("\\d{5}")

Can somebody help me, please?
Thanks,
Ross
 
Knute Johnson

import java.util.regex.*;

public class test2 {
    public static void main(String[] args) {
        String str = ";alsdlkajsdf 32344 lkls3 3323 llsdfsdf";
        Pattern p = Pattern.compile(".*\\d{5}.*");
        Matcher m = p.matcher(str);

        System.out.println(m.matches());
    }
}
 
Hendrik Maryns

String.matches() checks whether the entire string matches the given
pattern. So you have to account for the characters before and after the
five digits as well, as Knute pointed out. Another way would be to use
java.util.regex.Matcher.find(), which finds the next match of a given
regex in a string. This is probably what you want.
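
A minimal sketch of the difference described above, using one of the sample strings from the question. The class and method names are made up for illustration and are not from the thread. Note that this only shows matches() versus find(); it does not yet reject runs longer than five digits.

```java
import java.util.regex.Pattern;

public class MatchesVsFind {
    // True only if the whole input consists of exactly five digits.
    static boolean wholeStringIsFiveDigits(String s) {
        return s.matches("\\d{5}");
    }

    // True if five consecutive digits occur anywhere in the input
    // (this also accepts longer runs of digits).
    static boolean containsFiveDigits(String s) {
        return Pattern.compile("\\d{5}").matcher(s).find();
    }

    public static void main(String[] args) {
        String s = "aaaaaaaaa 67674";
        System.out.println(wholeStringIsFiveDigits(s)); // false
        System.out.println(containsFiveDigits(s));      // true
    }
}
```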

H.
--
Hendrik Maryns
http://tcl.sfs.uni-tuebingen.de/~hendrik/
==================
http://aouw.org
Ask smart questions, get good answers:
http://www.catb.org/~esr/faqs/smart-questions.html


 
bugbear

Hendrik said:

String.matches() looks whether the entire string matches the given
pattern. So you have to account for the characters before and after the
five digits as well, as Knute pointed out. Another way would be to use
java.util.regex.Matcher.find(), which finds the next match of a given
regex in a string. This is probably what you want.

I suspect it's more efficient too. In general,
I'd expect smaller regexps to be cheaper.

BugBear
 
Knute Johnson

Hendrik said:

String.matches() looks whether the entire string matches the given
pattern. So you have to account for the characters before and after the
five digits as well, as Knute pointed out. Another way would be to use
java.util.regex.Matcher.find(), which finds the next match of a given
regex in a string. This is probably what you want.

H.

What Hendrik said. Matcher.find() is also handy if you want to find
multiple matching subsequences.

import java.util.regex.*;

public class test2 {
    public static void main(String[] args) {
        String str = ";alsdlkajsdf 32344 lkls3 3323 llsdfsdf 12345";
        Pattern p = Pattern.compile("\\d{5}");
        Matcher m = p.matcher(str);

        while (m.find()) {
            System.out.println(m.group());
        }
    }
}
 
Tim Smith

but it shouldn't match the following:

asdjkas hdjasdhsajkd
7674367346
33
asdjkas 343
....
import java.util.regex.*;

public class test2 {
    public static void main(String[] args) {
        String str = ";alsdlkajsdf 32344 lkls3 3323 llsdfsdf";
        Pattern p = Pattern.compile(".*\\d{5}.*");
        Matcher m = p.matcher(str);

        System.out.println(m.matches());
    }
}

Nope. That will match strings that contain runs of 5 or more digits.
He wants runs of exactly 5 digits.
 
Tim Smith

Pattern p = Pattern.compile(".*\\d{5}.*");
Matcher m = p.matcher(str);

Here's a way to fix that so it won't incorrectly match "foo123456bar"
and similar strings:

Pattern p = Pattern.compile(".*[^0-9]\\d{5}[^0-9].*");
Matcher m = p.matcher("a"+str+"a");

The "a"+str+"a" ensures that we do not have to worry about the special
cases of the 5 digits being the first or last digits of the string, and
the change to the regular expression ensures that the 5 digits are a run
of exactly 5 digits, by requiring that they be bordered by a non-digit.
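
A small sketch of the guard approach described above, run against some of the sample strings from the question. The class and method names are hypothetical, and single-line input is assumed, since `.` does not match line terminators by default.

```java
import java.util.regex.Pattern;

public class GuardedMatch {
    // Five digits with a non-digit required on each side.
    static final Pattern EXACTLY_FIVE =
            Pattern.compile(".*[^0-9]\\d{5}[^0-9].*");

    // Pads the input with a non-digit guard on both ends so that a run
    // at the very start or end still has a non-digit neighbour.
    static boolean hasRunOfExactlyFive(String str) {
        return EXACTLY_FIVE.matcher("a" + str + "a").matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "aaa, bbbb, 46546", "34344", "7674367346",
            "asdjkas 343", "foo123456bar"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + hasRunOfExactlyFive(s));
        }
    }
}
```

With these inputs the first two strings are accepted and the last three are rejected.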
 
Lew

Tim said:
Pattern p = Pattern.compile(".*\\d{5}.*");
Matcher m = p.matcher(str);

Here's a way to fix that so it won't incorrectly match "foo123456bar"
and similar strings:

Pattern p = Pattern.compile(".*[^0-9]\\d{5}[^0-9].*");
Matcher m = p.matcher("a"+str+"a");

The "a"+str+"a" ensures that we do not have to worry about the special
cases of the 5 digits being the first or last digits of the string, and
the change to the regular expression ensures that the 5 digits are a run
of exactly 5 digits, by requiring that they be bordered by a non-digit.

Or they could use find() instead of matches() and avoid all that String
manipulation overhead.
 
Stefan Ram

Tim Smith said:
Nope. That will match strings that contain runs of 5 or more digits.
He wants runs of exactly 5 digits.

Wouldn't it be smart if the OP had posted a compilable
test program with his own implementation which still
fails the test?

Then anyone who believes he has a solution could use the
test program of the OP to test his solution before replying.
 
Stefan Ram

Tim Smith said:
Pattern p = Pattern.compile(".*[^0-9]\\d{5}[^0-9].*");

Will this match the text "34344" (from the OP)
with just five digits and nothing else?
 
Lasse Reichstein Nielsen

Tim Smith said:
Pattern p = Pattern.compile(".*[^0-9]\\d{5}[^0-9].*");

Will this match the text "34344" (from the OP)
with just five digits and nothing else?

No, which is why the text is modified before the pattern is applied.

But a more complex pattern could do it directly:
"(^|.*[^\\d])\\d{5}([^\\d].*|$)"

/L
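
As a quick check of the pattern above, here is a sketch that applies it with matches() to the unmodified string, with no guard characters added. The class and method names are invented for illustration; the sample inputs are taken from the original question.

```java
import java.util.regex.Pattern;

public class DirectPattern {
    // Either the start of the input or a non-digit before the five digits,
    // and either a non-digit or the end of the input after them.
    static final Pattern EXACTLY_FIVE =
            Pattern.compile("(^|.*[^\\d])\\d{5}([^\\d].*|$)");

    static boolean hasRunOfExactlyFive(String str) {
        return EXACTLY_FIVE.matcher(str).matches();
    }

    public static void main(String[] args) {
        System.out.println(hasRunOfExactlyFive("34344"));                      // true
        System.out.println(hasRunOfExactlyFive("ajksdhajks 67675 asdhjkasd")); // true
        System.out.println(hasRunOfExactlyFive("7674367346"));                 // false
    }
}
```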
 
Stefan Ram

Lasse Reichstein Nielsen said:
No, which is why the text is modified before the pattern is applied.

Sorry! I had not read your previous post carefully enough.
 
Lasse Reichstein Nielsen

Sorry! I had not read your previous post carefully enough.

Oh, it wasn't mine. I was just about to make a similar response when
I did read the rest :)

/L
 
Tim Smith

But a more complex pattern could do it directly:
"(^|.*[^\\d])\\d{5}([^\\d].*|$)"

I tried something like that, but must have got it wrong. (99.9% of my
regular expression work is done in Perl, so I probably botched
something translating to Java.)

It would be interesting to see if it is faster to have the more complex
pattern, or use the simple pattern and add the guards to the string.
 
Joshua Cranmer

Lasse said:
But a more complex pattern could do it directly:
"(^|.*[^\\d])\\d{5}([^\\d].*|$)"

Equivalent but simpler:
"(^|.*\\D)\\d{5}(\\D.*|$)"

Said regex could be made to match only the zipcode by making the first
set of parentheses a negative-lookbehind and the second one a
negative-lookahead.

The regex I would prefer:
"\\b\\d{5}\\b" (does not match `a12345' though)
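
A sketch comparing the two ideas above when used with find(): a lookaround version, written here as `(?<!\\d)\\d{5}(?!\\d)`, which is one possible spelling of the negative lookbehind and lookahead, and the word-boundary version. The class, constant, and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ZipFinder {
    // Five digits not preceded and not followed by another digit.
    static final Pattern LOOKAROUND = Pattern.compile("(?<!\\d)\\d{5}(?!\\d)");

    // Five digits delimited by word boundaries.
    static final Pattern WORD_BOUNDARY = Pattern.compile("\\b\\d{5}\\b");

    // Collects every non-overlapping match of p in s.
    static List<String> findAll(Pattern p, String s) {
        List<String> hits = new ArrayList<>();
        Matcher m = p.matcher(s);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String s = "a12345 67675 7674367346";
        System.out.println(findAll(LOOKAROUND, s));    // [12345, 67675]
        System.out.println(findAll(WORD_BOUNDARY, s)); // [67675]
    }
}
```

Both patterns skip the ten-digit run; only the lookaround version picks up the digits in `a12345`, because there is no word boundary between a letter and a digit.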
 
