java.util.regex question


Calum MacLean

I have the following code:

Pattern pattern = Pattern.compile("1.*3");
Matcher matcher = pattern.matcher("1234567890123");
matcher.find();
System.out.println(matcher.start() + " --> " + matcher.end());

I would expect this to match just the first 3 characters. Instead, it
matches all the characters. The following is the output:

0 --> 13

Is there any way I could get it to match just the 3 characters? Or
does regex not work this way?

Thanks,
Calum
 

J. Chris Tilton

Calum MacLean said:
I have the following code:

Pattern pattern = Pattern.compile("1.*3");
Matcher matcher = pattern.matcher("1234567890123");
matcher.find();
System.out.println(matcher.start() + " --> " + matcher.end());

I would expect this to match just the first 3 characters.

No, this pattern looks for a 1, followed by any character zero or more
times, then a 3.
The "." is a wildcard for any character. The "*" means zero or more times.
If you want to match any three characters at the beginning of the string, you
could use the pattern "^.{3}".
If my memory is right, the caret means that the match must start at the
beginning of the input. The "." means any character, and the {3} tells it
to match exactly 3 characters.
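
For what it's worth, here is a minimal sketch of that "^.{3}" suggestion. The class name FirstThreeChars and the helper firstThree are made up for illustration. Note that this pattern takes the first three characters whatever they are; it does not require a 1 at the start or a 3 at the end.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstThreeChars {
    // Hypothetical helper: returns the first three characters of the input,
    // or null if the input has fewer than three characters on its first line.
    static String firstThree(String input) {
        // ^ anchors the match at the start of the input;
        // .{3} matches exactly three characters (any except line terminators).
        Matcher m = Pattern.compile("^.{3}").matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstThree("1234567890123")); // prints 123
        System.out.println(firstThree("abcdef"));        // prints abc
    }
}
```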



 

John C. Bollinger

Calum said:
I have the following code:

Pattern pattern = Pattern.compile("1.*3");
Matcher matcher = pattern.matcher("1234567890123");
matcher.find();
System.out.println(matcher.start() + " --> " + matcher.end());

I would expect this to match just the first 3 characters.

Then your expectation is not well-founded. Evidently you are not very
familiar with typical regex implementations, therefore you should
carefully read the API docs for the Pattern class, and perhaps seek
tutorial material on the subject.
Instead, it
matches all the characters. The following is the output:

0 --> 13

As it should be. "*" is a "greedy" quantifier, which means that it will
match the most characters possible consistent with the whole pattern
matching.
Is there any way I could get it to match just the 3 characters? Or
does regex not work this way?

You could use a different quantifier. A "reluctant" quantifier matches
the fewest characters possible consistent with the whole pattern
matching. In this case it looks like what you want would be:

Pattern pattern = Pattern.compile("1.*?3");

The "*?" is the reluctant version of "*".
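
To see the difference side by side, something like the following sketch should do. The class name GreedyVsReluctant and the helper span are made up for illustration; the input string is the one from the original post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsReluctant {
    // Hypothetical helper: reports the first match as "start --> end",
    // or "no match" if find() fails.
    static String span(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return "no match";
        }
        return m.start() + " --> " + m.end();
    }

    public static void main(String[] args) {
        String input = "1234567890123";
        // Greedy: .* takes as much as it can, so the match runs to the last 3.
        System.out.println(span("1.*3", input));  // prints 0 --> 13
        // Reluctant: .*? takes as little as it can, so the match stops at the first 3.
        System.out.println(span("1.*?3", input)); // prints 0 --> 3
    }
}
```

The helper also checks the result of find() before calling start() and end(), since those methods throw IllegalStateException when there was no match.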


John Bollinger
 
