Regex Puzzle

Roedy Green

Why won't this regex match this string?


<blockquote>&ldquo;The secret is surrender. Commitment to Christ
involves surrender of the
intellect, the emotions and the will &mdash; the total
person.&rdquo;
<br>
~ <span class="christian">Bill Bright</span>, <cite>Jesus and the
Intellectual</cite></blockquote>



private static final Pattern CHRISTIAN = Pattern.compile(
"<blockquote>[.]+<span class=\"christian\">[.]+</span>[.]+</blockquote>", Pattern.DOTALL );

I have stared at it for 30 minutes and tried many variations to no
avail.

Any tips on debugging problems of this sort?

--
Roedy Green Canadian Mind Products
http://mindprod.com
PM Steven Harper is fixated on the costs of implementing Kyoto, estimated as high as 1% of GDP.
However, he refuses to consider the costs of not implementing Kyoto which the
famous economist Nicholas Stern estimated at 5 to 20% of GDP
 
Roedy Green

private static final Pattern CHRISTIAN = Pattern.compile(
"<blockquote>[.]+<span class=\"christian\">[.]+</span>[.]+</blockquote>", Pattern.DOTALL );

Use a plain . rather than [.] here. Inside a character class, [.] means a literal period, not any character.
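
The difference can be checked in isolation. The following is a minimal sketch; the class name DotVersusClass and the helper matchesWhole are made up for illustration and are not part of the original code.

```java
import java.util.regex.Pattern;

class DotVersusClass {
    // Hypothetical helper: true if the whole text matches the regex (DOTALL on).
    static boolean matchesWhole(String regex, String text) {
        return Pattern.compile(regex, Pattern.DOTALL).matcher(text).matches();
    }

    public static void main(String[] args) {
        // [.] is a character class containing only '.', so [.]+ needs literal periods.
        System.out.println(matchesWhole("[.]+", "a\nb"));  // false
        System.out.println(matchesWhole("[.]+", "..."));   // true
        // A bare . with DOTALL matches any character, including line terminators.
        System.out.println(matchesWhole(".+", "a\nb"));    // true
    }
}
```

Pattern.DOTALL only changes what a bare . matches; it has no effect on a period inside square brackets.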
 
Robert Klemme

Why won't this regex match this string?
I have stared at it for 30 minutes and tried many variations to no
avail.

As I see, you finally figured it out yourself.
Any tips on debugging problems of this sort?

Cut down the regexp, start building it piece by piece and match it
against your text sequence until it stops matching. The last addition
is the "culprit".
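
The piece-by-piece approach can also be automated. This is a rough sketch, assuming the regex has been split by hand into pieces such that every prefix is itself a valid pattern; the class RegexBisect, the method firstFailingPiece, and the shortened sample text are invented for this example.

```java
import java.util.regex.Pattern;

class RegexBisect {
    // Returns the index of the first piece whose addition stops the growing
    // pattern from being found in the text, or -1 if every prefix is found.
    static int firstFailingPiece(String[] pieces, String text) {
        StringBuilder prefix = new StringBuilder();
        for (int i = 0; i < pieces.length; i++) {
            prefix.append(pieces[i]);
            // find() rather than matches(): a partial pattern only has to match somewhere.
            if (!Pattern.compile(prefix.toString(), Pattern.DOTALL).matcher(text).find()) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the blockquote in the question.
        String text = "<blockquote>quote text\n~ <span class=\"christian\">Bill Bright</span>,"
                + " <cite>title</cite></blockquote>";
        String[] pieces = { "<blockquote>", "[.]+", "<span class=\"christian\">",
                "[.]+", "</span>", "[.]+", "</blockquote>" };
        int bad = firstFailingPiece(pieces, text);
        System.out.println(bad < 0 ? "all pieces match"
                : "stops matching at piece " + bad + ": " + pieces[bad]);
    }
}
```

With the pieces above, the search stops at the first [.]+, which points straight at the character class.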

Cheers

robert
 
Kim A. Brandt

Roedy said:
Why won't this regex match this string?


<blockquote>&ldquo;The secret is surrender. Commitment to Christ
involves surrender of the
intellect, the emotions and the will &mdash; the total
person.&rdquo;
<br>
~ <span class="christian">Bill Bright</span>, <cite>Jesus and the
Intellectual</cite></blockquote>



private static final Pattern CHRISTIAN = Pattern.compile(
"<blockquote>[.]+<span class=\"christian\">[.]+</span>[.]+</blockquote>", Pattern.DOTALL );

I have stared at it for 30 minutes and tried many variations to no
avail.

Any tips on debugging problems of this sort?

Try the reluctant quantifier[1,2] `X*?' like so:

Pattern p = Pattern.compile("<blockquote>.*?<span\\s+class=\"christian\">.*?<\\/span>.*?<\\/blockquote>", Pattern.DOTALL);
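
To illustrate what the reluctant quantifier changes, here is a small sketch with a made-up input string and a hypothetical helper firstMatch. In the original pattern the failure came from [.] rather than from greediness, but reluctant matching matters once a page contains more than one blockquote.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVersusReluctant {
    // Hypothetical helper: returns the first substring the regex finds, or null.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String text = "<b>one</b> and <b>two</b>";
        // Greedy .* runs to the last closing tag.
        System.out.println(firstMatch("<b>.*</b>", text));   // <b>one</b> and <b>two</b>
        // Reluctant .*? stops at the first closing tag.
        System.out.println(firstMatch("<b>.*?</b>", text));  // <b>one</b>
    }
}
```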



[1] http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/Pattern.html
[2] http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html
 
Jan Thomä

Any tips on debugging problems of this sort?

If you are using IntelliJ, there is a nice regex plugin that helps a
whole lot in creating complex regular expressions.

Jan
 
