Regex is correct but Java won't parse it?

News

Hello all,

I want to create a logic class to evaluate simple logical expressions and
print their truth table. I am using a regular expression that looks for a
pattern commencing with a char and followed by any number of (operator char)
groups (for the sake of simplicity only the AND operator "&" is included
until I get it working properly).

My regex is [a-z]([&][a-z])*. I know the regex is correct because I have
tested it using the regular expression demo from
www.regular-expressions.info .

Following is my code stripped to the essentials. As it stands this returns a
match even for malformed strings and I cannot see why !

import java.util.regex.*;

public class Logic {
    public static void main(String[] args) {
        StringBuffer strb = new StringBuffer();
        for (int i = 0; i < args.length; i++) {
            strb.append(args[i]); // Add the command line arguments to StringBuffer
        }
        String str = strb.toString(); // Change to a string so Matcher can use it.
        String regex = new String("[a-z]([&][a-z])*");
        System.out.println(str); // Test print to ensure the string and regex are correct
        System.out.println(regex);
        Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        Matcher m = p.matcher(regex);
        if (m.find()) {
            System.out.println("Matched");
        } else {
            System.out.println("Not Matched");
        }
    }
}

Any ideas ? Thanks in advance !
 
Joshua Cranmer

News said:
if (m.find()) {

find() returns true if there exists a substring that matches the expression.
For example, your regex will match "3453457a4234456456" because there is
an 'a' in the string. What you want is match().
 
Joshua Cranmer

Stefan said:
Possibly, you meant »matches()« - there seems to be no »match()« method
in the class »Matcher«:

http://download.java.net/jdk7/docs/api/java/util/regex/Matcher.html#matches()

Too much JavaScript for me, then.

Alternatively, using the regex "^[a-z]([&+*-][a-z])*$" with find would
also work, provided that the string is only one line long.
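
A minimal sketch of the three options side by side, assuming the "&"-only pattern from the original post rather than the extended operator set above. The class and method names are made up for illustration. The last input shows why the "one line" caveat matters: without MULTILINE, "$" also matches just before a final line terminator, so the anchored find() accepts a trailing newline that matches() rejects.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative, not from the thread.
class FindVsMatches {
    static final String REGEX = "[a-z]([&][a-z])*";

    // True if ANY substring of input matches the pattern.
    static boolean findAnywhere(String input) {
        return Pattern.compile(REGEX).matcher(input).find();
    }

    // True only if the ENTIRE input matches the pattern.
    static boolean matchesWhole(String input) {
        return Pattern.compile(REGEX).matcher(input).matches();
    }

    // find() with the pattern wrapped in ^ and $ anchors.
    static boolean findAnchored(String input) {
        return Pattern.compile("^" + REGEX + "$").matcher(input).find();
    }

    public static void main(String[] args) {
        String[] inputs = { "p&q&n", "3453457a4234456456", "p&&q", "p&q\n" };
        for (String s : inputs) {
            // Show the newline as a visible escape in the printout.
            System.out.println(s.replace("\n", "\\n")
                    + " find=" + findAnywhere(s)
                    + " matches=" + matchesWhole(s)
                    + " anchoredFind=" + findAnchored(s));
        }
    }
}
```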

Interestingly enough, from the URL you provided, you seem to be using
JDK 7. What's different from 1.6 (so far)?
 
Esmond Pitt

News said:
I want to create a logic class to evaluate simple logical expressions
and print their truth table. I am using a regular expression that looks
for a pattern commencing with a char and followed by any number of
(operator char) groups, (for the sake of simplicity only the AND
operator "&" is included till I get it working properly).

Hold on. The minute you get to handling "|" as well as "&" you will
discover that this is not a regular-expression problem, it is a parsing
problem. You will need to implement operator precedence, and REs can't
do that.
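
To make the precedence point concrete, here is a rough sketch of a recursive-descent evaluator. Everything in it is an assumption for illustration: the class name TinyLogic, single-letter lowercase variables, no whitespace, and the convention that "&" binds tighter than "|". It uses Map.of, so it needs Java 9 or later.

```java
// Hypothetical sketch: recursive-descent evaluator for single-letter variables,
// '&' (AND), '|' (OR) and parentheses. '&' binds tighter than '|' by assumption.
import java.util.Map;

class TinyLogic {
    private final String src;
    private final Map<Character, Boolean> vars;
    private int pos = 0;

    private TinyLogic(String src, Map<Character, Boolean> vars) {
        this.src = src;
        this.vars = vars;
    }

    static boolean eval(String src, Map<Character, Boolean> vars) {
        TinyLogic p = new TinyLogic(src, vars);
        boolean result = p.orExpr();
        if (p.pos != src.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + p.pos);
        }
        return result;
    }

    // orExpr := andExpr ('|' andExpr)*
    private boolean orExpr() {
        boolean value = andExpr();
        while (peek() == '|') {
            pos++;
            boolean rhs = andExpr(); // always parse the right side to advance pos
            value = value || rhs;
        }
        return value;
    }

    // andExpr := atom ('&' atom)*
    private boolean andExpr() {
        boolean value = atom();
        while (peek() == '&') {
            pos++;
            boolean rhs = atom();
            value = value && rhs;
        }
        return value;
    }

    // atom := letter | '(' orExpr ')'
    private boolean atom() {
        char c = peek();
        if (c == '(') {
            pos++;
            boolean value = orExpr();
            if (peek() != ')') {
                throw new IllegalArgumentException("Expected ')' at position " + pos);
            }
            pos++;
            return value;
        }
        if (c >= 'a' && c <= 'z') {
            pos++;
            Boolean v = vars.get(c);
            if (v == null) {
                throw new IllegalArgumentException("Unbound variable " + c);
            }
            return v;
        }
        throw new IllegalArgumentException("Expected variable or '(' at position " + pos);
    }

    // Current character, or '\0' at end of input.
    private char peek() {
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    public static void main(String[] args) {
        Map<Character, Boolean> vars = Map.of('a', true, 'b', false, 'c', false);
        System.out.println(eval("a|b&c", vars));   // parsed as a|(b&c)
        System.out.println(eval("(a|b)&c", vars)); // parentheses override precedence
    }
}
```

With a=true, b=false, c=false the two expressions give different results, which is exactly the distinction a flat pattern like [a-z]([&|][a-z])* cannot express.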
 
News

Hi Esmond, Joshua and Stefan,

Thanks for pointing out to me the difference between .find() and .matches().
It's a big step closer, but .matches() returns false unless I replace the
regex with the EXACT string I am searching for, e.g. "[a-z]([&][a-z])*" is
replaced with "p&q" and I search on "p&q". I also tried .lookingAt() but
still don't get a match. I also tried using the escape sequence \\& in the
regex but it made no difference.

Esmond, I will certainly watch out for precedence issues once I get this
simple case working ! Thanks again. Here is my latest.

import java.util.regex.*;

public class Logic {
    public static void main(String[] args) {
        StringBuffer strb = new StringBuffer();
        for (int i = 0; i < args.length; i++) {
            strb.append(args[i]); // Add the command line arguments to StringBuffer
        }
        String str = strb.toString(); // Change to a string so Matcher can use it.
        String regex = new String("[a-z]([&][a-z])*");
        System.out.println(str); // Test print to ensure the string and regex are correct
        System.out.println(regex);
        Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        Matcher m = p.matcher(regex);
        if (m.matches())
        {
            System.out.println("Matched");
        }
        else
        {
            System.out.println("Not Matched");
        }
    }
}

which when run with "p&q&n" produces

p&q&n
[a-z]([&][a-z])*
Not Matched

Any ideas? There's a beer in it!

Wayne
 
Roedy Green

import java.util.regex.*;

public class Logic {
    public static void main(String[] args) {
        StringBuffer strb = new StringBuffer();
        for (int i = 0; i < args.length; i++) {
            strb.append(args[i]); // Add the command line arguments to StringBuffer
        }
        String str = strb.toString(); // Change to a string so Matcher can use it.
        String regex = new String("[a-z]([&][a-z])*");
        System.out.println(str); // Test print to ensure the string and regex are correct
        System.out.println(regex);
        Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        Matcher m = p.matcher(regex);
        if (m.find()) {
            System.out.println("Matched");
        } else {
            System.out.println("Not Matched");
        }
    }
}


I tidied and commented your code. In doing so the primary error jumped
out.

import java.util.regex.*;

public class Logic
{
    public static void main(String[] args)
    {
        final StringBuffer strb = new StringBuffer();
        for ( int i = 0; i < args.length; i++ )
        {
            // Add the command line arguments to the StringBuffer
            strb.append(args[i]);
        }
        // Change to a String so Matcher can use it.
        final String str = strb.toString();
        // look for string of the form ---a&a&b&c---
        final String regex = "[a-z]([&][a-z])*";
        // Test print to ensure the string and regex are correct
        System.out.println("command line:" + str);
        System.out.println("regex:" + regex);
        final Pattern p = Pattern.compile(regex,
                                          Pattern.CASE_INSENSITIVE |
                                          Pattern.UNICODE_CASE);
        // scan command string, not the regex.
        final Matcher m = p.matcher(str);
        if ( m.find() )
        {
            System.out.println("Matched");
            // add some more printout to see what was matched.
            final int gc = m.groupCount();
            // group 0 is the whole pattern matched,
            // loop runs from 0 to gc, not 0 to gc-1 as is traditional.
            for ( int i = 0; i <= gc; i++ )
            {
                System.out.println( i + " : " + m.group( i ) );
            }
        }
        else
        {
            System.out.println("Not Matched");
        }
    }
}
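
One thing to expect from the group printout: a capturing group inside a "*" quantifier keeps only its last repetition, and group(1) is null if the group never matched at all. A small sketch, with a made-up class and method name, using the same pattern:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class GroupDemo {
    // Returns group(1) after a full match: the last "&x" repetition,
    // or null when the repeated group matched zero times.
    static String lastRepetition(String input) {
        Matcher m = Pattern.compile("[a-z]([&][a-z])*").matcher(input);
        if (!m.matches()) {
            throw new IllegalArgumentException("No match: " + input);
        }
        return m.group(1);
    }

    public static void main(String[] args) {
        System.out.println(lastRepetition("a&b&c")); // prints &c
        System.out.println(lastRepetition("a"));     // prints null
    }
}
```

So the groups are not a way to recover every operand of the expression; only the final "&c" survives.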
 
Andrew Thompson

News wrote:
...
There is no ASCII symbol for smacking yourself in the forehead and kicking
the cat ...

Did the cat write the code?

If not, I suggest it more appropriate, if no less violent,
to kick the ..entity or being that wrote the code.
 
Lew

News said:

Andrew said:
Did the cat write the code?

If not, I suggest it more appropriate, if no less violent,
to kick the ..entity or being that wrote the code.

Maybe they meant "cat" in the beatnik sense, that is, they are going to kick
the "cat" that wrote it.
 
Esmond Pitt

News said:
Esmond, I will certainly watch out for precedence issues once I get this
simple case working !

Why would you bother to get it working when REs can't do it? You need to
build a tokenizer and a parser.
 
Chris Dollin

Wayne said:
Howdy Esmond,

The StringTokenizer documentation actually recommends regular expressions be
used instead ! See
http://java.sun.com/j2se/1.4.2/docs/api/java/util/StringTokenizer.html

Not a StringTokenizer; a tokeniser, aka lexer, aka lexical analyser, that
recognises tokens in the language, not just sequences separated by some
character.

If you're going to parse logical expressions, you will very soon go past
the stage where regular expressions can do the job, since you'll want
to tackle operators with different precedences, and brackets. It is
DEAD EASY to write a parser for simple expressions once you have the
tokens.

[You can use REs to recognise the tokens relatively easily.]
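
As a rough sketch of that last remark, a regex can pick off one token at a time with lookingAt(). The token set (single letters, "&", "|", parentheses) and the class name LogicLexer are assumptions for illustration, not something specified in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical lexer sketch; the token set is an assumption.
class LogicLexer {
    // Optional leading whitespace, then one token captured in group 1:
    // a single-letter variable, an operator, or a parenthesis.
    private static final Pattern TOKEN =
            Pattern.compile("\\s*([a-zA-Z]|[&|]|[()])");

    static List<String> tokenize(String input) {
        String s = input.trim(); // so trailing whitespace is not an error
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(s);
        int pos = 0;
        while (pos < s.length()) {
            m.region(pos, s.length());
            // lookingAt() must match starting exactly at the region start.
            if (!m.lookingAt()) {
                throw new IllegalArgumentException(
                        "Unexpected character at position " + pos);
            }
            tokens.add(m.group(1));
            pos = m.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a&(b|c)"));
        System.out.println(tokenize(" p & q "));
    }
}
```

The resulting token list is what a parser would then consume; the regex only recognises tokens and says nothing about precedence or nesting.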
 
Martin Gregorie

Chris said:
Not a StringTokenizer; a tokeniser, aka lexer, aka lexical analyser, that
recognises tokens in the language, not just sequences separated by some
character.

If you're going to parse logical expressions, you will very soon go past
the stage where regular expressions can do the job, since you'll want
to tackle operators with different precedences, and brackets. It is
DEAD EASY to write a parser for simple expressions once you have the
tokens.

It's even easier to use something like Coco/R, which takes a single input
file and generates a Scanner (tokenizer) and a Parser class from it.
Even better, the frameworks for these classes are external text files,
so you can modify them. For instance, I needed a Scanner that would
accept a string to be processed - there was no constructor that would do
that but adding one was simple enough. As you'd hope, the Java version
of Coco/R is written in Java.
 
bsgama

In this line, Matcher m = p.matcher(regex);, you should pass str, not
the regex!
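
A short sketch of why that one argument explains both symptoms reported earlier in the thread. The class and helper names are invented for illustration. When the pattern is run against its own source text, find() succeeds because the text "[a-z]..." contains the letter "a", while matches() fails because the text starts with "[".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class WrongInputDemo {
    static final String REGEX = "[a-z]([&][a-z])*";
    static final Pattern P = Pattern.compile(
            REGEX, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);

    // First substring that find() locates in input, or null if none.
    static String firstHit(String input) {
        Matcher m = P.matcher(input);
        return m.find() ? m.group() : null;
    }

    // True only if the whole input matches.
    static boolean whole(String input) {
        return P.matcher(input).matches();
    }

    public static void main(String[] args) {
        String str = "p&q&n";
        System.out.println(firstHit(REGEX)); // the regex text as input: finds "a"
        System.out.println(whole(REGEX));    // the regex text as input: false
        System.out.println(whole(str));      // the command line as input: true
    }
}
```

It also accounts for matches() appearing to work when the regex was replaced by "p&q": the pattern "p&q" was then being matched against its own text, which it matches literally.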


