jb
I've been working with Patterns for a while, and the following has me
baffled: the \w class doesn't seem to include non-ASCII letters (well, at
least not Polish ones).
The Javadoc seems to say nothing about it.
Here's the test:
import java.util.regex.*;

class rTest {
    public static void main(String[] args) {
        System.out.println("Regexp: '\\w+'");
        Pattern pat;
        Matcher m;
        pat = Pattern.compile("\\w+");
        // Plain ASCII letter
        m = pat.matcher("a");
        System.out.println("Matches 'a' " + m.matches());
        // Polish letter ś written as a Unicode escape
        m = pat.matcher("\u015b");
        System.out.println("Matches '\u015b' " + m.matches());
        m = pat.matcher("a");
        System.out.println("Matches 'a' " + m.matches());
        // The same letter as a literal (depends on the source file encoding)
        m = pat.matcher("ś");
        System.out.println("Matches 'ś' " + m.matches());
    }
}
It prints (on my system):
Regexp: '\w+'
Matches 'a' true
Matches 'ś' false
Matches 'a' true
Matches 'ś' false
The question is: is this buggy behaviour or is it according to the
spec, and is there an elegant way to include all (Polish) letters in a
character class?
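For comparison, here is a minimal sketch of two alternatives that should cover Polish letters. As far as I can tell, the java.util.regex.Pattern documentation lists \w as [a-zA-Z_0-9], which would make the result above expected rather than a bug. The class name UnicodeWordDemo and its helper methods are made up for illustration, and the UNICODE_CHARACTER_CLASS flag assumes Java 7 or later.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class UnicodeWordDemo {
    // Default \w: ASCII letters, digits and underscore only.
    static final Pattern ASCII_WORD = Pattern.compile("\\w+");
    // \p{L} is any Unicode letter, \p{Nd} any decimal digit; underscore added by hand.
    static final Pattern LETTER_CLASS = Pattern.compile("[\\p{L}\\p{Nd}_]+");
    // Assumes Java 7+: the flag switches \w to the Unicode notion of a word character.
    static final Pattern UNICODE_WORD =
            Pattern.compile("\\w+", Pattern.UNICODE_CHARACTER_CLASS);

    static boolean matchesAscii(String s) {
        return ASCII_WORD.matcher(s).matches();
    }

    static boolean matchesLetterClass(String s) {
        return LETTER_CLASS.matcher(s).matches();
    }

    static boolean matchesUnicodeWord(String s) {
        return UNICODE_WORD.matcher(s).matches();
    }

    public static void main(String[] args) {
        String s = "\u015b"; // Polish letter s with acute
        System.out.println("\\w+ (default):                " + matchesAscii(s));
        System.out.println("[\\p{L}\\p{Nd}_]+:              " + matchesLetterClass(s));
        System.out.println("\\w+ (UNICODE_CHARACTER_CLASS): " + matchesUnicodeWord(s));
    }
}
```

The explicit \p{L} class works without any flags, while the flag keeps the pattern text short but also changes the meaning of \d, \s and \b, so which one is more "elegant" depends on the rest of the pattern.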