Luke said:
Does anybody know how to match character classes in i18n mode? E.g. ü
(Unicode 00FC), a German umlaut, should actually be matched by the
pattern "[a-z]", but it is not.
Of course not, because [a-z] means all characters between 'a' and 'z'
(inclusive). ü is not part of that range, and neither is 'A', by the way.
See
http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/Pattern.html#sum
for a list of available patterns. I think \p{javaLowerCase} should
do the trick: it matches every character for which
Character.isLowerCase() returns true, ü included. If you only want
to allow the German umlauts, you have to add them "manually" with
[a-zäöüß].
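A minimal sketch of the difference between the three patterns, assuming Java 5 or later. The class name LowerCaseMatch and the helper matches() are made up for illustration, and the umlauts are written as Unicode escapes only so the example does not depend on the source file encoding:

```java
import java.util.regex.Pattern;

public class LowerCaseMatch {
    // ASCII-only range: 'a' through 'z', nothing else.
    static final Pattern ASCII_LOWER = Pattern.compile("[a-z]");

    // Any character for which Character.isLowerCase() returns true.
    static final Pattern ANY_LOWER = Pattern.compile("\\p{javaLowerCase}");

    // The ASCII range plus a-umlaut, o-umlaut, u-umlaut and sharp s, listed by hand.
    static final Pattern GERMAN_LOWER =
            Pattern.compile("[a-z\u00E4\u00F6\u00FC\u00DF]");

    // Hypothetical helper: true if the whole string matches the pattern.
    static boolean matches(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        String ue = "\u00FC"; // u-umlaut, the character from the question

        System.out.println(matches(ASCII_LOWER, ue));  // false
        System.out.println(matches(ANY_LOWER, ue));    // true
        System.out.println(matches(GERMAN_LOWER, ue)); // true

        // 'A' is upper case, so none of the three patterns accept it.
        System.out.println(matches(ANY_LOWER, "A"));   // false
    }
}
```

Note that \p{javaLowerCase} is much broader than the hand-written class: it also accepts lower-case letters from other scripts, such as a Greek alpha.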
If you want to check for specific Unicode blocks, you can do
that with e.g. \p{InGreek} for the Greek block. See the API
description linked above for more information.
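As a rough illustration of the block syntax, here is a small sketch that checks whether a string consists only of characters from the Greek block. The class name GreekBlockMatch and the method isGreek() are hypothetical; only \p{InGreek} comes from the Pattern documentation:

```java
import java.util.regex.Pattern;

public class GreekBlockMatch {
    // One or more characters from the Unicode block named "Greek".
    static final Pattern GREEK = Pattern.compile("\\p{InGreek}+");

    // Hypothetical helper: true if every character of s is in the Greek block.
    static boolean isGreek(String s) {
        return GREEK.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isGreek("\u03B1\u03B2\u03B3")); // alpha beta gamma: true
        System.out.println(isGreek("abc"));                // Latin letters: false
        System.out.println(isGreek("\u00FC"));             // u-umlaut: false
    }
}
```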
Regards, Lothar
--
Lothar Kimmeringer E-Mail: (e-mail address removed)
PGP-encrypted mails preferred (Key-ID: 0x8BC3CD81)
Always remember: The answer is forty-two, there can only be wrong
questions!