Upper/lowercase regex matching in Unicode

Jason Stitt · Oct 19, 2005

What's the best way to match uppercase or lowercase characters with a
regular expression in a Unicode-aware way? Obviously [A-Z] and [a-z]
aren't going to cut it. I thought there were character classes of the
form ::upper:: or similar syntax, but I can't find them in the docs.
Maybe I'm getting it mixed up with Perl regexen.

The upper() and lower() methods do work on accented characters in a
unicode string, so there has to be some recognition of unicode case
in there somewhere.

Thanks,

Jason

George Sakkis · Oct 20, 2005

http://tinyurl.com/7jqgt

George
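
One way to approach the question, as a minimal sketch rather than a definitive answer: the standard-library re module has no POSIX-style [[:upper:]] class and no \p{Lu} property escape, but a character class can be built by hand from the Unicode general categories "Lu" (uppercase letter) and "Ll" (lowercase letter) reported by unicodedata. The sketch assumes Python 3, where str is Unicode; the helper name char_class and the sample text are made up for illustration.

```python
import re
import sys
import unicodedata


def char_class(category, limit=sys.maxunicode):
    """Build a regex character class holding every code point up to
    `limit` whose Unicode general category equals `category`.

    Hypothetical helper, not part of any library.
    """
    chars = [
        chr(cp)
        for cp in range(limit + 1)
        if unicodedata.category(chr(cp)) == category
    ]
    # Letters are never special inside [...], so no escaping is needed.
    return "[" + "".join(chars) + "]"


# Scanning the whole code point range takes a moment, so compile once.
UPPER = re.compile(char_class("Lu"))  # uppercase letters
LOWER = re.compile(char_class("Ll"))  # lowercase letters

# "École naïve Über", written with escapes so the letters are precomposed.
text = "\u00c9cole na\u00efve \u00dcber"
print(UPPER.findall(text))
```

The classes only see precomposed letters; text that stores accents as separate combining marks would need unicodedata.normalize("NFC", ...) first. The third-party regex module accepts \p{Lu} and \p{Ll} directly, which avoids building the class by hand.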

