m//i behaves strangely: a variable does not match itself


peter pilsl

I use Unicode and locales (de_AT.UTF-8) and, despite the warnings against
combining them, everything finally works fine.
I can sort(), lc() and pattern-match, but there is one very interesting
problem left with m//i: characters with a multibyte representation only
match if something precedes them in the pattern. I would not be surprised
if it never matched, but I am surprised that it only matches "sometimes"
and does not even match itself!

Example (in German, Ä is the uppercase of ä):

Ä =~ m/Ä/i => no match !!
Ä =~ m/^Ä/i => match !!
Ä =~ m/^ä/i => match
bÄc =~ m/bä/i => match
bÄc =~ m/ä/i => no match

real source:

use locale;
$s="\x{e4}";
utf8::upgrade($s);
print $s=~/$s/i?"ok\n":"fail\n"

==> fail !!!
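
For anyone who wants to check the five cases listed above on their own Perl build, here is a minimal sketch. The @cases table and the variable names are only illustrative, and the results will depend on the Perl version and the locale in effect, so no expected output is given.

```perl
use strict;
use warnings;
use locale;   # the pragma that appears to trigger the behaviour

my ($uc, $lc) = ("\x{c4}", "\x{e4}");   # Ä and ä
utf8::upgrade($_) for $uc, $lc;         # force the multibyte (UTF-8) representation

# [ subject, pattern ] pairs, in the same order as the list above
my @cases = (
    [ $uc,       $uc    ],
    [ $uc,       "^$uc" ],
    [ $uc,       "^$lc" ],
    [ "b${uc}c", "b$lc" ],
    [ "b${uc}c", $lc    ],
);

my @results;
for my $case (@cases) {
    my ($subject, $pattern) = @$case;
    push @results, ($subject =~ /$pattern/i ? "match" : "no match");
}
print "case $_: $results[$_ - 1]\n" for 1 .. @results;
```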


There is an easy workaround: call lc() on both the search term and
the pattern first, and then use m// without the i flag. Still, it is
interesting behaviour.
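
A minimal sketch of that workaround follows. The helper name ci_contains is made up for illustration, and \Q...\E is added on the assumption that the search term should be taken literally. "use locale" is left out here so the sketch does not depend on the installed locales; in the real script it would be in scope, and lc() would then follow the locale's case rules.

```perl
use strict;
use warnings;

# Hypothetical helper: case-insensitive substring test without the /i flag.
sub ci_contains {
    my ($text, $needle) = @_;
    my $lc_text   = lc $text;     # lowercase the string being searched
    my $lc_needle = lc $needle;   # lowercase the search term
    # \Q...\E quotes any regex metacharacters in the search term
    return $lc_text =~ /\Q$lc_needle\E/ ? 1 : 0;
}

my $upper = "\x{c4}";   # Ä
my $lower = "\x{e4}";   # ä
utf8::upgrade($upper);  # same multibyte representation as in the post
utf8::upgrade($lower);

print ci_contains("b${upper}c", $lower) ? "ok\n" : "fail\n";
print ci_contains($upper, $upper)       ? "ok\n" : "fail\n";
```

Lowercasing both sides moves the case folding out of the regex engine, which is the part that seems to misbehave here.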

As far as I can see so far, this phenomenon does not depend on the particular
locale used (de_AT.UTF-8, en_US, C, ...) but occurs as soon as "use locale"
is invoked.


peter
 
