felix.ostmann
This is really strange!
First, the code:
#####################################
#!/usr/bin/perl
use strict;
use warnings;
use locale; ## DOESN'T WORK with this line
use Encode;
my $content = Encode::decode("iso-8859-15","[[lmo:Met\xE0j alcal\xEDtt]]\n");
$content =~ s!^\[\[[a-z]{2}:.*\]\]$!!gm; ## DOESN'T WORK
# $content =~ s!^\[\[[a-z]{2}:.*\]$!!gm; ## WORKS
print $content;
#####################################
This bug(?) shocked me while I was parsing Wikipedia data.
After 69 articles my import process stalled, yet it kept consuming a lot
of CPU time. Strange.
After some hours I found that this small script reproduces the
problem: Perl never finishes executing the pattern.
Without "use locale;" it works. With the second regexp it works too
(it looks for only one \] at the end of the line).
I can't believe it. I would expect the engine to notice right at
"[[lmo:" that the string cannot match (the prefix has three letters,
not two), so why does the \]\] at the end matter so much?
Why does this happen only with "use locale;"? I set the locale to
POSIX, C, en_GB and de_DE; none of them works!
What can I do?