Ben Bullock
I've found a place where Perl seems to behave differently depending on
whether something is marked as UTF-8 or not, regardless of the fact that
it is just ASCII.
In the following code snippet,
#!/usr/local/bin/perl -lw
use strict;
use Encode 'decode';
use Lingua::JA::FindDates 'subsjdate';
binmode STDERR,"utf8";
binmode STDOUT,"utf8";
print STDERR "first try\n";
my $test = "ABCDEFG";
print subsjdate($test);
print STDERR "now try again\n";
$test = decode ('utf8', $test);
print subsjdate($test);
the output is like this:
ben ~ 541 $ ./test2.pl
first try
Use of uninitialized value in pattern match (m//) at /usr/local/lib/perl5/site_perl/5.10.0/Lingua/JA/FindDates.pm line 531.
Use of uninitialized value in pattern match (m//) at /usr/local/lib/perl5/site_perl/5.10.0/Lingua/JA/FindDates.pm line 531.
Use of uninitialized value in pattern match (m//) at /usr/local/lib/perl5/site_perl/5.10.0/Lingua/JA/FindDates.pm line 531.
Use of uninitialized value in pattern match (m//) at /usr/local/lib/perl5/site_perl/5.10.0/Lingua/JA/FindDates.pm line 531.
Use of uninitialized value in pattern match (m//) at /usr/local/lib/perl5/site_perl/5.10.0/Lingua/JA/FindDates.pm line 531.
Use of uninitialized value in pattern match (m//) at /usr/local/lib/perl5/site_perl/5.10.0/Lingua/JA/FindDates.pm line 531.
Use of uninitialized value in pattern match (m//) at /usr/local/lib/perl5/site_perl/5.10.0/Lingua/JA/FindDates.pm line 531.
Use of uninitialized value in pattern match (m//) at /usr/local/lib/perl5/site_perl/5.10.0/Lingua/JA/FindDates.pm line 531.
Use of uninitialized value in pattern match (m//) at /usr/local/lib/perl5/site_perl/5.10.0/Lingua/JA/FindDates.pm line 531.
Use of uninitialized value in pattern match (m//) at /usr/local/lib/perl5/site_perl/5.10.0/Lingua/JA/FindDates.pm line 531.
ABCDEFG
now try again
ABCDEFG
ben ~ 542 $
But if I add
use utf8;
and call the routine with a non-ASCII string, like 平成, I don't get the
warnings.
What's more, after about an hour of careful checking, I'm fairly sure
that there is no uninitialized value in the pattern match in question. In
fact, I can make the warning go away by removing $kanjidigits, a variable
which is initialized, from the pattern match, but that seems even
stranger.
I think the above-described behaviour, regardless of any errors in the
module, indicates an error in Perl. Also, I think there is nothing wrong
with the module. Does anybody have any other opinions?
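For anyone who wants to look at the "marked as UTF-8" part on its own, without the module, here is a minimal sketch. It is not taken from Lingua::JA::FindDates. The variable names are made up for illustration, and it uses utf8::upgrade rather than decode to switch the flag on, which is an assumption about what matters here. It only shows that two ASCII strings can compare equal while differing in the internal flag.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Two copies of the same ASCII text (hypothetical names, for illustration only).
my $plain    = "ABCDEFG";
my $upgraded = "ABCDEFG";

# Switch one copy to Perl's internal UTF-8 representation.
# The characters do not change; only the internal flag does.
utf8::upgrade($upgraded);

# utf8::is_utf8 reports the internal flag, not whether the text is "Unicode".
printf "plain:    flag %s\n", utf8::is_utf8($plain)    ? "on" : "off";
printf "upgraded: flag %s\n", utf8::is_utf8($upgraded) ? "on" : "off";

# String comparison ignores the flag.
print "equal:    ", ($plain eq $upgraded ? "yes" : "no"), "\n";
```

If a function gives different warnings for $plain and $upgraded, the difference comes from how the two representations are handled, since the string contents are identical.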