Perl v5.6.0 is not compatible with v5.8.0??


JZ

I have a strange problem with Perl. When I execute the indexer.pl script
(from ksearch, http://www.kscripts.com/scripts.shtml) with Perl 5.6.0
on RedHat 7, it works fine. But if I execute the same script on another
computer running Perl 5.8.0 on RedHat 9, I get lots of errors
like this:

Malformed UTF-8 character (overflow at 0x2ca77a73), byte 0x79, after
start byte 0xbf) in substitution (s///) at ./indexer.pl line 353

Line 353 from indexer.pl is:

$contents =~
s/(<\s*script[^>]*>.*?<\s*\/script\s*>)|(<\s*style[^>]*>.*?<\s*\/style\s*>)/
/gsi;

The same errors occur with the latest version of ksearch, 1.4, which I have
just installed. So I suppose the problem is with Perl 5.8.0. (I have no
UTF-8 encoded files. All the HTML files are ISO-8859-2 encoded.)
 

Juha Laiho

Hmm... I think the default installation on RH9 sets the system default
character set to UTF-8, and it could be that you're not overriding
it when running your indexer, so the indexer is following that
default.
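
One way to check this is to look at the locale environment variables the script is started with. The following is only a rough sketch: the helper names effective_locale and is_utf8_locale are made up for illustration, and it assumes the usual precedence of LC_ALL over LC_CTYPE over LANG.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the first locale variable that is set,
# checking LC_ALL, then LC_CTYPE, then LANG (assumed precedence).
sub effective_locale {
    my ($env) = @_;
    for my $var (qw(LC_ALL LC_CTYPE LANG)) {
        my $value = $env->{$var};
        return $value if defined($value) && length($value);
    }
    return '';
}

# Hypothetical helper: true if the locale name mentions UTF-8 or UTF8.
sub is_utf8_locale {
    my ($env) = @_;
    return effective_locale($env) =~ /utf-?8/i ? 1 : 0;
}

my $locale = effective_locale(\%ENV);
print "Effective locale: ", ($locale eq '' ? '(none set)' : $locale), "\n";
print "Looks like UTF-8: ", (is_utf8_locale(\%ENV) ? "yes" : "no"), "\n";
```

If this reports a UTF-8 locale, starting the indexer with a non-UTF-8 locale for that one run (for example `LC_ALL=C ./indexer.pl` from the shell) should show whether the locale is really the cause.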
 

Ben Morrow

RH9 has a UTF-8 locale set by default. Perl 5.8.0 takes notice of that
and, by default, expects all your files to be UTF-8.

So, there are three solutions:

1. Change your default locale to a non-UTF-8 one, since you don't use UTF-8.

2. Upgrade to 5.8.1 or later, which has this default behaviour removed as it
caused lots of problems.

3. Add C< use open ':encoding(iso8859-2)' > at the top of your script.

Ben
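
For illustration, here is a rough sketch of the idea behind option 3, using a per-file encoding layer on open() instead of the `use open` pragma. The helper names read_latin2 and strip_script_and_style and the temporary test file are made up for this example. The regex is the one quoted from line 353. The replacement is assumed to be a single space, since the quoted line wraps at that point.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: slurp a file, decoding it as ISO-8859-2
# regardless of the current locale.
sub read_latin2 {
    my ($path) = @_;
    open(my $fh, '<:encoding(iso-8859-2)', $path)
        or die "Cannot open $path: $!";
    local $/;    # slurp mode: read the whole file at once
    my $contents = <$fh>;
    close($fh);
    return $contents;
}

# Hypothetical helper wrapping the substitution quoted in the thread.
# Assumption: the replacement text is a single space.
sub strip_script_and_style {
    my ($contents) = @_;
    $contents =~ s/(<\s*script[^>]*>.*?<\s*\/script\s*>)|(<\s*style[^>]*>.*?<\s*\/style\s*>)/ /gsi;
    return $contents;
}

# Write a small ISO-8859-2 test file as raw bytes (made-up sample data).
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode($tmp);
print {$tmp} "<p>\xB3\xF3d\xBC</p><script>var y = 1;</script>";
close($tmp);

my $text = strip_script_and_style(read_latin2($path));

# Encode explicitly on output so the decoded characters print cleanly.
binmode(STDOUT, ':encoding(UTF-8)');
print "$text\n";
```

The pragma form sets the default layer for every open() in its lexical scope, so it needs no changes to the individual open() calls in indexer.pl. The explicit layer shown here only affects the one filehandle.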
 
