Perl print behavior differs between Windows and Unix

jbilla2004

Hi all,

I'm getting some strange behavior with print in Perl: it seems to
mangle strings on Linux while printing (the strings are in
ISO-8859-15). Running the same script on WinXP (under Cygwin) yields
the expected, unmangled output. Any ideas? Writing the character as an
escape (\x{00e9}) makes the file produced on Linux look fine on Linux
(see the "10tha" décima entry below), but that file appears corrupted
on Windows, whereas the file created on Windows looks fine on both
Windows and Linux.

Appreciate any comments.

thanks!!
------------------------------
Script:

#!/usr/bin/perl -w

%NAME_ES = (
    "trlr"  => "tr\x{00e1}iler",    # "tráiler"
    "6th"   => "sexta",
    "7th"   => "s\x{00e9}ptima",    # "séptima"
    "10tha" => "d\x{00e9}cima",     # "décima"
    "10th"  => "décima",
    "15th"  => "quince",
);

open( TST, "> test.dat" ) || die "Unable to open test.dat";

foreach $k ( sort keys %NAME_ES ) {
    print TST "Key: $k\tValue: $NAME_ES{$k}\n";
    print "Key: $k\tValue: $NAME_ES{$k}\n";
}
close(TST);

--------
Output (Linux), opened using XEmacs on Linux:

Key: 10th Value: décima
Key: 10tha Value: décima
Key: 15th Value: quince
Key: 6th Value: sexta
Key: 7th Value: séptima
Key: trlr Value: tráiler

-----
Output (Linux), opened using XEmacs on Windows (transferred by FTP in binary mode):

Key: 10th Value: décima
Key: 10tha Value: décima
Key: 15th Value: quince
Key: 6th Value: sexta
Key: 7th Value: séptima
Key: trlr Value: tráiler

-----
Output (Windows), opened using XEmacs on Windows:

Key: 10th Value: décima
Key: 10tha Value: décima
Key: 15th Value: quince
Key: 6th Value: sexta
Key: 7th Value: séptima
Key: trlr Value: tráiler

-----
Output (Windows), opened using XEmacs on Linux (transferred by FTP in binary mode):

Key: 10th Value: décima
Key: 10tha Value: décima
Key: 15th Value: quince
Key: 6th Value: sexta
Key: 7th Value: séptima
Key: trlr Value: tráiler


---
perl -V (linux)

Summary of my perl5 (revision 5.0 version 8 subversion 0)
configuration:
Platform:
osname=linux, osvers=2.4.21-37.el.centos3.xfs.0smp,
archname=i386-linux-thread-multi
uname='linux sillage.bis.pasteur.fr 2.4.21-37.el.centos3.xfs.0smp
#1 smp tue oct 4 11:56:53 cest 2005 i686 athlon i386 gnulinux '
config_args='-des -Doptimize=-O2 -g -pipe -march=i386 -mcpu=i686
-Dmyhostname=localhost -Dperladmin=root@localhost -Dcc=gcc -Dcf_by=Red
Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux
-Dvendorprefix=/usr -Dsiteprefix=/usr
-Dotherlibdirs=/usr/lib/perl5/5.8.0 -Duseshrplib -Dusethreads
-Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db
-Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio
-Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less
-isr'
hint=recommended, useposix=true, d_sigaction=define
usethreads=define use5005threads=undef useithreads=define
usemultiplicity=define
useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=undef use64bitall=undef uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS
-DDEBUGGING -fno-strict-aliasing -I/usr/local/include
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
optimize='-O2 -g -pipe -march=i386 -mcpu=i686',
cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS
-DDEBUGGING -fno-strict-aliasing -I/usr/local/include
-I/usr/include/gdbm'
ccversion='', gccversion='3.2.3 20030502 (Red Hat Linux 3.2.3-53)',
gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
alignbytes=4, prototype=define
Linker and Libraries:
ld='gcc', ldflags =' -L/usr/local/lib'
libpth=/usr/local/lib /lib /usr/lib
libs=-lnsl -lgdbm -ldb -ldl -lm -lpthread -lc -lcrypt -lutil
perllibs=-lnsl -ldl -lm -lpthread -lc -lcrypt -lutil
libc=/lib/libc-2.3.2.so, so=so, useshrplib=true, libperl=libperl.so
gnulibc_version='2.3.2'
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic
-Wl,-rpath,/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE'
cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'


Characteristics of this binary (from libperl):
Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS
USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
Locally applied patches:
MAINT18379
Built under linux
Compiled at Dec 20 2005 14:22:45
@INC:
/usr/lib/perl5/5.8.0/i386-linux-thread-multi
/usr/lib/perl5/5.8.0
/usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.0
/usr/lib/perl5/site_perl
/usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.0
/usr/lib/perl5/vendor_perl
/usr/lib/perl5/5.8.0/i386-linux-thread-multi
/usr/lib/perl5/5.8.0
 
Sherm Pendley

> I'm getting some strange behavior with print in Perl: it seems to
> mangle strings on Linux while printing (the strings are in
> ISO-8859-15). Running the same script on WinXP (under Cygwin) yields
> the expected, unmangled output. Any ideas?
>
> open( TST, "> test.dat" ) || die "Unable to open test.dat";

Try using a PerlIO layer to specify the encoding in your output file:

open (TST, '>:encoding(iso-8859-15)', 'test.dat')
    or die "Unable to open test.dat: $!";

Without an explicit encoding layer, the filehandle uses the default
layers, which may differ between your Linux and Windows systems.
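
To see how much the layer matters, here is a minimal sketch that pushes the same string through two different encoding layers and prints the bytes that come out. It writes to in-memory filehandles, so no files are touched. The helper name bytes_via_layer is made up for this illustration. Which default your Linux box actually applies is an assumption on my part; one likely candidate is a UTF-8 locale, since Perl 5.8.0 is documented (in perl581delta) as switching filehandles to UTF-8 implicitly in that case.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: write $text through the named PerlIO encoding
# layer and return the resulting bytes as space-separated hex pairs.
sub bytes_via_layer {
    my ($encoding, $text) = @_;
    my $buffer = '';
    # In-memory filehandle: output lands in $buffer instead of a file.
    open(my $fh, ">:encoding($encoding)", \$buffer)
        or die "Unable to open in-memory handle: $!";
    print {$fh} $text;
    close($fh) or die "Unable to close in-memory handle: $!";
    return join ' ', unpack('(H2)*', $buffer);
}

my $word = "d\x{00e9}cima";    # "décima", written with a hex escape

print "iso-8859-15: ", bytes_via_layer('iso-8859-15', $word), "\n";
print "UTF-8:       ", bytes_via_layer('UTF-8', $word), "\n";
```

The ISO-8859-15 layer writes the accented letter as the single byte e9, while the UTF-8 layer writes it as the two bytes c3 a9. An editor that expects one encoding and gets the other will show exactly the kind of mangling you describe.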

See "perldoc PerlIO" for details.
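
If you want to check which layers a handle really has on each machine, the following sketch may help. It assumes your perl provides PerlIO::get_layers (described in perldoc PerlIO; an older perl may not have it). The helper name layers_of is made up for this illustration, and the exact layer names printed will vary by platform and build.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the list of PerlIO layers on a filehandle.
sub layers_of {
    my ($fh) = @_;
    return PerlIO::get_layers($fh);
}

# A handle opened with an explicit encoding layer (in-memory, no file).
my $buffer = '';
open(my $fh, '>:encoding(iso-8859-15)', \$buffer)
    or die "Unable to open in-memory handle: $!";

# Compare the default layers on STDOUT with the explicit ones on $fh.
print "STDOUT layers: ", join(' ', layers_of(\*STDOUT)), "\n";
print "handle layers: ", join(' ', layers_of($fh)), "\n";

close($fh) or die "Unable to close in-memory handle: $!";
```

Running this on both systems and comparing the STDOUT line should show whether the defaults differ.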

sherm--
 
