Data::Dumper vs. UTF-8, as usual

jidanni

Gentlemen, I need to use
use utf8;
use open qw/:std :encoding(utf8)/;
in my program, but it has the side effect of causing
print Dumper "龔";
to print
$VAR1 = "\x{9f94}";
instead of
$VAR1 = "龔";
like it would otherwise. I dare not touch the 'use' stuff, so how can I
tweak this?:
use strict;
use warnings FATAL => 'all';
use open qw/:std :encoding(utf8)/;
use utf8;
use Data::Dumper;
print Dumper "龔";
 
Ilya Zakharevich

jidanni said:
Gentlemen, I need to use
use utf8;
use open qw/:std :encoding(utf8)/;
in my program, but it has the side effect of causing
print Dumper "?";
to print
$VAR1 = "\x{9f94}";
instead of
$VAR1 = "?";
like it would otherwise. I dare not touch the 'use' stuff, so how can I
tweak this?:
use strict;
use warnings FATAL => 'all';
use open qw/:std :encoding(utf8)/;
use utf8;
use Data::Dumper;
print Dumper "?";

What are you using for editing your files? Are you sure you use a
real question mark? Check with
od -tx1a -Ax your_script.pl

I see no problem here with 5.8.8,
Ilya
 
Peter J. Holzer

Ilya Zakharevich said:
What are you using for editing your files? Are you sure you use a
real question mark?

The only question mark in jidanni's posting was at the end of "how can I
tweak this?". The character jidanni wants to be displayed is a CJK
character: http://unicode.org/cgi-bin/GetUnihanData.pl?codepoint=9F94

Ilya Zakharevich said:
I see no problem here with 5.8.8,

I see a problem with your newsreader ;-).


Unfortunately I don't know a solution for the OP's problem. This may be
a case where writing a custom dumping routine (and uploading it to CPAN)
may be worthwhile.

hp
 
Xho Jingleheimerschmidt

jidanni said:
Gentlemen, I need to use
use utf8;
use open qw/:std :encoding(utf8)/;
in my program, but it has the side effect of causing
print Dumper "龔";

Without "use utf8", Perl interprets that character as just three bytes
and prints those three bytes; it is your terminal that converts them
back into the character that you see. If you were to print the
length, rather than the output of Dumper, you would see the difference
that "use utf8" makes.
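
The length difference described above can be seen with a small sketch. To stay independent of how this file itself is encoded, it assumes the three UTF-8 bytes of U+9F94 are written out as hex escapes and decoded with the core Encode module, which stands in for what "use utf8" does to a literal in the source:

```perl
use strict;
use warnings;
use Encode qw(decode);

# The three UTF-8 bytes that encode U+9F94. This is how Perl sees the
# literal when "use utf8" is absent: a string of three separate bytes.
my $bytes = "\xE9\xBE\x94";

# Decoding those bytes gives the single character that the same literal
# becomes under "use utf8".
my $chars = decode('UTF-8', $bytes);

printf "bytes: length %d\n", length $bytes;
printf "chars: length %d, code point U+%04X\n", length $chars, ord $chars;
```

The byte string is what gets passed through to the terminal untouched; the one-character string is what Data::Dumper renders as an escape.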

As far as I can tell, the "use open" part makes no difference, other
than to silence a warning about wide characters.

jidanni said:
to print
$VAR1 = "\x{9f94}";
instead of
$VAR1 = "龔";
like it would otherwise. I dare not touch the 'use' stuff, so how can I
tweak this?:

Without writing your own version of Data::Dumper (or extending/fixing
the current one), or doing something basically equivalent, I don't see
how you can. However, my version of Data::Dumper is rather old; maybe
it has already been tweaked in the meantime. It could use something
like $Data::Dumper::Useutf8.
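
One way to do "something basically equivalent" without touching Data::Dumper is to post-process its output. The following is only a sketch: dump_readable is a hypothetical helper name, not part of Data::Dumper, and it assumes the dump is meant for human eyes, since the result is no longer guaranteed to eval back to the same data unless it is read as UTF-8 source.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper (not part of Data::Dumper): dump a value, then turn
# each \x{...} escape in the dump text back into the character it names.
sub dump_readable {
    my ($value) = @_;
    my $text = Dumper($value);
    # Only convert an escape whose backslash is not itself escaped, so a
    # literal backslash followed by "x{41}" in the data is left alone.
    $text =~ s/(?<!\\)((?:\\\\)*)\\x\{([0-9a-fA-F]+)\}/$1 . chr(hex $2)/ge;
    return $text;
}

# The result contains wide characters, so STDOUT needs an encoding layer.
# (In the original program, "use open qw/:std :encoding(utf8)/" already
# provides one.)
binmode(STDOUT, ':encoding(UTF-8)');
print dump_readable("\x{9f94}");
```

With the OP's "use" lines in place, print dump_readable($data) would be used where print Dumper $data was used before.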

Xho
 
Peter J. Holzer

jidanni said:
Gentlemen, I need to use
use utf8;
use open qw/:std :encoding(utf8)/;
in my program, but it has the side effect of causing
print Dumper "?"; [? was a Chinese character in the OP]
to print
$VAR1 = "\x{9f94}";
instead of
$VAR1 = "?";
like it would otherwise. I dare not touch the 'use' stuff, so how can I
tweak this?:
[...]

Peter J. Holzer said:
Unfortunately I don't know a solution for the OP's problem. This may be
a case where writing a custom dumping routine (and uploading it to CPAN)
may be worthwhile.

Forgot to add: it also depends very much on what Data::Dumper is used
for in the OP's script. Is the output supposed to be readable by humans
or by other programs? Is the output used only for debugging, or
is it part of the "real" output of the program?

hp
 
Ilya Zakharevich

Peter J. Holzer said:
The only question mark in jidanni's posting was at the end of "how can I
tweak this?". The character jidanni wants to be displayed is a CJK
character: http://unicode.org/cgi-bin/GetUnihanData.pl?codepoint=9F94

I see a problem with your newsreader ;-).

I do not see any problem with it. It is told that the TTY understands
latin-1, and performs accordingly. The real problem is with wetware -
I could have guessed that this question mark is not \x3f...

Thanks,
Ilya
 
