XML::DOM Encoding UTF-8 and ISO-8859-1

Addy

I'm a little confused as to why I'm getting these results. Consider
the XML file:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<foo>
<string>Sécurité</string>
</foo>

Through a CGI script, I load up the file, grab the encoding and put in
the CGI header:

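use CGI qw(header);
use XML::DOM;

# The class name is case-sensitive: XML::DOM::Parser, not XML::DOM::parser.
my $parser   = XML::DOM::Parser->new;
my $doc      = $parser->parsefile('foo.xml');
my $encoding = $doc->getXMLDecl()->getEncoding();
print header(-charset => $encoding);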

However, when I traverse the XML and print out the above
"string" element, I see garbled text like "Sécurité".

If I change the CGI header encoding to UTF-8 like so:

print header(-charset => 'UTF8');

The text shows up properly. It would seem to me that the text would
show up properly by using the same encoding on the HTML page as is in
the XML file. This is the case when using other encodings, namely
'x-sjis-cp932'.

Could someone help me understand what I'm overlooking?

Thank you,
Addy
 
Alan J. Flavell

However, when I traverse the XML and print out the above
"string" element, I see garbled text like "Sécurité".

My hunch is that you're using Perl 5.8.0 under RedHat9 - or some
similar environment where the locale setting implies utf-8.

A search for those key points in previous discussions should get
you on the way to understanding the problem.
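
To make the likely mechanism concrete, here is a minimal sketch using only
the core Encode module. It assumes the script's output ends up UTF-8
encoded (as it would if STDOUT carries a UTF-8 layer, which Perl 5.8.0
applies implicitly under a UTF-8 locale) while the header still announces
ISO-8859-1. The variable names are made up for illustration, and the
string is hard-coded rather than read from foo.xml.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# The character string as a parser would hand it back (e-acute is U+00E9).
my $text = "S\x{E9}curit\x{E9}";

# The bytes the browser receives if the output is UTF-8 encoded.
my $utf8_bytes = encode('UTF-8', $text);

# What the browser displays if the header claims ISO-8859-1:
# each UTF-8 byte is treated as one Latin-1 character.
my $misread = decode('ISO-8859-1', $utf8_bytes);

# Encoding explicitly into the declared charset keeps header and body in step.
my $latin1_bytes = encode('ISO-8859-1', $text);

# Print hex dumps so the result does not depend on the terminal's encoding.
printf "UTF-8 bytes:   %s\n",
    join ' ', map { sprintf '%02X', ord } split //, $utf8_bytes;
printf "Latin-1 bytes: %s\n",
    join ' ', map { sprintf '%02X', ord } split //, $latin1_bytes;
printf "Misread as %d characters instead of %d\n",
    length($misread), length($text);
```

In a CGI script the same idea would mean either declaring UTF-8 in the
header, or passing each string through encode() with the charset named in
the header before printing it.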
 
