LWP and UTF-8 encoding

peter.hrobar

I have a short Perl script that, among other things, tries to fetch web
pages using LWP::UserAgent.

Here is an excerpt from the code:

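use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

my $ua = LWP::UserAgent->new;

# Header values are kept on one line so no newline ends up inside them.
$ua->default_headers->push_header('Accept-Charset' => 'iso-8859-1,iso-8859-2,utf-8');
$ua->default_headers->push_header('Accept' => 'text/html, text/plain, image/*');

# $url is set elsewhere in the script.
my $req = HTTP::Request->new(GET => $url);
my $res = $ua->request($req);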

I'm using $res->content to get the content of the retrieved page.
With some web sites the script produces the following warnings, and I
don't know why I'm getting these messages:

Parsing of undecoded UTF-8 will give garbage when decoding entities at
/usr/lib/perl5/vendor_perl/5.8.6/LWP/Protocol.pm line 114.
Parsing of undecoded UTF-8 will give garbage when decoding entities at
/usr/lib/perl5/site_perl/5.8.6/i386-linux-thread-multi/HTML/PullParser.pm
line 83.
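
For context, this warning is documented by HTML::Parser: it is raised when the parser is fed raw UTF-8 octets rather than decoded characters while it is also asked to decode entities. $res->content returns the raw octets of the response body. Below is a minimal sketch of the difference between octets and decoded characters, using only the core Encode module. The sample string stands in for a page body and is an assumption; the affected sites are not identified in the post.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw UTF-8 octets, standing in for what $res->content returns
# (hypothetical sample data, not taken from a real page).
my $bytes = "caf\xC3\xA9";

# Decode the octets into Perl characters before passing them to a parser.
my $chars = decode('UTF-8', $bytes);

printf "octets: %d, characters: %d\n", length($bytes), length($chars);
# prints "octets: 5, characters: 4"
```

Newer libwww-perl releases also provide $res->decoded_content, which does this decoding based on the response's declared charset. Whether it is available depends on the installed version.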
 
