Francesco Moi
Hi.
I want to parse the XML content at this URL:
http://news.search.yahoo.com/news/rss?va=linux (this is an RSS file)
I tried with:
----------------
use LWP::Simple qw(get);
use HTML::TokeParser;

my $Url = "http://news.search.yahoo.com/news/rss?va=linux";
my $content = get($Url);
my $parser = HTML::TokeParser->new(\$content);
my ($title, $link, $description);

while (my $token = $parser->get_token) {
    my $tag_type = shift @{ $token };
    # 'S' marks a start tag
    if ($tag_type eq 'S') {
        my ($tag, $attr, $attrseq, $rawtxt) = @{ $token };
        if ($tag eq 'title') { $title = $parser->get_trimmed_text("/title"); }
        if ($tag eq 'link')  { $link  = $parser->get_trimmed_text("/link"); }
        if ($tag eq 'description') {
            $description = $parser->get_trimmed_text("/description");
            print "$title - $link - $description\n\n";
        }
    }
}
------------
But I get this output:
---------
<![CDATA[Foo_Title]]> - Foo_Url -
--------
"<![CDATA" appears in the title (I have no idea what it means), and the
description comes out empty.
However if I substitute
"http://news.search.yahoo.com/news/rss?va=linux" with
"http://www.boingboing.net/index.rdf", it works OK.
What am I doing wrong? Regards.