Bontina Chen
Hi
I'm using Hpricot to parse the following file:
<item rdf:about="http://del.icio.us/url/50666d1a3fe2b942b20819ec2919d2b7#morwyn">
  <title>[from morwyn] * HTML for the Conceptually Challenged</title>
  <link>http://del.icio.us/url/50666d1a3fe2b942b20819ec2919d2b7#morwyn</link>
  <description>HTML for the Conceptually Challenged. Very basic tutorial,
plainly worded for people who hate to read instructions.</description>
  <dc:creator>morwyn</dc:creator>
  <dc:date>2006-10-10T07:28:28Z</dc:date>
  <dc:subject>html imported webpagedesign</dc:subject>
  <taxo:topics>
    <rdf:Bag>
      <rdf:li resource="http://del.icio.us/tag/imported" />
      <rdf:li resource="http://del.icio.us/tag/html" />
      <rdf:li resource="http://del.icio.us/tag/webpagedesign" />
    </rdf:Bag>
  </taxo:topics>
</item>
I'm trying to get the content of <dc:subject> like this:
doc = Hpricot.parse(File.read("965.xhtml"))
(doc/"item").each do |t|
  puts (t/"dc:subject").innerTEXT
end
but I got:
<dc:subject>html internet tutorial web</dc:subject>
when I only need "html internet tutorial web".
Does anyone know the right method to call?
Thanks