html xml extractor

FC

Marco said:
Hi,

I am searching for a tool that extracts information from an HTML page
and formats it as XML.

For instance, for this page:
http://money.guardian.co.uk/pensions/story/0,6453,993138,00.html

I would like to get an XML file
with a <title> containing the title of the article
with a <text> containing the text of the article
with an <author> containing the author of the article

Do you know such a tool?

Marco


There is a tool called HTML Tidy which, if I am not mistaken, converts HTML
into XHTML.
Once the page is well-formed XHTML you can pull out the parts you need with
any XML tool; that part is up to you.
Search for "HTML Tidy".
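
If you would rather do the extraction step yourself, here is a minimal sketch in Python using only the standard library. It is not HTML Tidy and it is only an illustration: the names ArticleParser and html_to_xml are made up for this example, and it assumes the author is given in a <meta name="author"> tag and the article text sits in <p> elements. The real page may be laid out differently, so the selection rules would need adjusting.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class ArticleParser(HTMLParser):
    """Hypothetical helper: collects <title>, a <meta name="author"> value,
    and the text of every <p> element."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.author = ""
        self.paragraphs = []
        self._in_title = False
        self._in_p = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "p":
            # Start a new paragraph, even if the previous <p> was never closed.
            self._in_p = True
            self.paragraphs.append("")
        elif tag == "meta" and (attrs.get("name") or "").lower() == "author":
            # Assumption: the author is exposed through a meta tag.
            self.author = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "p":
            self._in_p = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_p:
            self.paragraphs[-1] += data


def html_to_xml(html):
    """Return an <article> XML string with <title>, <author> and <text>."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()

    article = ET.Element("article")
    ET.SubElement(article, "title").text = parser.title.strip()
    ET.SubElement(article, "author").text = parser.author.strip()
    ET.SubElement(article, "text").text = "\n".join(
        p.strip() for p in parser.paragraphs if p.strip()
    )
    # ElementTree escapes &, < and > in the text for us.
    return ET.tostring(article, encoding="unicode")


if __name__ == "__main__":
    sample = (
        "<html><head><title>Pensions</title>"
        "<meta name='author' content='Jane Doe'></head>"
        "<body><p>First paragraph.</p><p>Second paragraph.</p></body></html>"
    )
    print(html_to_xml(sample))
```

The sample string at the bottom is invented test input, not the content of the page above. For a real page you would read the HTML from a file or download it first and pass the text to html_to_xml.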

Bye,
Flavio
 
