Parsing HTML document, how?

George K

This is what my program should do: you give it the URL of a page and a
template file, it downloads that page, and then it uses the template file
to extract and return some information.

My idea was for the template file to contain a regex, so that my program
just does re.search(template, htmlpage). This would work, except that the
HTML document has characters like ? and * that I would need to escape in
the template, so this solution isn't practical. What is a better way to
accomplish what I want? Does Python have a standard library for this?
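
To make the escaping problem concrete, here is a minimal sketch of one way around it, assuming the template marks the parts to extract with a {{name}} placeholder. That placeholder syntax and the template_to_regex helper are made up for illustration. The literal HTML between placeholders goes through re.escape, so ? and * no longer need escaping by hand:

```python
import re

def template_to_regex(template):
    """Compile a template with {{name}} placeholders into a regex.

    The {{name}} syntax is a hypothetical convention, not a standard one.
    """
    # re.split with one capturing group alternates: literal, name, literal, ...
    parts = re.split(r"\{\{(\w+)\}\}", template)
    pattern = []
    for i, part in enumerate(parts):
        if i % 2 == 0:
            # Literal HTML: escape regex metacharacters such as ? and *
            pattern.append(re.escape(part))
        else:
            # Placeholder: capture lazily under the placeholder's name
            pattern.append("(?P<%s>.*?)" % part)
    return re.compile("".join(pattern), re.DOTALL)

html_page = '<a href="/search?q=*">Result: 42 items</a>'
template = '<a href="/search?q=*">Result: {{count}} items</a>'

match = template_to_regex(template).search(html_page)
print(match.group("count") if match else None)
```

This still matches the page text character for character, so it only works when the template mirrors the page's markup exactly, whitespace included.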

The parsing has to be dynamic, driven by the template file; the URLs are
not fixed.
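
On the standard-library side, Python 3 ships html.parser, which parses the markup instead of matching raw text. Below is a rough sketch of how a template could drive it, assuming the template boils down to a tag name plus the attributes to look for. The TextCollector class, the extract helper and the sample markup are all invented for illustration:

```python
from html.parser import HTMLParser

class TextCollector(HTMLParser):
    """Collect the text inside every <tag> carrying the wanted attributes."""

    def __init__(self, tag, attrs=None):
        super().__init__()
        self.tag = tag
        self.wanted = dict(attrs or {})
        self.depth = 0        # > 0 while inside a matching element
        self.results = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            # Track nested elements of the same tag so the end tags pair up
            if tag == self.tag:
                self.depth += 1
        elif tag == self.tag and self.wanted.items() <= dict(attrs).items():
            self.depth = 1
            self.results.append("")

    def handle_endtag(self, tag):
        if self.depth and tag == self.tag:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.results[-1] += data

def extract(html, tag, attrs=None):
    # tag and attrs would come from the template file at runtime
    parser = TextCollector(tag, attrs)
    parser.feed(html)
    return parser.results

page = '<p>Cost: <span class="price">9.99?*</span> <span>other</span></p>'
print(extract(page, "span", {"class": "price"}))
```

Because the tag and attributes are plain data, characters like ? and * in the page need no escaping at all.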
 
