xml.etree - why no HTMLTreeBuilder included?

Jon P.

It is great that Fredrik Lundh's ElementTree is now a part of the
Python Standard Library.

However, is it correct that if you want to use xml.etree.ElementTree
to parse an HTML document, you have to install a separate
HTMLTreeBuilder (e.g. TidyHTMLTreeBuilder), and that the only
TreeBuilder that comes with the Standard Library is the one for
XML source?

Seems like some kind of HTMLTreeBuilder ought to be included by
default.

For a script I'm writing that deals with HTML, I thought I could
jettison lxml and use xml.etree instead. But since I would have to
ask the end user to install an external library anyway, even with
xml.etree, I switched back to lxml.
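
To make the idea concrete, here is a rough sketch of what a built-in HTML tree builder might look like using only the standard library. It assumes Python 3, where html.parser.HTMLParser is available, and feeds its events into xml.etree.ElementTree.TreeBuilder. The class name SimpleHTMLTreeBuilder and the VOID_TAGS set are hypothetical names chosen for illustration, not anything shipped with Python. It also assumes the document has a single root element and does none of the repair work that Tidy or lxml do on badly broken markup.

```python
from html.parser import HTMLParser
from xml.etree.ElementTree import TreeBuilder

# HTML void elements never take a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class SimpleHTMLTreeBuilder(HTMLParser):
    """Hypothetical helper: turns html.parser events into an Element tree."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._builder = TreeBuilder()
        self._open = []  # names of the elements that are currently open

    def handle_starttag(self, tag, attrs):
        # Valueless attributes (e.g. "disabled") arrive with value None.
        attrib = {k: (v if v is not None else "") for k, v in attrs}
        self._builder.start(tag, attrib)
        if tag in VOID_TAGS:
            self._builder.end(tag)  # close immediately, nothing to nest
        else:
            self._open.append(tag)

    def handle_endtag(self, tag):
        if tag not in self._open:
            return  # stray closing tag: ignore it
        # Close any elements left open inside this one, then the element itself.
        while self._open:
            name = self._open.pop()
            self._builder.end(name)
            if name == tag:
                break

    def handle_data(self, data):
        if self._open:  # skip text that falls outside the root element
            self._builder.data(data)

    def close(self):
        super().close()  # flush any text still buffered in the parser
        while self._open:  # close whatever the document left open
            self._builder.end(self._open.pop())
        return self._builder.close()


# Sloppy HTML: <br> has no end tag and <a> is never closed.
parser = SimpleHTMLTreeBuilder()
parser.feed("<html><body><p>Hello<br>world "
            "<a href='x.html'>link</p></body></html>")
root = parser.close()
print(root.find(".//a").get("href"))
```

The result is an ordinary Element, so find(), iter() and the rest of the ElementTree API work on it as usual. For real-world tag soup, a dedicated parser such as lxml is still likely the safer choice.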
 
