HTML parsing with Xerces


Hans Bijvoet

Hello,
I'm trying to parse an HTML document with the SAX parser from Xerces.
The parser throws a fatal error when attribute values in the document are
not surrounded by quotes.
How can I prevent this behaviour?
Greetings,
Hans
 

Stanimir Stamenkov

/Hans Bijvoet/:
I'm trying to parse an HTML document with the SAX parser from Xerces.
The parser throws a fatal error when attribute values in the document are
not surrounded by quotes.
How can I prevent this behaviour?

Which Xerces? HTML is not XML-compatible, so an XML parser is right to
reject unquoted attribute values. You could try a parser configuration
that uses an HTML scanner instead. For Java there's one in Andy Clark's
CyberNeko Tools for XNI (I haven't tried it myself, though):

http://www.apache.org/~andyc/neko/doc/html/index.html

You may also consider posting Xerces-specific questions to the
Xerces User mailing list:

http://xml.apache.org/mail.html#xerces-j-user
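
To illustrate the point about HTML not being XML-compatible, here is a minimal sketch that uses only the SAX parser bundled with the JDK (itself derived from Xerces). It feeds the parser one snippet with a quoted attribute value and one without. The class and method names (AttributeDump, attributeValues, acceptedByXmlParser) are made up for this example. The comment about plugging in NekoHTML is an assumption: it supposes the NekoHTML jar is on the classpath and that its org.cyberneko.html.parsers.SAXParser can be used as a SAX XMLReader, which I have not verified here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class AttributeDump {

    /** Parses the markup with the given SAX reader and returns every attribute value seen. */
    static List<String> attributeValues(XMLReader reader, String markup)
            throws IOException, SAXException {
        List<String> values = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                for (int i = 0; i < atts.getLength(); i++) {
                    values.add(atts.getValue(i));
                }
            }
        };
        reader.setContentHandler(handler);
        // DefaultHandler.fatalError rethrows the SAXParseException,
        // so well-formedness errors surface as exceptions from parse().
        reader.setErrorHandler(handler);
        reader.parse(new InputSource(new StringReader(markup)));
        return values;
    }

    /** Returns true if the JDK's strict XML SAX parser accepts the markup. */
    static boolean acceptedByXmlParser(String markup) {
        try {
            XMLReader strict = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            // Assumption: with NekoHTML on the classpath, an instance of
            // org.cyberneko.html.parsers.SAXParser could be passed to
            // attributeValues(...) here instead of the strict reader.
            attributeValues(strict, markup);
            return true;
        } catch (SAXException e) {
            // Includes SAXParseException for markup that is not well-formed XML.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(acceptedByXmlParser("<a href=\"x.html\">link</a>"));
        System.out.println(acceptedByXmlParser("<a href=x.html>link</a>"));
    }
}
```

The unquoted case fails because it is a well-formedness error in XML, and a conforming XML parser has no switch to tolerate it. That is why the suggestion above is to swap the scanner rather than to look for a Xerces feature flag.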
 
