Urs Muntwyler
Hi there
I have to check whether the content of a file is a well-formed XML
document. Since the XML documents can be large, I'm using SAX to
perform this task.
Using Java, my code looks roughly like this:
import java.io.File;
import java.io.IOException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public void checkForWellFormedness(File file)
{
    SAXParser saxParser;
    DefaultHandler dh;
    // init parser
    try {
        SAXParserFactory spfactory = SAXParserFactory.newInstance();
        saxParser = spfactory.newSAXParser();
        dh = new DefaultHandler();
    }
    catch(Exception e) {
        System.out.println("Cannot initialize SAX parser.");
        e.printStackTrace();
        return; // without a parser there is nothing to check
    }
    // parse the XML document using SAX parser
    try {
        saxParser.parse(file, dh); // SAXException, IOException
    }
    catch(SAXException se) { // (*)
        // only invoked in case of fatalError()
        // what if error() occurs? Is the XML document well-formed?
        System.out.println("Document is not well-formed.");
        se.printStackTrace();
    }
    catch(IOException ioe) {
        System.out.println("Cannot read file.");
        ioe.printStackTrace();
    }
}
In the above code, you can see that I'm using the DefaultHandler
class, which implements the ErrorHandler interface. In this class, the
default implementation of fatalError() throws a SAXParseException,
which I'm catching in the catch block marked (*) above. The default
implementation of error() does nothing, i.e. it does not throw a
SAXParseException. (I know what to do if I want that behaviour:
subclass DefaultHandler and override the method.)
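To make the alternative concrete, the subclass I have in mind would look roughly like the sketch below. The names StrictHandler, StrictHandlerDemo and parsesCleanly are made up for this post, and it assumes the default, non-validating parser from SAXParserFactory. It only shows how error() could be turned into an exception; whether that is needed for a well-formedness check is exactly what I am unsure about.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: same as DefaultHandler, except that error()
// rethrows the exception instead of silently ignoring it.
class StrictHandler extends DefaultHandler {
    @Override
    public void error(SAXParseException e) throws SAXException {
        throw e;
    }
}

// Hypothetical helper class, only here to exercise the handler.
class StrictHandlerDemo {
    // Returns true if the parser reports neither error() nor fatalError().
    static boolean parsesCleanly(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new StrictHandler());
            return true;
        } catch (SAXException e) {
            // reached via fatalError() (default behaviour) or error() (overridden)
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException("Could not set up or read input", e);
        }
    }
}
```

With this handler, anything reported through error() ends up in the same catch block as a fatalError().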
My question is simple: what does it mean for an XML document to be
well-formed? Does it mean that no fatalError() occurs during
parsing (in which case the above code is OK), or is an XML document
well-formed only if neither fatalError() nor error() occurs? (In that
case I would have to subclass DefaultHandler and override error() to
throw a SAXParseException so that it is caught in the catch block (*).)
Thanks for your help!