Nomak
Hello,
I'm reading XML files with Xerces SAX2. The problem is that the strings come back as if they were 8-bit ASCII instead of UTF-8, even though the XML file declares UTF-8 as its encoding.
I googled a bit but couldn't find the standard way to read strings from XML in Java, so I'm asking here.
Here is my base code:
parserClassName = "org.apache.xerces.parsers.SAXParser";
....
XMLReader reader = null;
try {
    reader = XMLReaderFactory.createXMLReader(parserClassName);
} catch (Exception ex) {
    ex.printStackTrace();
}
try {
    try {
        // Turn on DTD validation; report the problem if the parser refuses
        reader.setFeature("http://xml.org/sax/features/validation", true);
    } catch (SAXException ex) {
        ex.printStackTrace();
    }
    reader.setContentHandler(myContentHandler);
    reader.setErrorHandler(myErrorHandler);
    InputSource inputSource = new InputSource(xmlURI);
    System.err.println("encoding = " + inputSource.getEncoding());
    System.err.println("public id = " + inputSource.getPublicId());
    System.err.println("system id = " + inputSource.getSystemId());
    reader.parse(inputSource);
    // String charsetName = reader...getCharset();
} catch (Exception ex) {
    // parse() can throw IOException or SAXException
    ex.printStackTrace();
}
What must I add, remove, or modify to get my strings decoded properly?
TIA
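For anyone wanting to reproduce this, here is a minimal sketch of one way to check whether the parser itself decodes UTF-8 correctly. It is not taken from the code above: the class name Utf8SaxSketch, the method readText, and the sample document are made up for illustration, and it uses the JDK's default SAX parser through SAXParserFactory rather than naming the Xerces class explicitly. The assumption is that handing the parser a raw byte stream lets it pick up the encoding from the XML declaration, and that any garbling seen afterwards may come from how the strings are printed rather than from how they are parsed.

```java
import java.io.ByteArrayInputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class Utf8SaxSketch {

    /** Parses the given XML bytes and returns all character data, concatenated. */
    static String readText(byte[] xmlBytes) throws Exception {
        final StringBuilder text = new StringBuilder();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // SAX may deliver text in several chunks, so append each one
                text.append(ch, start, length);
            }
        });
        // A byte stream (not a Reader) leaves encoding detection to the parser,
        // which reads it from the XML declaration.
        reader.parse(new InputSource(new ByteArrayInputStream(xmlBytes)));
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document with one non-ASCII character (e with acute accent)
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>caf\u00e9</name>";
        String text = readText(xml.getBytes(StandardCharsets.UTF_8));

        // Print through an explicit UTF-8 stream so the console's default
        // charset does not distort the result.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(text);
        // The length counts Java chars, so a correctly decoded accented
        // letter contributes one char, not two.
        out.println("length = " + text.length());
    }
}
```

If the length matches the number of characters in the file but the printed text still looks wrong, the output side (console or file writer charset) is the more likely culprit than the parser.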