How to retrieve XML CDATA text contents by org.xml.sax.ext.DefaultHandler2?

RC

For example, I have an XML element:

<script>
<![CDATA[
My script is here
]]>
</script>

I am using org.xml.sax.ext.DefaultHandler2 to parse my XML
file. How do I retrieve my script contents?




What shall I do in these two methods?
@Override
public void startElement(String uri, String localName, String qName,
        Attributes attributes)
        throws SAXException
{
    if (qName.equals("script"))
    {
        // How to retrieve my script contents?
    }
}

@Override
public void endElement(String uri, String localName, String qName)
        throws SAXException
{
    if (qName.equals("script"))
    {
        // How to retrieve my script contents?
    }
}



The two methods below print nothing at all:
@Override
public void endCDATA()
{
    System.out.println("End of CDATA");
}

@Override
public void startCDATA()
{
    System.out.println("Start of CDATA");
}

Thank you very much in advance!
 
Lew

RC said:
For example I have a XML tag

<script>
<![CDATA[
My script is here
]]>
</script>

I am using org.xml.sax.ext.DefaultHandler2 to parse my XML
file. How do I retrieve my script contents?

Via the 'characters()' method.
What shall I do in these two methods?

Mark the beginning and end of each element so that your parser knows
where it is in the parse process.
@Override
public void startElement(String uri, String localName, String qName,
Attributes attributes)
throws SAXException
{
        if (qName.equals("script"))
        {
                // How to retrieve my script contents?

Not here. What do the Javadocs tell you about the purpose of this
method and the event it handles?
        }
}

@Override
public void endElement(String uri, String localName, String qName)
throws SAXException
{
        if (qName.equals("script"))
        {
                // How to retrieve my script contents?

Not here. What do the Javadocs tell you about the purpose of this
method and the event it handles?
        }

}

Below two methods have no print out at all

Did you read the Javadocs?
@Override
public void endCDATA()
{
        System.out.println("End of CDATA");
}

@Override
public void startCDATA()
{
        System.out.println("Start of CDATA");
}

The Javadocs will tell you:
The contents of the CDATA section will be reported through the regular
characters event; this event is intended only to report the boundary.

While not always enough, the API Javadocs are always a good place to
start, and often will completely answer your questions.
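
To make the above concrete, here is a minimal sketch of the usual pattern: set a flag in startElement(), accumulate text in characters(), and read the result in endElement(). The class name ScriptTextHandler and its parse() helper are made up for illustration, and the sketch assumes the JDK's default parser obtained through SAXParserFactory. Text is collected in a StringBuilder because a parser is allowed to deliver one run of text in several characters() calls.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical handler: collects the text content of the last <script> element seen.
class ScriptTextHandler extends DefaultHandler2 {
    private final StringBuilder text = new StringBuilder();
    private boolean inScript;
    private String script;

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) {
        if (qName.equals("script")) {
            inScript = true;      // start collecting
            text.setLength(0);    // discard text from any earlier element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // CDATA content arrives here like any other text, possibly in chunks.
        if (inScript) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("script")) {
            inScript = false;
            script = text.toString();   // the complete content is known only now
        }
    }

    String getScript() {
        return script;
    }

    // Illustrative helper: parses the given XML string and returns the script text.
    static String parse(String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            ScriptTextHandler handler = new ScriptTextHandler();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.getScript();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<script>\n<![CDATA[\nMy script is here\n]]>\n</script>";
        // The collected text includes the line breaks around the CDATA section,
        // so it is trimmed before printing.
        System.out.println(parse(xml).trim());
    }
}
```

Note that the whitespace between the tags and the CDATA markers is part of the element content too, which is why the example trims the result.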
 
Stanimir Stamenkov

Thu, 30 Apr 2009 12:02:36 -0400, /RC/:
For example I have a XML tag

<script>
<![CDATA[
My script is here
]]>
</script>

I am using org.xml.sax.ext.DefaultHandler2 to parse my XML
file. How do I retrieve my script contents?

You retrieve it as ordinary text content delivered through
'characters' events to your ContentHandler. Whether the text is
written as CDATA section (or not) in the source is purely a
syntactic detail which shouldn't bother you.
Below two methods have no print out at all
@Override
public void endCDATA()
{
    System.out.println("End of CDATA");
}

@Override
public void startCDATA()
{
    System.out.println("Start of CDATA");
}

Thank you very much in advance!

You need to set the "lexical-handler" [1] property of the parser
with the reference to your handler in addition to setting it as a
'contentHandler':

XMLReader parser;
DefaultHandler2 myHandler;
...
parser.setContentHandler(myHandler);
parser.setProperty("http://xml.org/sax/properties/"
        + "lexical-handler", myHandler);

[1] SAX2 Standard Handler and Property IDs
<http://www.saxproject.org/apidoc/org/xml/sax/package-summary.html>
 
Lew

Stanimir said:
You need to set the "lexical-handler" [1] property of the parser with
the reference to your handler in addition to setting it as a
'contentHandler':

Are you sure about that?
 
John B. Matthews

Lew said:
Stanimir said:
You need to set the "lexical-handler" [1] property of the parser
with the reference to your handler in addition to setting it as a
'contentHandler':

Are you sure about that?

I was surprised to see that the default value of lexical-handler is
unspecified [1]. On closer reading, I see that the LexicalHandler
interface is optional [2]. The API suggests setting the property and
handling any SAXNotRecognizedException to determine if the feature is
implemented.

[1]<http://www.saxproject.org/apidoc/org/xml/sax/package-summary.html>
[2]<http://www.saxproject.org/apidoc/org/xml/sax/ext/LexicalHandler.html>
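
As a rough sketch of that probing approach: try to set the property and treat SAXNotRecognizedException (or SAXNotSupportedException, which setProperty may also throw) as "not available". The class LexicalHandlerProbe and its method names are invented for this example; only the property URI and the exception types come from the SAX API.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper for checking whether a reader accepts a LexicalHandler.
class LexicalHandlerProbe {
    static final String LEXICAL_HANDLER =
            "http://xml.org/sax/properties/lexical-handler";

    // Returns true if the reader accepted the handler as its LexicalHandler.
    static boolean tryInstall(XMLReader reader, DefaultHandler2 handler) {
        try {
            reader.setProperty(LEXICAL_HANDLER, handler);
            return true;
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            // The parser does not implement the optional extension, or
            // cannot accept the value in its current state.
            return false;
        }
    }

    // Illustrative check against whatever parser SAXParserFactory hands out.
    static boolean defaultParserSupportsLexicalHandler() {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            return tryInstall(reader, new DefaultHandler2());
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("Could not create a SAX parser", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("lexical-handler supported: "
                + defaultParserSupportsLexicalHandler());
    }
}
```

If tryInstall() returns false, the handler can still be used as a ContentHandler; it simply will not receive startCDATA/endCDATA events.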
 
Stanimir Stamenkov

Mon, 04 May 2009 09:39:44 -0400, /Lew/:
Stanimir said:
You need to set the "lexical-handler" [1] property of the parser with
the reference to your handler in addition to setting it as a
'contentHandler':

Are you sure about that?

Yes. As you suggested yourself, you can consult the API docs I
linked to. A simple test will also show it. Note that I meant one
needs to set a "lexical-handler" only to detect CDATA section
boundaries, i.e. to receive 'startCDATA' and 'endCDATA' events, not
as a requirement for reading the content of CDATA sections (if that
wasn't clear).
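
A simple test along those lines might look like the sketch below. It parses the same document twice, once with the handler registered only as ContentHandler and once also as "lexical-handler", and records what arrives. The class name CdataBoundaryDemo and its fields are made up for illustration, and the sketch assumes a parser that supports the lexical-handler property, as the one bundled with the JDK does.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical test handler: records text and CDATA boundary events separately.
class CdataBoundaryDemo extends DefaultHandler2 {
    final StringBuilder text = new StringBuilder();
    final List<String> boundaries = new ArrayList<>();

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void startCDATA() {
        boundaries.add("startCDATA");
    }

    @Override
    public void endCDATA() {
        boundaries.add("endCDATA");
    }

    // Parses xml, optionally registering the handler as LexicalHandler too.
    static CdataBoundaryDemo parse(String xml, boolean registerLexicalHandler) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            CdataBoundaryDemo handler = new CdataBoundaryDemo();
            reader.setContentHandler(handler);
            if (registerLexicalHandler) {
                reader.setProperty(
                        "http://xml.org/sax/properties/lexical-handler", handler);
            }
            reader.parse(new InputSource(new StringReader(xml)));
            return handler;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<script><![CDATA[My script is here]]></script>";
        for (boolean lexical : new boolean[] {false, true}) {
            CdataBoundaryDemo result = parse(xml, lexical);
            System.out.println("lexical-handler set: " + lexical
                    + ", text: " + result.text
                    + ", boundaries: " + result.boundaries);
        }
    }
}
```

In both runs the text is delivered through characters(); only the boundary list differs.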
 
Lew

Stanimir said:
Note I've meant one needs to set a "lexical-handler"
only to detect CDATA section boundaries, i.e. to receive 'startCDATA'
and 'endCDATA' events, not as requirement to read the content of CDATA
sections (if that wasn't clear).

Thanks, that wasn't.
 
