org.w3c.dom.NodeList - empty nodes?

Michael Preminger · Apr 10, 2005

Hello!

The question is a bit lengthy (for completeness) but actually quite simple.

I have a very simple XML document I'm experimenting with:

<?xml version="1.0"?>
<metadata xmlns="http://purl.org/dc/elements/1.1/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://dublincore.org/schemas/xmls/simpledc20021212.xsd"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title>
UKOLN
</dc:title>
<dc:description>
UKOLN is a national focus of expertise in digital information
management. It provides policy, research and awareness services
to the UK library, information and cultural heritage communities.
UKOLN is based at the University of Bath.
</dc:description>
<dc:publisher>
UKOLN, University of Bath
</dc:publisher>
<dc:identifier>
http://www.ukoln.ac.uk/
</dc:identifier>
</metadata>

The root element is tagged <metadata>, and I am looping through its
childNodes.
Element docElem = document.getDocumentElement();
System.out.println("Document element: " + docElem.getNodeName());
NodeList nl = docElem.getChildNodes();

for (int i = 0; i < nl.getLength(); i++) {
    Node nd = nl.item(i);
    System.out.println(i + " " + nd);
}

Unexpectedly, I get the following output, where every even node seems
devoid of contents.
------------------------------------------------
Document element:metadata

0

1 <dc:title>
UKOLN
</dc:title>
2

3 <dc:description>
UKOLN is a national focus of expertise in digital information
management. It provides policy, research and awareness services
to the UK library, information and cultural heritage communities.
UKOLN is based at the University of Bath.
</dc:description>
4

5 <dc:publisher>
UKOLN, University of Bath
</dc:publisher>
6

7 <dc:identifier>
http://www.ukoln.ac.uk/
</dc:identifier>
8
---------------------------------------------------------------------------
I thought that the "void" nodes were the text nodes descended from the
<dc:> elements. (they have a NODE_TYPE 1).
When I descend into one of the nodes (dc:publisher):
if (i == 5) {
    NodeList nl5 = nd.getChildNodes();
    for (int k = 0; k < nl5.getLength(); k++) {
        System.out.println("k:" + k + " " + nl5.item(k));
    }
}
Then I actually get the text "UKOLN, University of Bath".
To me this means that the void even nodes are not the text nodes. (I get
nothing when I try to type-cast them into Text and run getData())

If so: what are they?
If they are the text nodes: why isn't their content printed to
standard output?

Thanks

Michael

Martin Honnen · Apr 10, 2005

Michael Preminger wrote:

I have a very simple XML document I'm experimenting with:

<?xml version="1.0"?>
<metadata xmlns="http://purl.org/dc/elements/1.1/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://dublincore.org/schemas/xmls/simpledc20021212.xsd"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title>
UKOLN
</dc:title>
<dc:description>
UKOLN is a national focus of expertise in digital information
management. It provides policy, research and awareness services
to the UK library, information and cultural heritage communities.
UKOLN is based at the University of Bath.
</dc:description>
<dc:publisher>
UKOLN, University of Bath
</dc:publisher>
<dc:identifier>
http://www.ukoln.ac.uk/
</dc:identifier>
</metadata>

NodeList nl = docElem.getChildNodes();

for (int i = 0; i < nl.getLength(); i++) {
    Node nd = nl.item(i);
    System.out.println(i + " " + nd);
}

Unexpectedly, I get the following output, where every even node seems
devoid of contents.
------------------------------------------------
Document element:metadata

0

1 <dc:title>
UKOLN
</dc:title>
2

I thought that the "void" nodes were the text nodes descended from the
<dc:> elements. (they have a NODE_TYPE 1).

No, what you see in the DOM are white space text nodes between the
element nodes e.g. if you have
<gods><god>Kibo</god><god>Xibo</god></gods>
then you have only element nodes, there is the document element node
(<gods>) and it has two child nodes which are again element nodes. But
usually for easier reading such XML is written as
<gods>
<god>Kibo</god>
<god>Xibo</god>
</gods>
and then the document element node (<gods>) has five child nodes: a text
node with white space, an element node (<god>), a text node with white
space, an element node (<god>), and a text node with white space.
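
To make the difference visible, here is a minimal sketch that parses both forms of the <gods> document with the default JAXP parser and lists the type of each child of the document element. The class name WhitespaceNodes and its helper methods are made up for illustration; only the javax.xml.parsers and org.w3c.dom calls come from the standard API.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original posts.
class WhitespaceNodes {

    // Parses an XML string with a default (non-validating) parser.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Lists the node type of every child of the document element.
    static String describeChildren(String xml) {
        NodeList children = parse(xml).getDocumentElement().getChildNodes();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < children.getLength(); i++) {
            short type = children.item(i).getNodeType();
            if (i > 0) {
                sb.append(' ');
            }
            if (type == Node.ELEMENT_NODE) {
                sb.append("ELEMENT");
            } else if (type == Node.TEXT_NODE) {
                sb.append("TEXT");
            } else {
                sb.append("OTHER");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String compact = "<gods><god>Kibo</god><god>Xibo</god></gods>";
        String indented = "<gods>\n<god>Kibo</god>\n<god>Xibo</god>\n</gods>";
        System.out.println(describeChildren(compact));  // ELEMENT ELEMENT
        System.out.println(describeChildren(indented)); // TEXT ELEMENT TEXT ELEMENT TEXT
    }
}
```

In a loop like the one in the question, checking nd.getNodeType() == Node.ELEMENT_NODE before processing a child is one simple way to skip the white space text nodes.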

John C. Bollinger · Apr 11, 2005

Martin said:
No, what you see in the DOM are white space text nodes between the
element nodes e.g. if you have
<gods><god>Kibo</god><god>Xibo</god></gods>
then you have only element nodes, there is the document element node
(<gods>) and it has two child nodes which are again element nodes. But
usually for easier reading such XML is written as
<gods>
<god>Kibo</god>
<god>Xibo</god>
</gods>
and then the document element node (<gods>) has five child nodes: a text
node with white space, an element node (<god>), a text node with white
space, an element node (<god>), and a text node with white space.

Exactly right. Note, however, that a parser operating in "validating"
mode may be able to avoid creating the text nodes in question, so you
cannot assume that they will always be there. A validating parser does
require a DTD / schema, however, so if there is none then you should
expect to see the extra nodes.
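
As a rough illustration of the validating case, the sketch below gives the <gods> document a small internal DTD (my own addition, not from the posts above) and asks JAXP to drop element content white space. The class name IgnorableWhitespace and the method countChildren are hypothetical; setValidating and setIgnoringElementContentWhitespace are standard DocumentBuilderFactory methods.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original posts.
class IgnorableWhitespace {

    // The internal DTD subset (an assumption for this example) declares that
    // <gods> may contain only <god> elements, so a validating parser can
    // tell that the line breaks between them are ignorable.
    static final String XML_WITH_DTD =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE gods [\n"
        + "  <!ELEMENT gods (god*)>\n"
        + "  <!ELEMENT god (#PCDATA)>\n"
        + "]>\n"
        + "<gods>\n<god>Kibo</god>\n<god>Xibo</god>\n</gods>";

    // Counts the children of the document element, optionally asking the
    // parser to validate and to leave out ignorable white space.
    static int countChildren(String xml, boolean dropIgnorableWhitespace) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(dropIgnorableWhitespace);
            factory.setIgnoringElementContentWhitespace(dropIgnorableWhitespace);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countChildren(XML_WITH_DTD, false));
        System.out.println(countChildren(XML_WITH_DTD, true));
    }
}
```

With the parser bundled in the JDK I would expect 5 for the first call and 2 for the second, but other JAXP implementations may behave differently, so treat the counts as something to verify rather than rely on.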

Note also that you are not assured that the entire text content of an
element node will be contained in a single text node, even if the
element contains nothing but text. Furthermore note that if you need to
be general in your handling of the DOM tree then you also need to worry
about CDATA nodes wherever you permit text, and even mixed CDATA nodes
and text.
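
A small sketch of that last point, using a made-up element whose content mixes plain text and a CDATA section. The class name MixedTextContent and its helpers are hypothetical; getTextContent() is the DOM Level 3 method on org.w3c.dom.Node, so this assumes a parser that supports it (any reasonably recent JDK does).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original posts.
class MixedTextContent {

    // Example input (an assumption): text, then a CDATA section, then text.
    static final String XML = "<god>Ki<![CDATA[bo & co]]> rest</god>";

    // Parses the string with a default parser and returns the root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Lists the node types of the element's direct children.
    static String childTypes(Element element) {
        NodeList children = element.getChildNodes();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < children.getLength(); i++) {
            short type = children.item(i).getNodeType();
            if (i > 0) {
                sb.append(' ');
            }
            if (type == Node.TEXT_NODE) {
                sb.append("TEXT");
            } else if (type == Node.CDATA_SECTION_NODE) {
                sb.append("CDATA");
            } else {
                sb.append("OTHER");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Element god = parseRoot(XML);
        System.out.println(childTypes(god));      // TEXT CDATA TEXT
        // getTextContent() concatenates text and CDATA children for us.
        System.out.println(god.getTextContent()); // Kibo & co rest
    }
}
```

Reading only the first child's data would return just "Ki" here, which is why code that needs the whole string is usually better off with getTextContent() or with a loop over all text and CDATA children. DocumentBuilderFactory.setCoalescing(true) is another option if you would rather have the parser merge CDATA sections into adjacent text nodes.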
