Is it possible with Xerces?

Manuel Yguel · Feb 18, 2004

I am trying to parse an indented XML file with the Xerces-C++ DOM parser.
The file looks like this:
<root>
  <child1>
    <field1> foo </field1>
    <field2> bar </field2>
  </child1>
  <child2>
    <field1> foo </field1>
    <field2> bar </field2>
  </child2>
</root>

The line breaks and the indentation are part of the XML file, so the
program I wrote with DOM gives me this tree:
root has five children:
text-node child1 text-node child2 text-node

the text of the first text-node is "\n "
the text of the second text-node is "\n "
the text of the third text-node is "\n"

These whitespace-only text nodes occur at every level of the tree hierarchy.

Is it possible to strip these nodes automatically?

XML standard question: does this XML code conform to the XML standard?

<child2> some text
  <field1> foo </field1>
  <field2> bar </field2>
</child2>

"some text" is at the same depth as field1 and field2, but it is text, so
the element contains a mixture of text and elements. I thought that text
had to be a leaf of the tree... So does this conform to the standard?

Thanks

Philippe Poulard · Feb 18, 2004

Manuel said:
Is it possible to strip these nodes automatically ?

Yes: there is an option that strips ignorable whitespace, but you must
supply a grammar that defines where whitespace is ignorable, like this:

<!ELEMENT root (child1,child2)>
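The idea that the grammar decides which whitespace is ignorable can be sketched without any XML library. The block below is only an illustration under stated assumptions: `Node`, `makeText`, `makeElement` and `stripIgnorableWhitespace` are hypothetical names, not part of Xerces-C++, and the set `elementOnly` stands in for the elements that a DTD declares with element-only content. In Xerces-C++ itself the corresponding switch should be `setIncludeIgnorableWhitespace(false)` on the DOM parser, which only has an effect when validation is enabled.

```cpp
#include <algorithm>
#include <memory>
#include <set>
#include <string>
#include <vector>

// Hypothetical minimal tree standing in for a DOM; not the Xerces-C++ API.
struct Node {
    bool isText = false;
    std::string name;  // element name (elements only)
    std::string text;  // character data (text nodes only)
    std::vector<std::unique_ptr<Node>> children;
};

std::unique_ptr<Node> makeText(const std::string& text) {
    std::unique_ptr<Node> n(new Node);
    n->isText = true;
    n->text = text;
    return n;
}

std::unique_ptr<Node> makeElement(const std::string& name) {
    std::unique_ptr<Node> n(new Node);
    n->name = name;
    return n;
}

// True if s contains only XML white space (space, tab, CR, LF).
// An empty string also counts as white space here.
bool isXmlWhitespace(const std::string& s) {
    return s.find_first_not_of(" \t\r\n") == std::string::npos;
}

// Removes whitespace-only text children, but only below elements listed in
// elementOnly, i.e. elements assumed to be declared with element content
// such as <!ELEMENT root (child1,child2)>. Everywhere else text is kept.
void stripIgnorableWhitespace(Node& node,
                              const std::set<std::string>& elementOnly) {
    if (node.isText) {
        return;
    }
    if (elementOnly.count(node.name) != 0) {
        auto& kids = node.children;
        kids.erase(std::remove_if(kids.begin(), kids.end(),
                                  [](const std::unique_ptr<Node>& c) {
                                      return c->isText &&
                                             isXmlWhitespace(c->text);
                                  }),
                   kids.end());
    }
    for (auto& child : node.children) {
        stripIgnorableWhitespace(*child, elementOnly);
    }
}
```

Calling `stripIgnorableWhitespace(root, {"root", "child1", "child2"})` on the tree from the question would leave root with only its two element children, while the text inside field1 and field2 stays untouched because those elements are not in the set.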

Manuel said:
XML standard question: does this XML code conform to the XML standard?

<child2> some text
  <field1> foo </field1>
  <field2> bar </field2>
</child2>

"some text" is at the same depth as field1 and field2, but it is text, so
the element contains a mixture of text and elements. I thought that text
had to be a leaf of the tree... So does this conform to the standard?

Yes: an element may contain:
- nothing (empty element)
- subelements
- text
- text and subelements (mixed content)
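The four cases above can be told apart by looking at an element's direct children. The following is a minimal sketch, assuming a hypothetical flattened `Child` record instead of real DOM nodes, and assuming that whitespace-only text is indentation rather than content (strictly, only a grammar can say whether such whitespace is significant).

```cpp
#include <string>
#include <vector>

// Hypothetical flattened view of one child of an element: either a text
// chunk or a child element. Not a Xerces-C++ type.
struct Child {
    bool isText;
    std::string value;  // the text, or the child element's name
};

enum class Content { Empty, ElementsOnly, TextOnly, Mixed };

// Classifies an element by its direct children. Whitespace-only text is
// ignored, which is a simplifying assumption of this sketch.
Content classify(const std::vector<Child>& children) {
    bool hasText = false;
    bool hasElement = false;
    for (const Child& c : children) {
        if (!c.isText) {
            hasElement = true;
        } else if (c.value.find_first_not_of(" \t\r\n") != std::string::npos) {
            hasText = true;
        }
    }
    if (hasText && hasElement) return Content::Mixed;
    if (hasElement) return Content::ElementsOnly;
    if (hasText) return Content::TextOnly;
    return Content::Empty;
}
```

With this model, the child2 element from the question (the text "some text" followed by field1 and field2) is classified as `Content::Mixed`, which is well-formed XML.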


--
Regards,

///
(. .)
-----ooO--(_)--Ooo-----
| Philippe Poulard |
-----------------------

Manuel Yguel · Feb 23, 2004

Philippe said:
Yes: there is an option that strips ignorable whitespace, but you must
supply a grammar that defines where whitespace is ignorable, like this:

<!ELEMENT root (child1,child2)>

Thanks, but how do you then use the grammar with the parser?

Philippe Poulard · Feb 24, 2004

Manuel said:
Thanks, but how do you then use the grammar with the parser?

Use the <!DOCTYPE> declaration in the XML document to attach the grammar
(DTD) to it. You should have a look at the XML specification.
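To make the <!DOCTYPE> advice concrete, here is a small sketch that builds the document from the question with the grammar embedded as an internal DTD subset. Only the declaration for root comes from this thread; the declarations for child1, child2, field1 and field2 are assumptions added so that the grammar is complete, and `withInternalDtd` and `exampleDocument` are hypothetical helper names. The resulting text would then be handed to a validating parser.

```cpp
#include <string>

// Hypothetical helper: prepends an XML declaration and a DOCTYPE with an
// internal DTD subset to a body whose root element is rootName.
std::string withInternalDtd(const std::string& rootName,
                            const std::string& declarations,
                            const std::string& body) {
    return "<?xml version=\"1.0\"?>\n<!DOCTYPE " + rootName + " [\n" +
           declarations + "]>\n" + body;
}

// Builds the indented document from the question together with its grammar.
std::string exampleDocument() {
    // Only the first declaration is quoted in the thread; the others are
    // assumed here to describe the remaining elements.
    const std::string dtd =
        "<!ELEMENT root (child1,child2)>\n"
        "<!ELEMENT child1 (field1,field2)>\n"
        "<!ELEMENT child2 (field1,field2)>\n"
        "<!ELEMENT field1 (#PCDATA)>\n"
        "<!ELEMENT field2 (#PCDATA)>\n";
    const std::string body =
        "<root>\n"
        "  <child1>\n"
        "    <field1> foo </field1>\n"
        "    <field2> bar </field2>\n"
        "  </child1>\n"
        "  <child2>\n"
        "    <field1> foo </field1>\n"
        "    <field2> bar </field2>\n"
        "  </child2>\n"
        "</root>\n";
    return withInternalDtd("root", dtd, body);
}
```

The name after <!DOCTYPE must match the root element. The grammar can also live in a separate file referenced with a SYSTEM identifier instead of the bracketed internal subset.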
