XML problem with special characters like "<" and ">"

  • Thread starter Christian Schmidbauer

Christian Schmidbauer

Hello!

I prepare my XML document this way:

-------------------------------------------------------
PrintWriter writer;
Document domDocument;
Element domElement;

// Root tag
domElement = domDocument.createElement ("ROOT_TAG");
domDocument.appendChild (domElement);

// XML from an external source as a "String"
Text data = domDocument.createTextNode (externalXML);
domElement.appendChild (data);

writer.println (...);
-------------------------------------------------------

As you can see, I create a normal Root-Node and then I get an XML
stream from an external source. For the external XML I use the
function "createTextNode" because it is a text in some way.

The problem is the output when I write all together to the PrintWriter
object. It looks like this for this example:

--------------------------------------------------------------
<?xml version="1.0" encoding="UTF-8"?>

<ROOT_TAG>

&lt;DATA&gt;
&lt;AFL&gt;
&lt;AFLNR&gt;XX&lt;/AFLNR&gt;
&lt;BENENNUNG&gt;MY TEST&lt;/BENENNUNG&gt;
&lt;LA_VER&gt;&lt;/LA_VER&gt;
&lt;FA_KR&gt;&lt;/FA_KR&gt;
&lt;POL_COD&gt;&lt;/POL_COD&gt;
&lt;FA_KZ&gt;&lt;/FA_KZ&gt;
&lt;G_KZ&gt;&lt;/G_KZ&gt;
&lt;AFL_KZ&gt;1&lt;/AFL_KZ&gt;
&lt;/AFL&gt;
&lt;/DATA&gt;
</ROOT_TAG>
--------------------------------------------------------------

Strange, isn't it!? The sign "<" is being replaced by "&lt;" and ">"
is being replaced by "&gt;", but only for the XML coming from the
external source.

Does anybody know this problem or can think of a solution? Should I
use a function other than "createTextNode", or do I have to change the
special characters manually?

Thank you for every hint!

Best regards,
Christian Schmidbauer
 
Andrew Thompson

The sign "<" is being replaced by "&lt;"

&lt; is the (proper) way to encode < if
you want it to appear in a web page/HTML.

That way the UA knows to treat it as a
presentational character, rather than the
opening char of an HTML tag.
Strange, isn't it!?

No.
 
Roedy Green

&lt; is the (proper) way to encode < if
you want it to appear in a web page/HTML.

That way the UA knows to treat it as a
presentational character, rather than the
opening char of an HTML tag.

But perhaps thoughtless or inconsiderate. Ideally you would arrange
things so that quoting would almost never be needed.

The problem is HTML grew up without ever knowing it would be merged
with Java. If we had this all to do over, HTML would use some rare
character to mark its tags like ~ or ` or !. Alternatively you would
generate your HTML with methods. It would not have reserved
characters.
 
zoopy

Hello!

I prepare my XML document this way:

-------------------------------------------------------
PrintWriter writer;
Document domDocument;
Element domElement;

// Root tag
domElement = domDocument.createElement ("ROOT_TAG");
domDocument.appendChild (domElement);

// XML from an external source as a "String"
Text data = domDocument.createTextNode (externalXML);
domElement.appendChild (data);

writer.println (...);
-------------------------------------------------------

As you can see, I create a normal Root-Node and then I get an XML
stream from an external source. For the external XML I use the
function "createTextNode" because it is a text in some way.
                                               ^^^^^^^^^^^


The problem is the output when I write all together to the PrintWriter
object. It looks like this for this example:

--------------------------------------------------------------
<?xml version="1.0" encoding="UTF-8"?>

<ROOT_TAG>

&lt;DATA&gt;
&lt;AFL&gt;
&lt;AFLNR&gt;XX&lt;/AFLNR&gt;
&lt;BENENNUNG&gt;MY TEST&lt;/BENENNUNG&gt;
&lt;LA_VER&gt;&lt;/LA_VER&gt;
&lt;FA_KR&gt;&lt;/FA_KR&gt;
&lt;POL_COD&gt;&lt;/POL_COD&gt;
&lt;FA_KZ&gt;&lt;/FA_KZ&gt;
&lt;G_KZ&gt;&lt;/G_KZ&gt;
&lt;AFL_KZ&gt;1&lt;/AFL_KZ&gt;
&lt;/AFL&gt;
&lt;/DATA&gt;
</ROOT_TAG>
--------------------------------------------------------------

Strange, isn't it!? The sign "<" is being replaced by "&lt;" and ">"
is being replaced by "&gt;", but only for the XML coming from the
external source.

It isn't strange: you are treating the external XML not as XML but as text (as a string). Upon
output, characters with a special meaning in XML will be replaced by an entity reference (< becomes
&lt; etc.)

Does anybody know this problem or can think of a solution? Should I
use a function other than "createTextNode", or do I have to change the
special characters manually?

You'll need to parse your external piece as a (partial) XML DOM tree and insert that into your
domDocument. I don't think the standard API allows you to parse a partial XML document (i.e. without
an <?xml ...?> declaration and a root element), so probably you'll have to add the declaration and
a root element to the string representing the external piece.
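A minimal sketch of this approach, assuming the external string is already a well-formed document with a single root element (the <?xml ...?> declaration is optional in XML 1.0, so none is added here). The class name ImportExternalXml and its two helper methods are made up for illustration; DocumentBuilder.parse and Document.importNode are the standard JAXP/DOM calls.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original post.
class ImportExternalXml {

    /** Parses externalXml and appends a copy of its root element under ROOT_TAG. */
    static Document build(String externalXml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();

        // Same root element as in the original post.
        Document domDocument = builder.newDocument();
        Element domElement = domDocument.createElement("ROOT_TAG");
        domDocument.appendChild(domElement);

        // Parse the external string into its own DOM tree ...
        Document external = builder.parse(new InputSource(new StringReader(externalXml)));
        // ... and copy its root element (deep = true) into the target document.
        Node imported = domDocument.importNode(external.getDocumentElement(), true);
        domElement.appendChild(imported);
        return domDocument;
    }

    /** Serializes a document to a string without the XML declaration. */
    static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String externalXML = "<DATA><AFL><AFLNR>XX</AFLNR></AFL></DATA>";
        // The DATA element is now a real element, so its tags are not escaped.
        System.out.println(serialize(build(externalXML)));
    }
}
```

Because the external piece is inserted as element nodes rather than as a text node, the serializer has no reason to replace its angle brackets.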

If you need more info on parsing and manipulating XML/DOM, see

HTH,
Z.
 
Christian Schmidbauer

I don't want to show it within a web page! I definitely want to have
the real characters "<" and ">". How can I avoid the "&gt;"
and "&lt;" signs?

By the way, the XML is given back to the user.

Thank you,
Christian
 
Andrew Thompson

On 28 Jul 2004 07:20:26 -0700, Christian Schmidbauer wrote:

(Please do not top-post, Christian,
as I find it most confusing.
<http://www.physci.org/codes/javafaq.jsp#netiquette>)

See further replies inline..
...
I don't want to show it within a web page! I definitely want to have
the real characters "<" and ">". How can I avoid the "&gt;"
and "&lt;" signs?

By the way, the XML is given back to the user.

I suspect you will find that the conversion
back to '<' happens on *read*, so your user
will get back exactly what they expect. But
it seems you are still not getting the basic
concept that (AFAIU) these symbols cannot be
written in the XML text as a literal '<',
because that would make the XML malformed.
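To illustrate the point about reading, here is a small sketch, assuming the standard JAXP parser and serializer, that writes a text node containing markup characters and then parses the result again. EscapeRoundTrip and its write/read methods are hypothetical names chosen for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original thread.
class EscapeRoundTrip {

    /** Serializes a document that holds the given string as one text node under ROOT_TAG. */
    static String write(String text) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("ROOT_TAG");
        doc.appendChild(root);
        root.appendChild(doc.createTextNode(text));

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    /** Parses the serialized form and returns the text content of the root element. */
    static String read(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String original = "<DATA><AFLNR>XX</AFLNR></DATA>";
        String xml = write(original);
        System.out.println(xml);                        // '<' shows up as &lt; on the wire
        System.out.println(read(xml).equals(original)); // the parser undoes the escaping
    }
}
```

Note that the reader gets the original string back as text, not as a tree of DATA and AFLNR elements; a consumer that wants elements would have to parse that string a second time.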
 
Andrea Spinelli

Christian Schmidbauer wrote:
I prepare my XML document this way:
// XML from an external source as a "String"
Text data = domDocument.createTextNode (externalXML);
domElement.appendChild (data);
<ROOT_TAG>

&lt;DATA&gt;

Strange, isn't it!? The sign "<" is being replaced by "&lt;" and ">"
is being replaced by "&gt;", but only for the XML coming from the
external source.

For clarity's sake: suppose that externalXML is the string:

"I like women with weight <55kg and height>170cm"

Now DOM is shielding you from considering the text <55kg and height>
as an XML element, which in this case it definitely isn't.

DOM is right; guess who is wrong! :)

I suspect you might find help in the DocumentFragment interface,
which seems to me close to what you need.
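One way the DocumentFragment idea could look, as a sketch only: the external string is wrapped in a temporary element so that it parses even when it has several top-level elements, and the parsed children are then copied into a fragment of the target document. FragmentInsert, parseFragment, buildRoot and the "wrapper" element name are all invented for this example, and it assumes the external string carries no <?xml ...?> declaration of its own.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original thread.
class FragmentInsert {

    /** Parses a piece of XML (one or more top-level nodes) into a fragment owned by target. */
    static DocumentFragment parseFragment(Document target, String xmlFragment) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Assumption: wrapping in a throwaway element makes the piece a well-formed document.
        String wrapped = "<wrapper>" + xmlFragment + "</wrapper>";
        Document tmp = builder.parse(new InputSource(new StringReader(wrapped)));

        DocumentFragment fragment = target.createDocumentFragment();
        NodeList children = tmp.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            // importNode copies the node, so the temporary tree is left unchanged.
            fragment.appendChild(target.importNode(children.item(i), true));
        }
        return fragment;
    }

    /** Builds ROOT_TAG and appends the parsed fragment to it. */
    static Element buildRoot(String xmlFragment) throws Exception {
        Document domDocument = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element domElement = domDocument.createElement("ROOT_TAG");
        domDocument.appendChild(domElement);
        // Appending a fragment moves its children into domElement.
        domElement.appendChild(parseFragment(domDocument, xmlFragment));
        return domElement;
    }

    public static void main(String[] args) throws Exception {
        Element root = buildRoot("<AFLNR>XX</AFLNR><BENENNUNG>MY TEST</BENENNUNG>");
        System.out.println(root.getChildNodes().getLength()); // number of inserted nodes
    }
}
```

The fragment itself never appears in the output; it is only a container that lets several nodes be appended in one call.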
 
Hemal Pandya

Hello!

I prepare my XML document this way: [....]
Text data = domDocument.createTextNode (externalXML);
domElement.appendChild (data); [....]

Strange, isn't it!? The sign "<" is being replaced by "&lt;" and ">"
is being replaced by "&gt;", but only for the XML coming from the
external source.

That is not quite true. The replacement would have occurred even if the
string was a literal. Try domDocument.createTextNode("5<7");
Does anybody know this problem

This is the semantics of createTextNode. How would you otherwise
put the string "The symbol for greater-than is >" in an XML
document using this method? The element name ROOT_TAG has been put
inside "<" and ">" because it was created with createElement.
or can think of a solution? Should I use a function other than
"createTextNode"

You could use importNode, but as others have pointed out, the external
document will have to be parsed first. importNode will not work as long
as externalXML exists only as a String (not as an XML Document).
or do I have to change the special characters manually?

I can think of a kludge. Instead of inserting the variable
externalXML as a TextNode, insert some other marker there and then
later replace that marker with externalXML in the serialized output
(String.replace?). This assumes that externalXML is not going to
break the final document; otherwise the receiving side might reject
it.
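A rough sketch of that kludge, under the assumptions already stated: the external string must be well-formed on its own, must not carry its own <?xml ...?> declaration, and the marker must not occur anywhere else in the document. MarkerSplice, splice and the marker text are made-up names for illustration.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, not part of the original thread.
class MarkerSplice {

    // Assumption: a placeholder without XML special characters that never occurs in real data.
    static final String MARKER = "@@EXTERNAL_XML@@";

    /** Serializes ROOT_TAG with a placeholder text node, then splices the raw XML in. */
    static String splice(String externalXml) throws Exception {
        Document domDocument = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element domElement = domDocument.createElement("ROOT_TAG");
        domDocument.appendChild(domElement);
        domElement.appendChild(domDocument.createTextNode(MARKER));

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(domDocument), new StreamResult(out));

        // Plain string substitution: nothing checks that the result is still well-formed.
        return out.toString().replace(MARKER, externalXml);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(splice("<DATA><AFLNR>XX</AFLNR></DATA>"));
    }
}
```

Since the substitution bypasses the DOM entirely, a malformed externalXML ends up in the output unchecked, which is exactly the risk mentioned above.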
 
