Pros of CDATA ?

Albert

Hi, what are the pros of using CDATA, like in 1/ over regular text, like
in 2/ ?

1/ <element><![CDATA[foo]]></element>
2/ <element>foo</element>
 
Martin Honnen

Albert said:
Hi, what are the pros of using CDATA, like in 1/ over regular text, like
in 2/ ?

1/ <element><![CDATA[foo]]></element>
2/ <element>foo</element>

In that example you do not gain anything. However, if you want to use
characters that would otherwise need to be escaped, then using a CDATA
section might make typing and reading easier, e.g.
<expression><![CDATA[a < b && b < c]]></expression>
compared to
<expression>a &lt; b &amp;&amp; b &lt; c</expression>
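
As a rough check that the two forms above carry the same content, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The variable names are made up for illustration; only the two XML strings come from the post.

```python
import xml.etree.ElementTree as ET

# The two spellings from the example above.
cdata_form = "<expression><![CDATA[a < b && b < c]]></expression>"
escaped_form = "<expression>a &lt; b &amp;&amp; b &lt; c</expression>"

# After parsing, both yield the same character data.
cdata_text = ET.fromstring(cdata_form).text
escaped_text = ET.fromstring(escaped_form).text

print(cdata_text)                  # a < b && b < c
print(cdata_text == escaped_text)  # True
```

So the choice only affects how the source file looks to a person editing it, not what an application reading it sees.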
 
Joe Kesselman

Albert said:
Hi, what are the pros of using CDATA, like in 1/ over regular text, like
in 2/ ?

1/ <element><![CDATA[foo]]></element>
2/ <element>foo</element>

CDATA Sections, which XML inherited from SGML, are a sloppy alternative
to escaping individual characters when the text contains something
that would break XML syntax -- typically the <, >, and & characters. They
exist mostly as a quick-and-dirty mechanism for cases where you're
inserting data into an XML file via text processing (e.g. cut-and-paste)
rather than by using an XML-aware tool or library, which will handle
converting those to &lt;, &gt; and &amp; for you automagically.
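
To illustrate the "XML-aware library" route, here is a small sketch with Python's xml.etree.ElementTree, chosen only as one example of such a library. The element name and text are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Build the element through the library instead of pasting text into a file.
element = ET.Element("expression")
element.text = "a < b && b > c"  # plain, unescaped text

# The serializer escapes the troublesome characters on the way out.
serialized = ET.tostring(element, encoding="unicode")
print(serialized)
# <expression>a &lt; b &amp;&amp; b &gt; c</expression>
```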

XML programs consider CDATA Sections to be semantically identical to
ordinary text, and may not preserve this distinction -- that is,
something read in as <![CDATA[]]> may be written out with the
troublesome characters individually escaped instead.
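
A quick way to see this, sketched with Python's xml.etree.ElementTree (other tools may behave differently, and some offer options to keep CDATA Sections):

```python
import xml.etree.ElementTree as ET

source = "<element><![CDATA[x < y]]></element>"

# Parse, then write back out without touching the tree.
round_tripped = ET.tostring(ET.fromstring(source), encoding="unicode")

print(round_tripped)
# <element>x &lt; y</element>
```

The CDATA Section is gone after the round trip, but the text content is unchanged.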

Note that CDATA Sections are actually somewhat fragile. If you try to
use one to contain XML markup (a bad practice, but alas not uncommon),
you'll find that if the contained XML also contains a <![CDATA[]]>, its
trailing ]]> will prematurely end the outer CDATA Section and the document
will not parse correctly. There are things you can do with exiting and
re-entering CDATA Sections to avoid that breakage, but by the time
you've done that much work it might have been easier to just
individually escape the characters in the first place.
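
For the curious, the breakage and the usual exit-and-re-enter workaround can be sketched in Python as follows. The helper name wrap_in_cdata and the sample markup are hypothetical; the trick is to split every "]]>" in the payload so that "]]" ends one CDATA Section and ">" starts the next.

```python
import xml.etree.ElementTree as ET

def wrap_in_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting any ']]>' across two adjacent sections."""
    safe = text.replace("]]>", "]]]]><![CDATA[>")
    return "<![CDATA[" + safe + "]]>"

# Markup that itself contains a CDATA Section.
inner = "<a><![CDATA[x]]></a>"

# Naive wrapping: the inner ]]> closes the outer section too early.
naive = "<doc><![CDATA[" + inner + "]]></doc>"
try:
    ET.fromstring(naive)
    naive_parses = True
except ET.ParseError:
    naive_parses = False

# Workaround: exit and re-enter the CDATA Section around each ]]>.
fixed = "<doc>" + wrap_in_cdata(inner) + "</doc>"
recovered = ET.fromstring(fixed).text

print(naive_parses)        # False
print(recovered == inner)  # True
```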

The only other argument in favor of <![CDATA[]]> is that, if the
contained text contains a HUGE number of <>& characters, the
<![CDATA[]]> syntax may be somewhat more compact. However, that
advantage vanishes if the XML document is transmitted in compressed form.
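
A rough way to check the size argument on your own data, sketched in Python with zlib. The sample text here is artificial and highly repetitive, so it compresses far better than real content would; treat the numbers as an illustration only.

```python
import zlib
from xml.sax.saxutils import escape

# Artificial payload with many < and & characters (an assumption for the demo).
text = "a < b && b < c; " * 200

escaped = escape(text)                # &lt; and &amp; spelled out
cdata = "<![CDATA[" + text + "]]>"    # raw text inside one CDATA Section

# Size difference before compression.
raw_gap = len(escaped) - len(cdata)

# Size difference after compressing both forms.
compressed_gap = (len(zlib.compress(escaped.encode("utf-8")))
                  - len(zlib.compress(cdata.encode("utf-8"))))

print("uncompressed difference:", raw_gap)
print("compressed difference:", compressed_gap)
```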

Summary: Don't use <![CDATA[]]> unless you really need it -- and if you
think you need it you're probably wrong.
 
Richard Tobin

Joe Kesselman said:
It
exists mostly as a quick-and-dirty mechanism for cases where you're
inserting data into an XML file via text processing (eg cut-and-paste)
rather than by using an XML-aware tool or library which will handle
converting those to &lt;, &gt; and &amp; for you automagically. [...]
The only other argument in favor of <![CDATA[]]> is that, if the
contained text contains a HUGE number of <>& characters, the
<![CDATA[]]> syntax may be somewhat more compact.

I think you are missing the most justifiable use of CDATA sections:
for human readability, which was one of the original goals of XML. If
your XML is "data" that is produced and consumed only by programs,
then there's no point in CDATA. But if your document is text that is
intended to be readable and editable by humans, then a CDATA section
can be very useful. Consider a document that contains an example
of some XML or HTML.

(Of course, sooner or later you may want an example of XML containing
a CDATA section, but that's comparatively rare.)

-- Richard
 
Joe Kesselman

Richard said:
I think you are missing the most justifiable use of CDATA sections:
for human readability, which was one of the original goals of XML.

Valid point... but realistically, that's much less of an issue than it
once was. There's a lot less hand-constructed/hand-maintained XML than
there was in its early days, since tooling has been improving over the
years. And those who are working with hand-maintained files have gotten
very used to reading "&amp;" and mentally translating it to "&".

I admit I was exaggerating a bit for effect... but in general, I still
believe that "if you do not know you need a CDATA Section, you know that
you do not need a CDATA Section."
 
Peter Flynn

Joe said:
Valid point... but realistically, that's much less of an issue than it
once was. There's a lot less hand-constructed/hand-maintained XML than
there was in its early days, since tooling has been improving over the
years. And those who are working with hand-maintained files have gotten
very used to reading "&amp;" and mentally translating it to "&".

I admit I was exaggerating a bit for effect... but in general, I still
believe that "if you do not know you need a CDATA Section, you know that
you do not need a CDATA Section."

That about sums it up (and I'd like to quote you in the FAQ on that :)
Apart from documentation authors who need to embed chunks of markup code
(an important use case), the "need" for CDATA usually means the
suppliers of the data have not done their homework.

///Peter
 
