Pros of CDATA ?

Albert · Apr 17, 2009

Hi, what are the pros of using CDATA, like in 1/ over regular text, like
in 2/ ?

1/ <element><![CDATA[foo]]></element>
2/ <element>foo</element>

Martin Honnen · Apr 17, 2009

Albert said:
Hi, what are the pros of using CDATA, like in 1/ over regular text, like
in 2/ ?

1/ <element><![CDATA[foo]]></element>
2/ <element>foo</element>

In that example you do not gain anything. However, if you want to use
characters that would otherwise need to be escaped, then using a CDATA
section might make typing and reading easier, e.g.
<expression><![CDATA[a < b && b < c]]></expression>
compared to
<expression>a &lt; b &amp;&amp; b &lt; c</expression>
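
As a quick check that the two spellings carry the same text, here is a minimal sketch using Python's standard xml.etree.ElementTree (the choice of Python is mine, not part of the question). It parses the CDATA form and the entity-escaped form and compares what the parser hands back.

```python
import xml.etree.ElementTree as ET

# The same expression written two ways: a CDATA section vs. escaped entities.
cdata_form = "<expression><![CDATA[a < b && b < c]]></expression>"
escaped_form = "<expression>a &lt; b &amp;&amp; b &lt; c</expression>"

# The parser reports plain character data in both cases.
cdata_text = ET.fromstring(cdata_form).text
escaped_text = ET.fromstring(escaped_form).text

print(cdata_text)
print(cdata_text == escaped_text)
```

So the difference is only in how the source file reads to a human, not in what an application sees.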

Joe Kesselman · Apr 17, 2009

Albert said:
Hi, what are the pros of using CDATA, like in 1/ over regular text, like
in 2/ ?

1/ <element><![CDATA[foo]]></element>
2/ <element>foo</element>

CDATA Sections, which XML inherited from SGML, are a sloppy alternative
for escaping individual characters, when the text contains something
that would break XML syntax -- typically the <, >, and & characters. It
exists mostly as a quick-and-dirty mechanism for cases where you're
inserting data into an XML file via text processing (eg cut-and-paste)
rather than by using an XML-aware tool or library which will handle
converting those to &lt;, &gt; and &amp; for you automagically.
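
For illustration, here is a rough sketch of what "XML-aware library" means in practice, using Python's standard xml.etree.ElementTree as one example of such a library. The element name and its content are made up for the demonstration.

```python
import xml.etree.ElementTree as ET

# Hypothetical element name and content, chosen only for illustration.
element = ET.Element("expression")
element.text = "a < b && b > c"  # plain text, nothing escaped by hand

# The serializer escapes the markup-significant characters on output.
serialized = ET.tostring(element, encoding="unicode")
print(serialized)
```

The text is assigned unescaped, and the escaping happens when the document is written out.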

XML programs consider CDATA Sections to be semantically identical to
ordinary text, and may not preserve this distinction -- that is,
something read in as <![CDATA[]]> may be written out with the
troublesome characters individually escaped instead.
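
A small sketch of that loss of distinction, assuming Python's standard xml.etree.ElementTree as the tool (other libraries may behave differently, and some offer options to keep CDATA sections). The input is read with a CDATA section and written back out without one.

```python
import xml.etree.ElementTree as ET

source = "<expression><![CDATA[a < b && b < c]]></expression>"

# Parse, then serialize again without touching the content.
root = ET.fromstring(source)
round_tripped = ET.tostring(root, encoding="unicode")

# The CDATA wrapper is gone; the characters are escaped individually.
print(round_tripped)
```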

Note that CDATA Sections are actually somewhat fragile. If you try to
use one to contain XML markup (a bad practice, but alas not uncommon),
you'll find that if the contained XML also contains a <![CDATA[]]> its
trailing ]]> will prematurely exit the CDATA Section and the document
will not parse correctly. There are things you can do with exiting and
re-entering CDATA Sections to avoid that breakage, but by the time
you've done that much work it might have been easier to just
individually escape the characters in the first place.
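
The exit-and-re-enter trick can be sketched as follows. This is only an illustration in Python; wrap_in_cdata is a hypothetical helper name, and the sample markup is invented. Each embedded ]]> is split so that ]] closes one section and > opens the next.

```python
import xml.etree.ElementTree as ET

def wrap_in_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting any embedded ']]>' across two sections."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"

# Invented sample: markup that itself contains a CDATA section.
inner = "<note><![CDATA[x < y]]></note>"

# Naive wrapping: the inner ]]> ends the outer section too early.
naive = "<example><![CDATA[" + inner + "]]></example>"
try:
    ET.fromstring(naive)
    naive_parsed = True
except ET.ParseError:
    naive_parsed = False

# Split wrapping: the parser joins the adjacent sections back together.
safe = "<example>" + wrap_in_cdata(inner) + "</example>"
recovered = ET.fromstring(safe).text

print(naive_parsed)
print(recovered == inner)
```

It works, but the result is hardly more readable than escaped text, which is the point made above.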

The only other argument in favor of <![CDATA[]]> is that, if the
contained text contains a HUGE number of <>& characters, the
<![CDATA[]]> syntax may be somewhat more compact. However, that
advantage vanishes if the XML document is transmitted in compressed form.
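
A rough way to check the size argument yourself, sketched in Python with the standard zlib module. The sample text is invented and very repetitive, which flatters the compressor, so treat the numbers as an illustration rather than a measurement of real documents.

```python
import zlib

# Invented sample text that is dense in <, > and & characters.
text = "if (a < b && b > c) " * 200

cdata_form = "<![CDATA[" + text + "]]>"
escaped_form = (
    text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
)

# Sizes before compression.
raw_cdata = len(cdata_form.encode("utf-8"))
raw_escaped = len(escaped_form.encode("utf-8"))

# Sizes after compression.
zipped_cdata = len(zlib.compress(cdata_form.encode("utf-8")))
zipped_escaped = len(zlib.compress(escaped_form.encode("utf-8")))

print("raw:       ", raw_cdata, "vs", raw_escaped)
print("compressed:", zipped_cdata, "vs", zipped_escaped)
```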

Summary: Don't use <![CDATA[]]> unless you really need it -- and if you
think you need it you're probably wrong.

Richard Tobin · Apr 17, 2009

Joe Kesselman said:
It
exists mostly as a quick-and-dirty mechanism for cases where you're
inserting data into an XML file via text processing (eg cut-and-paste)
rather than by using an XML-aware tool or library which will handle
converting those to &lt;, &gt; and &amp; for you automagically. [...]
The only other argument in favor of <![CDATA[]]> is that, if the
contained text contains a HUGE number of <>& characters, the
<![CDATA[]]> syntax may be somewhat more compact.

I think you are missing the most justifiable use of CDATA sections:
for human readability, which was one of the original goals of XML. If
your XML is "data" that is produced and consumed only by programs,
then there's no point in CDATA. But if your document is text that is
intended to be readable and editable by humans, then a CDATA section
can be very useful. Consider a document that contains an example
of some XML or HTML.

(Of course, sooner or later you may want an example of XML containing
a CDATA section, but that's comparatively rare.)

-- Richard

Joe Kesselman · Apr 18, 2009

Richard said:
I think you are missing the most justifiable use of CDATA sections:
for human readability, which was one of the original goals of XML.

Valid point... but realistically, that's much less of an issue than it
once was. There's a lot less hand-constructed/hand-maintained XML than
there was in its early days, since tooling has been improving over the
years. And those who are working with hand-maintained files have gotten
very used to reading "&amp;" and mentally translating it to "&".

I admit I was exaggerating a bit for effect... but in general, I still
believe that "if you do not know you need a CDATA Section, you know that
you do not need a CDATA Section."

Peter Flynn · Apr 26, 2009

Joe said:
Valid point... but realistically, that's much less of an issue than it
once was. There's a lot less hand-constructed/hand-maintained XML than
there was in its early days, since tooling has been improving over the
years. And those who are working with hand-maintained files have gotten
very used to reading "&amp;" and mentally translating it to "&".

I admit I was exaggerating a bit for effect... but in general, I still
believe that "if you do not know you need a CDATA Section, you know that
you do not need a CDATA Section."

That about sums it up (and I'd like to quote you in the FAQ on that).

Apart from documentation authors who need to embed chunks of markup code
(an important use case), the "need" for CDATA usually means the
suppliers of the data have not done their homework.

///Peter
