Albert said:
Hi, what are the pros of using CDATA, like in 1/ over regular text, like
in 2/ ?
1/ <element><![CDATA[foo]]></element>
2/ <element>foo</element>
CDATA Sections, which XML inherited from SGML, are a sloppy alternative
to escaping individual characters when the text contains something
that would break XML syntax -- typically the <, >, and & characters. They
exist mostly as a quick-and-dirty mechanism for cases where you're
inserting data into an XML file via text processing (e.g. cut-and-paste)
rather than by using an XML-aware tool or library which will handle
converting those to &lt;, &gt; and &amp; for you automagically.
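To make the "XML-aware tool or library" route concrete, here is a minimal sketch using Python's standard library. The sample string and variable names are only illustrative assumptions; the point is that neither route requires a CDATA Section.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Illustrative text containing all three troublesome characters.
raw = "if (a < b && b > c) return;"

# Manual route: escape the characters yourself before pasting into a file.
escaped = escape(raw)

# Library route: hand over the raw string and let the serializer escape it.
element = ET.Element("element")
element.text = raw
serialized = ET.tostring(element, encoding="unicode")

print(escaped)     # if (a &lt; b &amp;&amp; b &gt; c) return;
print(serialized)  # <element>if (a &lt; b &amp;&amp; b &gt; c) return;</element>
```

Reading the serialized form back with an XML parser gives you the original string again, unescaped.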
XML programs consider CDATA Sections to be semantically identical to
ordinary text, and may not preserve this distinction -- that is,
something read in as <![CDATA[]]> may be written out with the
troublesome characters individually escaped instead.
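A quick way to see this for yourself, sketched with Python's xml.etree.ElementTree (just one example of an XML library; others may behave differently in detail, and some offer options to keep CDATA markers):

```python
import xml.etree.ElementTree as ET

# The same content, written once as a CDATA Section and once with escapes.
with_cdata = ET.fromstring("<element><![CDATA[a < b & c]]></element>")
with_escapes = ET.fromstring("<element>a &lt; b &amp; c</element>")

# After parsing, the application sees identical text in both cases.
same_text = with_cdata.text == with_escapes.text

# Writing the CDATA version back out yields escaped characters instead.
rewritten = ET.tostring(with_cdata, encoding="unicode")

print(same_text)  # True
print(rewritten)  # <element>a &lt; b &amp; c</element>
```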
Note that CDATA Sections are actually somewhat fragile. If you try to
use one to contain XML markup (a bad practice, but alas not uncommon),
you'll find that if the contained XML itself contains a <![CDATA[]]>,
the inner ]]> will prematurely close the outer CDATA Section and the document
will not parse correctly. There are things you can do with exiting and
re-entering CDATA Sections to avoid that breakage, but by the time
you've done that much work it might have been easier to just
individually escape the characters in the first place.
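The breakage and the usual exit-and-re-enter workaround can be sketched as follows. The helper name wrap_in_cdata is hypothetical, not part of any library; it splits every embedded "]]>" so that the "]]" ends one section and the ">" starts the next.

```python
import xml.etree.ElementTree as ET

def wrap_in_cdata(text):
    """Hypothetical helper: wrap text in CDATA, splitting any embedded "]]>"."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"

# Markup that itself contains a CDATA Section.
inner = "<note><![CDATA[x < y]]></note>"

# Naive wrapping: the inner "]]>" closes the outer section too early.
naive = "<element><![CDATA[" + inner + "]]></element>"
try:
    ET.fromstring(naive)
    naive_parses = True
except ET.ParseError:
    naive_parses = False

# Split wrapping: parses, and the original markup comes back as plain text.
safe = "<element>" + wrap_in_cdata(inner) + "</element>"
recovered = ET.fromstring(safe).text

print(naive_parses)        # False
print(recovered == inner)  # True
```

Compare that with simply escaping the characters, which needs no special case at all.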
The only other argument in favor of <![CDATA[]]> is that, if the
contained text contains a HUGE number of <>& characters, the
<![CDATA[]]> syntax may be somewhat more compact. However, that
advantage vanishes if the XML document is transmitted in compressed form.
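If you want to check the size argument on your own data, something like the sketch below would do. The input here is a synthetic, deliberately extreme string (an assumption for illustration only), so it compresses far better than real text would; zlib stands in for whatever compression your transport uses.

```python
import zlib
from xml.sax.saxutils import escape

# Synthetic worst case: nothing but characters that need escaping.
text = "<&>" * 1000

as_cdata = ("<element><![CDATA[" + text + "]]></element>").encode("utf-8")
as_escaped = ("<element>" + escape(text) + "</element>").encode("utf-8")

# Size difference before and after compression.
raw_gap = len(as_escaped) - len(as_cdata)
compressed_gap = len(zlib.compress(as_escaped)) - len(zlib.compress(as_cdata))

print("uncompressed gap in bytes:", raw_gap)
print("compressed gap in bytes:", compressed_gap)
```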
Summary: Don't use <![CDATA[]]> unless you really need it -- and if you
think you need it you're probably wrong.