ASCII control characters in CDATA section

nowhere

Hi,

I need to preserve some ASCII control characters (CR and LF) within an
XML file so I have included the data in a CDATA section. However,
when parsing it using expat, I lose the CR characters.

My question is: Should I be using a different character set (not
UTF-8) or is this a bug in expat?

TIA, Mark
 
Martin Honnen

I need to preserve some ASCII control characters (CR and LF) within an
XML file so I have included the data in a CDATA section. However,
when parsing it using expat, I lose the CR characters.

My question is: Should I be using a different character set (not
UTF-8) or is this a bug in expat?

I don't think it is a bug. In XML all line endings are normalized
(see http://www.w3.org/TR/REC-xml#sec-line-ends), so even a CDATA
section doesn't help to preserve a carriage return.
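
For anyone who wants to see the normalization happen, here is a minimal sketch using Python's xml.parsers.expat binding to expat. The helper name parse_text and the sample document are made up for illustration; the original poster's actual setup is not known.

```python
import xml.parsers.expat


def parse_text(xml_bytes):
    """Return all character data in the document as one string."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # expat may deliver text in several pieces, so collect and join them.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(xml_bytes, True)
    return "".join(chunks)


# Illustrative input: a CR-LF pair and a lone CR inside a CDATA section.
doc = b"<data><![CDATA[line1\r\nline2\rline3]]></data>"
text = parse_text(doc)

# Both the CR-LF pair and the lone CR should come back as a single LF.
print(repr(text))
```

The CDATA wrapper makes no difference here, because line-end normalization is applied to the input before the parser hands any text to the application.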
 
Richard Tobin

I need to preserve some ASCII control characters (CR and LF) within an
XML file so I have included the data in a CDATA section. However,
when parsing it using expat, I lose the CR characters.

To preserve CRs, you need to use character references (&#13;), because
CR and CR-LF are normalized to LF when an XML document is read.

You can't use character references in a CDATA section, so it's
probably better to forget about CDATA and just escape any characters
that need it. The main use for CDATA is preserving human readability
of text that includes < and & characters, such as XML examples inside
an XML document. It doesn't really work for arbitrary data.

-- Richard
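
A rough sketch of the escaping approach, again in Python with the expat binding. It assumes the writer side can use xml.sax.saxutils.escape with an extra mapping for CR; the helper name parse_text and the sample string are hypothetical, and this is only one possible way to do it.

```python
import xml.parsers.expat
from xml.sax.saxutils import escape


def parse_text(xml_bytes):
    """Return all character data in the document as one string."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # expat may deliver text in several pieces, so collect and join them.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(xml_bytes, True)
    return "".join(chunks)


# Illustrative data that contains a CR-LF pair plus characters needing escapes.
raw = "a < b & c\r\nnext line"

# escape() handles &, < and > first, then applies the extra mapping,
# so the "&" inside "&#13;" is not escaped a second time.
escaped = escape(raw, {"\r": "&#13;"})
doc = ("<data>" + escaped + "</data>").encode("utf-8")

# The character reference is expanded after line-end normalization,
# so the CR should survive the round trip.
round_tripped = parse_text(doc)
print(repr(escaped))
print(repr(round_tripped))
```

The data goes in as ordinary escaped element content rather than CDATA, since a character reference inside a CDATA section would be taken as literal text.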
 
