Strip CDATA with regex

Balaras

Hi,

Can somebody here please help me a bit with a regular expression?
I have an XML file where I need to strip the CDATA sections of any
contained data.

E.g.
<xml>
<tag><[CDATA[ some data ]]></tag>
<tag><[CDATA[ some more data ]]></tag>
</xml>

Should end up like this:
<xml>
<tag><[CDATA[]]></tag>
<tag><[CDATA[]]></tag>
</xml>

Now, I have the start and end of the range
(\[CDATA\[)
and
(\]\]>)

But I cannot figure out how to match any run of characters that does
not contain the end of the range.

That is, > is OK and ] is OK,
but ]]> is not OK.

Thanks in advance,
Balaras
 
Martin Honnen

Balaras wrote:

Can somebody here please help me a bit with a regular expression?
I have an XML file where I need to strip the CDATA sections of any
contained data.

Eg.
<xml>
<tag><[CDATA[ some data ]]></tag>
(It should be <![CDATA[ rather than <[CDATA[.)
<tag><[CDATA[ some more data ]]></tag>
</xml>

Should end up like this:
<xml>
<tag><[CDATA[]]></tag>
<tag><[CDATA[]]></tag>
</xml>

How about parsing the XML into a DOM document, manipulating those
CDATA section nodes, and serializing it back? A Mozilla example:

var xmlMarkup = [
  '<xml>',
  '<tag><![CDATA[ some data ]]></tag>',
  '<tag><![CDATA[ some more data ]]></tag>',
  '</xml>'
].join('\r\n');

var xmlDocument = new DOMParser().parseFromString(xmlMarkup, 'application/xml');

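var tagElements = xmlDocument.getElementsByTagName('tag');
for (var i = 0; i < tagElements.length; i++) {
  // Index into the collection; the collection itself has no firstChild.
  var cdataSection = tagElements[i].firstChild;
  // nodeType 4 is CDATA_SECTION_NODE; firstChild is null for an empty element.
  if (cdataSection && cdataSection.nodeType == 4) {
    cdataSection.data = '';
  }
}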

var newXmlMarkup = new XMLSerializer().serializeToString(xmlDocument);

That yields

<xml>
<tag><![CDATA[]]></tag>
<tag><![CDATA[]]></tag>
</xml>
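
If a regular expression is still preferred, as in the original question, one common way to express "anything up to, but not past, the first ]]>" is a lazy quantifier. The following is only a sketch: stripCdata is a made-up helper name, and a regex cannot tell a real CDATA section from the same text inside a comment, so it is less robust than the DOM approach. The same pattern syntax should carry over to other PCRE-style engines.

```javascript
// [\s\S]*? matches any character, including newlines, but as few as possible,
// so the match ends at the first ]]> instead of running on to the last one.
var CDATA_SECTION = /<!\[CDATA\[[\s\S]*?\]\]>/g;

// stripCdata is a hypothetical helper name, not part of any library.
function stripCdata(xmlMarkup) {
  return xmlMarkup.replace(CDATA_SECTION, '<![CDATA[]]>');
}

// A single ], a single > and a ]] not followed by > are all fine inside a section.
var input = '<tag><![CDATA[ a ] b > c ]] d ]]></tag><tag><![CDATA[ more ]]></tag>';
var output = stripCdata(input);
// output is '<tag><![CDATA[]]></tag><tag><![CDATA[]]></tag>'
```

With a greedy [\s\S]* the first match would swallow everything up to the last ]]> in the file, which is why the lazy form is used here.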
 
Balaras

Thanks Martin,

Actually I posted this to c.l.javascript by accident; it was meant for a
PHP group. I have to do some preprocessing before the XML is sent to the
client.

However your post helped me in another manner :)
var newXmlMarkup = new XMLSerializer().serializeToString(xmlDocument);

I did not know about the XMLSerializer, and I need it :)

Does IE have an equivalent, or does .innerHTML return valid XML?

/Balaras
 
Martin Honnen

Balaras said:
I did not know about the XMLSerializer, and I need it :)

Does IE have an equivalent, or does .innerHTML return valid XML?

With IE, an XML DOM document (or any XML DOM node) has a property named
xml that gives you the serialized markup, so with IE/MSXML you can use
xmlDocument.xml
to get the markup.
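
The two serialization routes can be wrapped in one function. This is a minimal sketch, assuming the node is either an MSXML node exposing a string-valued xml property or a DOM node in a browser that provides XMLSerializer; serializeXml is a hypothetical name chosen for illustration.

```javascript
// serializeXml is a hypothetical helper name used for illustration only.
function serializeXml(node) {
  // IE/MSXML nodes expose the serialized markup as a string property named xml.
  if (node && typeof node.xml === 'string') {
    return node.xml;
  }
  // Mozilla and other browsers provide XMLSerializer.
  if (typeof XMLSerializer !== 'undefined') {
    return new XMLSerializer().serializeToString(node);
  }
  throw new Error('No XML serializer available for this node');
}
```

The xml property is checked first because a node created by MSXML is not guaranteed to be accepted by XMLSerializer, even in a browser that has one.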
 
