sim.sim
Hi all.
I'm having trouble with minidom:
# I have an XML string with a CDATA section, and the section contains "\r\n":
iInStr = '<?xml version="1.0"?>\n<Data><![CDATA[BEGIN:VCALENDAR\r\nEND:VCALENDAR\r\n]]></Data>\n'
# After I create the DOM object, the value of "Data" comes back with every "\r\n" turned into "\n"
from xml.dom import minidom
iDoc = minidom.parseString(iInStr)
iDoc.childNodes[0].childNodes[0].data # it gives u'BEGIN:VCALENDAR\nEND:VCALENDAR\n'
According to http://www.w3.org/TR/REC-xml/#sec-line-ends
this looks normal, but another part of the specification says that "only
the CDEnd string is recognized as markup": http://www.w3.org/TR/REC-xml/#sec-cdata-sect
so the parser should (IMHO) return the content of the CDATA section "as is"
(read that way, the two parts of the specification do not contradict each other).
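A minimal sketch of the behaviour in question, written for Python 3 (so the results are plain str rather than u'...'). The helper name element_text and the second test string are made up for this illustration. As far as I can tell, line-end normalization is applied to the raw input before parsing, so it also affects CDATA content, while a carriage return written as the character reference &#13; in ordinary text (outside CDATA) is kept:

```python
from xml.dom import minidom


def element_text(xml_string):
    """Return the concatenated character data of the document element.

    Hypothetical helper for this illustration: it joins the data of all
    child nodes, so it works for both Text and CDATASection children.
    """
    doc = minidom.parseString(xml_string)
    return ''.join(node.data for node in doc.documentElement.childNodes)


# Literal "\r\n" inside a CDATA section (the case from this post).
cdata_xml = ('<?xml version="1.0"?>\n'
             '<Data><![CDATA[BEGIN:VCALENDAR\r\nEND:VCALENDAR\r\n]]></Data>\n')

# Assumption for comparison: the same content as ordinary text, with each
# carriage return written as the character reference &#13;.
charref_xml = ('<?xml version="1.0"?>\n'
               '<Data>BEGIN:VCALENDAR&#13;\nEND:VCALENDAR&#13;\n</Data>\n')

cdata_value = element_text(cdata_xml)      # "\r\n" has become "\n"
charref_value = element_text(charref_xml)  # "\r" is preserved

print(repr(cdata_value))
print(repr(charref_value))
```

Character references are not recognized inside a CDATA section, so the second form only works when the content is written as ordinary escaped text.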
How can I get the content of the CDATA section with all characters
preserved? (Perhaps with another parser; if so, which one?)
Many thanks for any help.