xml.dom.minidom weirdness: bug?

JYA · Apr 30, 2008

Hi.

I was writing an xmltv parser in Python when I ran into some weirdness
that I can't explain.

What I'm doing is reading an XML file, creating another DOM object, and
copying elements from one to the other.

At no time do I ever modify the original DOM object, yet it gets modified.

Unless I missed something, it sounds like a bug to me.

the xml file is simply:
<?xml version="1.0" encoding="utf-8"?>
<tv><channel id="id1"><display-name lang="en">full
name</display-name></channel></tv>

which I store under the name test.xmltv

Here is the code. I've removed everything that isn't relevant to my
description; I can't make it any simpler, I'm afraid:

from xml.dom.minidom import Document
import xml.dom.minidom

def adjusttimezone(docxml, timezone):
    doc = Document()

    # Create the <tv> base element
    tv_xml = doc.createElement("tv")
    doc.appendChild(tv_xml)

    # Create the channel list
    channellist = docxml.getElementsByTagName('channel')

    for x in channellist:
        # Copy the original attributes
        elem = doc.createElement("channel")
        for y in x.attributes.keys():
            name = x.attributes[y].name
            value = x.attributes[y].value
            elem.setAttribute(name, value)
        for y in x.getElementsByTagName('display-name'):
            elem.appendChild(y)
        tv_xml.appendChild(elem)

    return doc

if __name__ == '__main__':
    handle = open('test.xmltv', 'r')
    docxml = xml.dom.minidom.parse(handle)
    print 'step1'
    print docxml.toprettyxml(indent=" ", encoding="utf-8")
    doc = adjusttimezone(docxml, 1000)
    print 'step2'
    print docxml.toprettyxml(indent=" ", encoding="utf-8")

Now at "step 1" I display the content of the DOM object. Quite
naturally, it shows:
<?xml version="1.0" encoding="utf-8"?>
<tv>
 <channel id="id1">
  <display-name lang="en">
   full name
  </display-name>
 </channel>
</tv>

After the call to adjusttimezone, however, "step 2" shows:
<?xml version="1.0" encoding="utf-8"?>
<tv>
 <channel id="id1"/>
</tv>

That's it !

You'll note that at no time do I modify the content of docxml, yet it
gets modified.

The weirdness disappears if I change the line
channellist = docxml.getElementsByTagName('channel')
to
channellist = copy.deepcopy(docxml.getElementsByTagName('channel'))

However, my understanding is that it shouldn't be necessary.

Any thoughts on this weirdness ?

Thanks
Jean-Yves

Gabriel Genellina · Apr 30, 2008

JYA said:
What I'm doing, is read an xml file, create another dom object and copy
the element from one to the other.

At no time do I ever modify the original dom object, yet it gets
modified.

    for y in x.getElementsByTagName('display-name'):
        elem.appendChild(y)
    tv_xml.appendChild(elem)

You'll note that at no time do I modify the content of docxml, yet it
gets modified.

The weirdness disappears if I change the line
channellist = docxml.getElementsByTagName('channel')
to
channellist = copy.deepcopy(docxml.getElementsByTagName('channel'))

However, my understanding is that it shouldn't be necessary.

I think that any element can have only a single parent. If you take an
element from one document and insert it into another document, it gets
removed from the first: appendChild moves the node, it does not copy it.
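
To make that concrete, here is a minimal sketch that reproduces the effect on an inline copy of the test file. The XML string and the variable names are made up for illustration; only the minidom calls come from the code above. It checks the node's parentNode before and after appendChild.

```python
from xml.dom.minidom import Document, parseString

# Inline stand-in for test.xmltv (assumption: same structure as the posted file).
src = parseString(
    '<tv><channel id="id1">'
    '<display-name lang="en">full name</display-name>'
    '</channel></tv>'
)
channel = src.getElementsByTagName("channel")[0]
name = channel.getElementsByTagName("display-name")[0]

# A second, independent document with its own <channel> element.
dst = Document()
new_channel = dst.createElement("channel")
dst.appendChild(new_channel)

parent_before = name.parentNode   # the <channel> in the source document
new_channel.appendChild(name)     # detaches the node from its old parent first
parent_after = name.parentNode    # the <channel> in the new document

moved = parent_before is channel and parent_after is new_channel
remaining = len(channel.getElementsByTagName("display-name"))

print(moved)            # True
print(remaining)        # 0
print(channel.toxml())  # <channel id="id1"/>
```

So the source document is modified as a side effect of appendChild, which matches the "step 2" output in the original post.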

Marc Christiansen · Apr 30, 2008

JYA said:
    for y in x.getElementsByTagName('display-name'):
        elem.appendChild(y)

As Gabriel wrote, nodes can only have one parent. Use
elem.appendChild(y.cloneNode(True))
instead. Or use y.cloneNode(False) if you want a shallow copy (i.e. without
any of the children, such as the text content).
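
For reference, a sketch of the copy loop with that change applied. The function name copy_channels and the inline XML are hypothetical; the timezone argument is dropped because the posted code never uses it.

```python
from xml.dom.minidom import Document, parseString

def copy_channels(docxml):
    """Build a new <tv> document holding copies of every <channel>."""
    doc = Document()
    tv_xml = doc.createElement("tv")
    doc.appendChild(tv_xml)
    for channel in docxml.getElementsByTagName("channel"):
        elem = doc.createElement("channel")
        # Copy the original attributes.
        for attr_name, attr_value in channel.attributes.items():
            elem.setAttribute(attr_name, attr_value)
        for display_name in channel.getElementsByTagName("display-name"):
            # Deep clone: the copy carries the text child, the original stays put.
            elem.appendChild(display_name.cloneNode(True))
        tv_xml.appendChild(elem)
    return doc

# Inline stand-in for test.xmltv (assumption: same structure as the posted file).
source = parseString(
    '<tv><channel id="id1">'
    '<display-name lang="en">full name</display-name>'
    '</channel></tv>'
)
before = source.documentElement.toxml()
result = copy_channels(source)
after = source.documentElement.toxml()

unchanged = before == after
result_xml = result.documentElement.toxml()

# A shallow clone keeps the attributes but drops the text child.
first_name = source.getElementsByTagName("display-name")[0]
shallow_xml = first_name.cloneNode(False).toxml()

print(unchanged)    # True
print(result_xml)
print(shallow_xml)  # <display-name lang="en"/>
```

The DOM also defines Document.importNode(node, deep), which minidom implements; it makes the same kind of copy but sets the new document as the owner, so doc.importNode(display_name, True) should work in place of the cloneNode call.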

Marc
