ElementTree and xsi to xmlns conversion?

Matthew Thorley

Why does ElementTree.parse convert my xsi to an xmlns?

When I do this
from elementtree import ElementTree

# Sample xml
mgac ="""
<mgac xmlns="http://www.chpc.utah.edu/~baites/mgacML"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.chpc.utah.edu/~baites/mgacML
http://www.chpc.utah.edu/~baites/mgacML/mgac.xsd"><cluster
name="Si4H"></cluster></mgac>
"""

xml = ElementTree.fromstring(mgac)
ElementTree.tostring(xml)

I get this
'<ns0:mgac ns1:schemaLocation="http://www.chpc.utah.edu/~baites/mgacML
http://www.chpc.utah.edu/~baites/mgacML/mgac.xsd"
xmlns:ns0="http://www.chpc.utah.edu/~baites/mgacML"
xmlns:ns1="http://www.w3.org/2001/XMLSchema-instance"><ns0:cluster
name="Si4H" /></ns0:mgac>'


The xsi is gone and has been replaced by a new xmlns, which is also NOT
inherited by the child elements.

ElementTree.tostring(xml.getchildren()[0])

'<ns0:cluster name="Si4H"
xmlns:ns0="http://www.chpc.utah.edu/~baites/mgacML" />'

If someone could please explain where I'm off, I'd really appreciate it.
I need to use xsi: to validate the document, and I'm not sure how to
pass it on to the children when I reformat the doc.

Thanks
-Matthew
 
Fredrik Lundh

Matthew said:
> Why does ElementTree.parse convert my xsi to an xmlns?

because it is a namespace prefix, perhaps?

> When I do this
> from elementtree import ElementTree
>
> # Sample xml
> mgac ="""
> <mgac xmlns="http://www.chpc.utah.edu/~baites/mgacML"
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> xsi:schemaLocation="http://www.chpc.utah.edu/~baites/mgacML
> http://www.chpc.utah.edu/~baites/mgacML/mgac.xsd"><cluster
> name="Si4H"></cluster></mgac>
> """
>
> xml = ElementTree.fromstring(mgac)
> ElementTree.tostring(xml)
>
> I get this
> '<ns0:mgac ns1:schemaLocation="http://www.chpc.utah.edu/~baites/mgacML
> http://www.chpc.utah.edu/~baites/mgacML/mgac.xsd"
> xmlns:ns0="http://www.chpc.utah.edu/~baites/mgacML"
> xmlns:ns1="http://www.w3.org/2001/XMLSchema-instance"><ns0:cluster
> name="Si4H" /></ns0:mgac>'
>
> The xsi is gone and has been replaced by a new xmlns, which is also NOT
> inherited by the child elements.

the xsi is a namespace prefix, which maps to a namespace URI. the child element
doesn't use that namespace, so there's no need to add a namespace declaration.
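
To illustrate the point, here is a minimal sketch. It assumes the standard-library xml.etree.ElementTree module (the successor of the standalone elementtree package used in this thread) and a one-line version of the sample document; the constant names are only for illustration. After parsing, the tree stores universal names of the form {URI}local, and the prefix is not part of them:

```python
import xml.etree.ElementTree as ET

# Namespace URIs taken from the sample document in this thread.
MGAC_NS = "http://www.chpc.utah.edu/~baites/mgacML"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

doc = (
    '<mgac xmlns="%s" xmlns:xsi="%s" '
    'xsi:schemaLocation="%s %s/mgac.xsd">'
    '<cluster name="Si4H"></cluster></mgac>'
    % (MGAC_NS, XSI_NS, MGAC_NS, MGAC_NS)
)

root = ET.fromstring(doc)
cluster = root[0]

# Tags and attribute keys are stored as "{namespace-uri}localname".
print(root.tag)
print(sorted(root.attrib))

# The child lives in the mgac namespace, but nothing on it uses the
# XMLSchema-instance namespace, so serializing it alone needs only one
# namespace declaration.
print(cluster.tag)
print(sorted(cluster.attrib))
```

Only the root carries an attribute in the XMLSchema-instance namespace, which is why only the root needs that declaration.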

> ElementTree.tostring(xml.getchildren()[0])
>
> '<ns0:cluster name="Si4H"
> xmlns:ns0="http://www.chpc.utah.edu/~baites/mgacML" />'
>
> If some one could please explain where I'm off I'd really appreciate it.
> I need to use xsi: to validate the document

are you sure? the prefix shouldn't matter; it's the namespace URI that's important.
if you're writing code that depends on the namespace prefix rather than the
namespace URI, you're not using namespaces correctly. when it comes to namespaces,
elementtree forces you to do things the right way:

http://www.jclark.com/xml/xmlns.htm

(unfortunately, the XML schema authors didn't understand namespaces, so they
messed things up:

http://www.w3.org/2001/tag/doc/qnameids-2002-04-30

to work around this, see oren's message about how to control the namespace/prefix
mapping. in the worst case, you can manually insert xsi:-attributes in the tree, and
rely on the fact that the default writer only modifies universal names)

</F>
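
One way to control the namespace/prefix mapping, sketched under the assumption that the standard-library xml.etree.ElementTree module is used instead of the standalone elementtree package from this thread: register_namespace() tells the serializer which prefix to write for a given URI. The "mgac" prefix below is a choice made for this example, not something from the original document, and the registration is global for the process:

```python
import xml.etree.ElementTree as ET

MGAC_NS = "http://www.chpc.utah.edu/~baites/mgacML"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Global mapping used by the serializer; it affects every later
# tostring()/write() call in this process.
ET.register_namespace("xsi", XSI_NS)
ET.register_namespace("mgac", MGAC_NS)  # prefix chosen for this sketch

doc = (
    '<mgac xmlns="%s" xmlns:xsi="%s" '
    'xsi:schemaLocation="%s %s/mgac.xsd">'
    '<cluster name="Si4H"></cluster></mgac>'
    % (MGAC_NS, XSI_NS, MGAC_NS, MGAC_NS)
)

root = ET.fromstring(doc)
out = ET.tostring(root, encoding="unicode")

# The attribute is written with the registered xsi prefix rather than
# an auto-generated ns0/ns1 prefix.
print(out)
```

A validator that handles namespaces correctly should treat this output the same as the original input, since the namespace URIs are unchanged.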
 
Harry George

[snip]
> are you sure? the prefix shouldn't matter; it's the namespace URI that's important.
> if you're writing code that depends on the namespace prefix rather than the
> namespace URI, you're not using namespaces correctly. when it comes to namespaces,
> elementtree forces you to do things the right way:
>
> http://www.jclark.com/xml/xmlns.htm
>
> (unfortunately, the XML schema authors didn't understand namespaces so they
> messed things up:
> http://www.w3.org/2001/tag/doc/qnameids-2002-04-30
> to work around this, see oren's message about how to control the namespace/prefix
> mapping. in worst case, you can manually insert xsi:-attributes in the tree, and rely on
> the fact that the default writer only modifies universal names)
>
> </F>

First, thanks for ElementTree and cElementTree. Second, I've read the
docs and see a lot of examples for building trees, but not a lot for
traversing parsed trees. Questions:

1. Is there a good idiom for namespaces? I'm currently doing things like:

UML='{href://org.omg/UML/1.3}'
.....
packages=ns2.findall(UML+'Package')

2. Is there a similar idiom which works for Paths? I've tried:

packages=pkg1.findall(UML+'Namespace.ownedElement/'+UML+'Package')

but haven't found the right combination, so I do step-at-a-time descent.
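
For reference, a minimal sketch of both idioms, assuming the standard-library xml.etree.ElementTree module rather than the standalone elementtree package discussed here, so behaviour may differ from the version in use above. The XML fragment is hypothetical and only mimics the element names from the question. findall() accepts a path whose steps are universal names, and it also accepts a prefix-to-URI dictionary as a second argument:

```python
import xml.etree.ElementTree as ET

UML_URI = "href://org.omg/UML/1.3"  # URI as given in the question
UML = "{%s}" % UML_URI

# Hypothetical fragment, made up for this sketch.
doc = """
<UML:Package xmlns:UML="href://org.omg/UML/1.3" name="outer">
  <UML:Namespace.ownedElement>
    <UML:Package name="inner"/>
  </UML:Namespace.ownedElement>
</UML:Package>
"""
pkg1 = ET.fromstring(doc)

# Idiom 1: concatenate universal names, one per path step.
by_universal_name = pkg1.findall(
    UML + "Namespace.ownedElement/" + UML + "Package"
)

# Idiom 2: pass a prefix map; the prefix used in the path ("uml") does
# not have to match the prefix used in the document ("UML").
ns = {"uml": UML_URI}
by_prefix = pkg1.findall("uml:Namespace.ownedElement/uml:Package", ns)

print([p.get("name") for p in by_universal_name])
print([p.get("name") for p in by_prefix])
```

The second form keeps long paths readable; the first needs no extra argument.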
 
Matthew Thorley

Thanks for the reply; I understand it better now. Please forgive my
ignorance. So the xsi is just an arbitrary namespace prefix, I get that
now. And it makes sense to me why it gets converted to an xmlns.

What I really need to know is why it is not inherited by the child
elements. From what I am told, I need the second namespace so that I
can point to the schema and validate the document.

Is that the wrong way to link to the schema? Can I force both namespaces
to be inherited by the child elements?

Thanks for all the help
-Matthew
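
A minimal sketch that may help with the inheritance question, assuming the standard-library xml.etree.ElementTree module and a one-line version of the sample document. When the whole tree is serialized from the root, the namespace declarations are written once on the root element, and XML scoping rules make them apply to every descendant. Re-parsing the output gives the same universal names, so the child elements have not lost anything:

```python
import xml.etree.ElementTree as ET

MGAC_NS = "http://www.chpc.utah.edu/~baites/mgacML"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

doc = (
    '<mgac xmlns="%s" xmlns:xsi="%s" '
    'xsi:schemaLocation="%s %s/mgac.xsd">'
    '<cluster name="Si4H"></cluster></mgac>'
    % (MGAC_NS, XSI_NS, MGAC_NS, MGAC_NS)
)


def universal_names(elem):
    """Return (tag, sorted attributes) for elem and all its descendants."""
    return [(e.tag, sorted(e.attrib.items())) for e in elem.iter()]


root = ET.fromstring(doc)

# Serialize the complete document from the root, then parse it again.
serialized = ET.tostring(root, encoding="unicode")
reparsed = ET.fromstring(serialized)

# Prefixes may differ from the input, but names and attributes match.
same = universal_names(root) == universal_names(reparsed)
print(same)

# Declarations appear only on the root start tag.
print(serialized.count("xmlns:"))
```

The schemaLocation attribute stays on the root element, which is where schema processors normally look for it, so serializing the document from the root should be enough for validation.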
 
