John Carlyle-Clarke
Hi.
I'm new to Python and trying to use it to solve a specific problem. I
have an XML file in which I need to locate a specific text node and
replace the contents with some other text. The text in question is
actually about 70k of base64 encoded data.
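For reference, the task described above can be sketched roughly as follows with xml.dom.minidom. This is only an illustration under assumptions: the element name "payload", the sample document, and the helper replace_text are all made up here and are not from my real file, and the code is written for a current Python rather than 2.5.

```python
import base64
import xml.dom.minidom

# Hypothetical stand-in for the real input file.
SAMPLE = '<root><payload>old</payload></root>'

def replace_text(doc, tag_name, new_text):
    """Replace the text content of the first <tag_name> element."""
    element = doc.getElementsByTagName(tag_name)[0]
    # Drop whatever children are there, then add one fresh text node.
    while element.firstChild is not None:
        element.removeChild(element.firstChild)
    element.appendChild(doc.createTextNode(new_text))

doc = xml.dom.minidom.parseString(SAMPLE)
# Base64-encode some example bytes, as the real data would be.
encoded = base64.b64encode(b"example bytes").decode("ascii")
replace_text(doc, "payload", encoded)
result = doc.documentElement.toxml()
```

The replacement itself is not where things go wrong; the failure shows up when writing the document back out.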
I wrote some code using xml.dom.minidom that works on my Linux box, but
it will not run on the Windows box that I really need it on. Both run
Python 2.5.1.
The Windows machine has a clean install of the Python .msi from
python.org. The Linux box is Ubuntu 7.10, which has some Python XML
packages installed that can't easily be removed (namely python-libxml2
and python-xml).
I have boiled the code down to its simplest form which shows the problem:-
import xml.dom.minidom
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]

doc = xml.dom.minidom.parse(input_file)
file = open(output_file, "w")
doc.writexml(file)
The error is:-
$ python test2.py input2.xml output.xml
Traceback (most recent call last):
  File "test2.py", line 9, in <module>
    doc.writexml(file)
  File "c:\Python25\lib\xml\dom\minidom.py", line 1744, in writexml
    node.writexml(writer, indent, addindent, newl)
  File "c:\Python25\lib\xml\dom\minidom.py", line 814, in writexml
    node.writexml(writer,indent+addindent,addindent,newl)
  File "c:\Python25\lib\xml\dom\minidom.py", line 809, in writexml
    _write_data(writer, attrs[a_name].value)
  File "c:\Python25\lib\xml\dom\minidom.py", line 299, in _write_data
    data = data.replace("&", "&amp;").replace("<", "&lt;")
AttributeError: 'NoneType' object has no attribute 'replace'
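The traceback says that some attribute's value is None by the time writexml reaches it. Something like the following might help narrow down which attribute that is before writing. It is a minimal sketch: find_none_attributes and the sample document are hypothetical names invented for illustration, and on a healthy parse the list is expected to come back empty.

```python
import xml.dom.minidom

def find_none_attributes(doc):
    """Return (element tag, attribute name) pairs whose value is None."""
    suspects = []
    # "*" matches every element in the document, including the root.
    for element in doc.getElementsByTagName("*"):
        attrs = element.attributes
        for i in range(attrs.length):
            attr = attrs.item(i)
            if attr.value is None:
                suspects.append((element.tagName, attr.name))
    return suspects

# Hypothetical stand-in for the real input file.
SAMPLE = '<root xmlns:ex="http://example.com/ns"><ex:item kind="a"/></root>'
doc = xml.dom.minidom.parseString(SAMPLE)
suspects = find_none_attributes(doc)
```

Running the same function on the document returned by xml.dom.minidom.parse(input_file) on the failing machine should list the attributes that make writexml fall over.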
As I said, this code runs fine on the Ubuntu box. If I could work out
why the code runs on this box, that would help, because then I can set
up the Windows box the same way.
The input file contains an <xsd:schema> block which is what actually
causes the problem. If you remove that node and subnodes, it works
fine. For a while at least, you can view the input file at
http://rafb.net/p/5R1JlW12.html
Someone suggested that I try xml.etree.ElementTree. However,
writing the same kind of simple code to read and then write the file
mangles the xsd:schema stuff because ElementTree does not understand
namespaces.
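To illustrate the kind of mangling meant here, the sketch below round-trips a tiny namespaced document through ElementTree. The namespace URI, the "ex" prefix and the sample document are made up for the example. It also assumes a current Python: ET.register_namespace and encoding="unicode" are not available in the ElementTree that ships with 2.5, so the second half would not help there as written.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and document, standing in for the xsd:schema block.
NS = "http://example.com/ns"
SAMPLE = '<ex:root xmlns:ex="%s"><ex:item/></ex:root>' % NS

root = ET.fromstring(SAMPLE)

# Without any registration, the original prefix is not kept on output.
before = ET.tostring(root, encoding="unicode")

# Registering the prefix asks the serializer to use it for this URI.
ET.register_namespace("ex", NS)
after = ET.tostring(root, encoding="unicode")
```

Even with a registered prefix, attribute values that merely mention a prefix as text (as schema documents often do) are not tracked by the serializer, so this may not be enough for a file containing an xsd:schema block.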
By the way, is PyXML a live project or not? Should it still be used?
It's odd that if you go to http://www.python.org/ and click the link
"Using python for..." XML, it leads you to
http://pyxml.sourceforge.net/topics/
If you then follow the download links to
http://sourceforge.net/project/showfiles.php?group_id=6473 you see that
the latest file is from 2004, and there are no versions for newer Python releases.
It also says "PyXML is no longer maintained". Shouldn't the link be
removed from python.org?
Thanks in advance!