Simple XML-to-Python conversion

gaudetteje

I've been searching high and low for a way to simply convert a small
XML configuration file to Python data structures. I came across gnosis
XML tools, but need a built-in option for doing something similar.

My knowledge of DOM and anything beyond simple XML structures is
rudimentary at best. Is there a simple way to use SAX handlers to pull
attributes and elements into Python lists/dictionaries?
 
gaudetteje

Amara does indeed make it effortless to transform an XML document into
a Python structure. Unfortunately this suggestion requires the 3rd
party software, Amara, _and_ a 4Suite installation according to the
website.

The reason I can't expect users to have 3rd party tools is because this
tool will be used in a secure lab environment without Internet access.
Asking the admins to place these software packages on the network for
users to install is like asking GW to say something semi-intelligent.
 
Tim Jarman

> Amara does indeed make it effortless to transform an XML document into
> a Python structure. Unfortunately this suggestion requires the 3rd
> party software, Amara, _and_ a 4Suite installation according to the
> website.
>
> The reason I can't expect users to have 3rd party tools is because this
> tool will be used in a secure lab environment without Internet access.
> Asking the admins to place these software packages on the network for
> users to install is like asking GW to say something semi-intelligent.

Why not use distutils to install any additional packages you need? That's
what it's there for. Presumably you're going to need to package this thing
for distribution anyway.
 
gaudetteje

A good question... Here's a followup question:

Does third-party software compile as well as the built-in modules when
using distutils and the py2exe extension module?
 
Thomas Guettler

On Wed, 16 Mar 2005 14:19:49 -0800, (e-mail address removed) wrote:
> I've been searching high and low for a way to simply convert a small
> XML configuration file to Python data structures. I came across gnosis
> XML tools, but need a built-in option for doing something similar.
>
> My knowledge of DOM and anything beyond simple XML structures is
> rudimentary at best. Is there a simple way to use SAX handlers to pull
> attributes and elements into Python lists/dictionaries?

Hi,

Just write a script which uses xml.sax. It is not difficult.
You get an event for every start-tag and for every end-tag.
The class which handles the events can build the structure
of your XML file.
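
As a rough illustration of the event-driven approach described above, the handler below collects start-tag, character and end-tag events into nested dictionaries. It is only a sketch: the DictBuilder class, the dictionary layout and the sample document are hypothetical, and it is written for a current Python 3 rather than the Python 2.3 of this thread.

```python
import xml.sax


class DictBuilder(xml.sax.ContentHandler):
    """Hypothetical handler: turns SAX events into nested dicts."""

    def __init__(self):
        super().__init__()
        self.root = None
        self._stack = []  # currently open elements, innermost last

    def startElement(self, name, attrs):
        node = {"tag": name, "attrs": dict(attrs.items()),
                "text": "", "children": []}
        if self._stack:
            self._stack[-1]["children"].append(node)
        else:
            self.root = node
        self._stack.append(node)

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._stack:
            self._stack[-1]["text"] += content

    def endElement(self, name):
        node = self._stack.pop()
        node["text"] = node["text"].strip()


# Made-up configuration document, for illustration only.
SAMPLE = b"""<config path="/data">
  <input method="file"><filename>in.dat</filename></input>
  <output><filename>out.dat</filename></output>
</config>"""

handler = DictBuilder()
xml.sax.parseString(SAMPLE, handler)
config = handler.root
```

After parsing, config["attrs"] holds the root attributes and config["children"] the child elements in document order.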

The online version of the Python Cookbook has some
Python and SAX examples.

Thomas
 
Fredrik Lundh

> I've been searching high and low for a way to simply convert a small
> XML configuration file to Python data structures. I came across gnosis
> XML tools, but need a built-in option for doing something similar.
>
> My knowledge of DOM and anything beyond simple XML structures is
> rudimentary at best. Is there a simple way to use SAX handlers to pull
> attributes and elements into Python lists/dictionaries?

tools like ElementTree and xmltramp are very easy to use. ElementTree builds
a tree of Element objects which behave like lists (of subelements) and contain
dictionaries (of attributes). xmltramp builds a tree where both subelements and
attributes are mapped to object attributes.
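
To make the list-and-dictionary behaviour concrete, here is a small sketch. It assumes a Python where ElementTree ships in the standard library as xml.etree.ElementTree (Python 2.5 and later; at the time of this thread it was a separate download), and the sample document is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up configuration document, for illustration only.
SAMPLE = """<config path="/data">
  <input method="file"><filename>in.dat</filename></input>
  <output><filename>out.dat</filename></output>
</config>"""

root = ET.fromstring(SAMPLE)

# Attributes are exposed as a dictionary.
base_path = root.attrib["path"]

# Subelements are reached by indexing and iteration, like a list.
first_child = root[0]
child_tags = [child.tag for child in root]

# .get() is a shortcut for a dictionary lookup on the attributes.
input_method = first_child.get("method")
```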

http://www.aaronsw.com/2002/xmltramp/
http://www.effbot.org/zone/element-index.htm

if you want to ship any of them with your application, you only need a single
module (xmltramp.py and ElementTree.py, respectively).

API summaries:

http://reagle.org/joseph/blog/technology/python/elementtree-model
http://reagle.org/joseph/blog/technology/python/xmltramp-model

(xmltramp might be a bit easier to use for simple cases, elementtree is a bit more
flexible and a bit more efficient, especially if you're using the C implementation)

</F>
 
gaudetteje

Since I've exhausted every option except for Amara, I've decided to
give it a try. However, this will only work if I can compile Amara and
4suite along with my application. I doubt 4suite will be able to be
compiled, but I'll try it anyway.

If I weren't set on using XML (I know, not every application requires
it and it's abused), I would probably go with a simpler format like an
INI file or YAML. The requirement that leads me to use XML for a
simple config file is that the rest of this project already utilizes
XML and another developer must integrate his NetBeans GUI with my app
and config file. Not to mention that this must be across Linux and
Windows platforms. To make this easier on everyone else, I'm using a
simple XML schema that I know everyone can write to. As far as I know,
YAML isn't easily written to from a Java class and an INI file isn't
very popular on the Linux side.
 
Fredrik Lundh

> Since I've exhausted every option except for Amara, I've decided to
> give it a try.

why didn't xmltramp or elementtree work for your application? they're used
all over the place, in all sorts of applications, so it would be interesting to know
what's so special about your app...

</F>
 
gaudetteje

I've tried xmltramp and ElementTree, but these tools aren't what I had
in mind. I've come to the realization that it's not the tools that are
lacking. In fact, I'm a big fan of ElementTree now, but would only use
it for large parsing tasks. Instead, I think the problem is either
inherent in the XML standard, or I'm missing something conceptually.

Let me elaborate. Most of these tools I've experimented with parse a
document rather easily into a structure that I can traverse one element
at a time. Each level of a node/tree has 4 basic pieces:
1) the tag,
2) one or more attributes,
3) encapsulated data, and
4) children nodes/trees
From what I understand, this is how XML was standardized, in a sort of
hierarchical structure of infinite possibilities. The problem I'm
having with these structures is that I need to actively search through
each level for the item I want. All I really want to do is access one
or more elements at the same time and know where they are without
searching.

What I'm looking to use are basic structures such as:
root.path
root.input.method
root.input.filename
root.output.filename
to name a few. Since these are essentially unique constants that will
only be used once or twice, I want to be able to place them in a
function argument, or concatenate several of them, such as root.path
and root.input.filename to create a new string. I'm finding it rather
impossible to do such things unless I basically create my own structure
from the tree structure that I get when I parse the XML document.

If I used a simple INI file or CSV file, I would simply have to parse
my file once and match the name with the value. Why is it necessary to
parse a document once and then re-parse your information into a format
that you can use? This seems absurd to me. Any thoughts on this? Do
I even have the correct understanding of how this is done?
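
One possible way to get the dotted access described above is a single recursive pass over the parsed tree. The sketch below is an assumption-laden illustration, not something proposed in the thread: to_namespace is a hypothetical helper, it relies on names being unique at each level, and it uses xml.etree.ElementTree and types.SimpleNamespace from a current Python 3 standard library.

```python
import xml.etree.ElementTree as ET
from types import SimpleNamespace


def to_namespace(elem):
    """Hypothetical helper: map an element to an object with dotted access.

    Attributes and child elements both become object attributes. A leaf
    element without attributes collapses to its stripped text. Names are
    assumed unique per level; on a clash the child element wins.
    """
    if len(elem) == 0 and not elem.attrib:
        return (elem.text or "").strip()
    ns = SimpleNamespace(**elem.attrib)
    for child in elem:
        setattr(ns, child.tag, to_namespace(child))
    return ns


# Made-up configuration document, for illustration only.
SAMPLE = """<config path="/data">
  <input method="file"><filename>in.dat</filename></input>
  <output><filename>out.dat</filename></output>
</config>"""

root = to_namespace(ET.fromstring(SAMPLE))

# The values can now be passed around or concatenated directly.
full_input_path = root.path + "/" + root.input.filename
```

The conversion runs once after parsing, so the rest of the program never has to search the tree. Repeated sibling tags would need a list-valued variant, which this sketch does not attempt.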
 
Swaroop C H

> hierarchical structure of infinite possibilities. The problem I'm
> having with these structures is that I need to actively search through
> each level for the item I want. All I really want to do is access one
> or more elements at the same time and know where they are without
> searching.
>
> What I'm looking to use are basic structures such as:
> root.path
> root.input.method
> root.input.filename
> root.output.filename

You should be using XPath (4suite has it) to get the parts that you want.
A really quick intro is at
http://simon.incutio.com/archive/2003/10/21/xpathRocks
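
For a flavour of path-based lookup without installing 4Suite, the sketch below uses the limited XPath-like subset supported by ElementTree's find, findtext and findall, not 4Suite's full XPath engine. It assumes a current Python 3 standard library, and the sample document is made up.

```python
import xml.etree.ElementTree as ET

# Made-up configuration document, for illustration only.
SAMPLE = """<config path="/data">
  <input method="file"><filename>in.dat</filename></input>
  <output><filename>out.dat</filename></output>
</config>"""

root = ET.fromstring(SAMPLE)

# Each value is addressed by a path relative to the root element.
base_path = root.get("path")
input_method = root.find("input").get("method")
input_file = root.findtext("input/filename")
output_file = root.findtext("output/filename")

# ".//" searches all descendants, at any depth.
all_filenames = [e.text for e in root.findall(".//filename")]

full_input_path = base_path + "/" + input_file
```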

Regards,
 
Diez B. Roggisch

> If I used a simple INI file or CSV file, I would simply have to parse
> my file once and match the name with the value. Why is it necessary to
> parse a document once and then re-parse your information into a format
> that you can use. This seems absurd to me. Any thoughts on this? Do
> I even have the correct understanding of how this is done?

The problem here is that XML allows for modelling data structures with
parent-child relationships and preserves the order of elements. So it's much
more capable than INI and CSV, but that comes at the cost of more
elaborate and somewhat complicated APIs. A configuration file is only
_one_ possible use case.

So if you already found that INI files suit your needs, why don't you use
them? There is a ConfigParser module in Python to parse these, and it gives
you easy access to the defined sections and fields. And don't worry about
whether INI files are popular on Linux or not: on Linux a great many different
formats are popular, and nobody seems to care too much. E.g. Samba uses
INI-style config, too.
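
For comparison, this is roughly what the INI route looks like. The sketch uses the Python 3 spelling of the module, configparser (it is named ConfigParser in the Python 2 of this thread), and the section and option names are made up to mirror the structure discussed earlier.

```python
import configparser

# Made-up INI content mirroring the XML structure discussed in the thread.
INI_TEXT = """
[general]
path = /data

[input]
method = file
filename = in.dat

[output]
filename = out.dat
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# Sections and options are looked up by name, with no tree to walk.
sections = parser.sections()
input_method = parser.get("input", "method")
input_file = parser["general"]["path"] + "/" + parser["input"]["filename"]
```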
 
Ivan Voras

> I've been searching high and low for a way to simply convert a small
> XML configuration file to Python data structures. I came across gnosis
> XML tools, but need a built-in option for doing something similar.

I too needed such a thing, and made this simple parser:
http://ivoras.sharanet.org/projects/xmldict.py.gz

It's really quick and dirty. It doesn't even use standard parsers such
as dom or sax, but improvises its own. It's also very likely to fail in
mysterious ways when it encounters invalid XML, but for quick and dirty
jobs, it's very nice and easy. See the bottom of the file for some quick
examples.
 
gaudetteje

One reason I chose not to use ConfigParser module is that I also have a
similar config file for a MATLAB compiled program to run along with my
Python script. XML would eliminate the need to use two different style
configuration files.

Another reason is that the programmer who is writing the GUI to
interface with my Python/Matlab programs must be able to easily
read/modify/write these configuration files. Since he is already using
XML for other purposes in the Netbeans GUI, this was a logical choice.
 
gaudetteje

Thanks Lutz!

I should have looked into Amara's binderytools module earlier. This is
just the type of tool I was looking for. When I tried testing its
compatibility with py2exe, I was _almost_ able to compile... Does
anyone know where the following libraries exist? I thought Amara would
have these included, but it looks like I need to install another
module.

<SNIP>previous compiling stuff</SNIP>
*** copy dlls ***
copying C:\WINDOWS\system32\python23.dll -> E:\src\python\xml\dist
copying c:\python23\w9xpopen.exe -> E:\src\python\xml\dist
setting sys.winver for 'E:\src\python\xml\dist\python23.dll' to
'py2exe'
copying c:\python23\lib\site-packages\py2exe\run.exe ->
E:\src\python\xml\dist\amarabind.exe
The following modules appear to be missing
['FtMiniDom.GetAllNs', 'FtMiniDom.SeekNss', 'FtMiniDom.implementation',
'FtMiniDom.nonvalParse', 'FtMiniDom.valParse', 'XmlStripc',
'xml.parsers.xmlproc']
E:\src\python\xml>

These Ft* and Xml* modules should have been installed with Amara,
correct? After running the setup.py script, it generates my .EXE file
and outputs the correct text, but with the addition of a prepended
Traceback dump:

E:\src\python\xml\dist>amarabind.exe
Traceback (most recent call last):
File "Ft\Xml\Catalog.pyc", line 320, in ?
File "Ft\Xml\Catalog.pyc", line 68, in __init__
File "Ft\Xml\Catalog.pyc", line 85, in parse
File "Ft\Xml\Catalog.pyc", line 131, in parseXmlCat
File "xml\sax\expatreader.pyc", line 107, in parse
File "xml\sax\xmlreader.pyc", line 123, in parse
File "xml\sax\expatreader.pyc", line 207, in feed
File "xml\sax\expatreader.pyc", line 379, in external_entity_ref
File "xml\sax\saxutils.pyc", line 292, in prepare_input_source
File "urllib.pyc", line 76, in urlopen
File "urllib.pyc", line 154, in open
File "urllib.pyc", line 937, in toBytes
LookupError: unknown encoding: ASCII
Default catalog C:\Python23\Share\4Suite\default.cat not found
Anise-Almond Biscotti
margarine :: 4 tablespoons
sugar :: 3/4 cup
eggs :: 4
all-purpose flour :: 21/2 cups
crushed anise seeds :: 2 teaspoons
baking powder :: 1 1/2 teaspoons
salt :: 1/4 teaspoon
whole blanched almonds :: 1/3 cup

Any additional help is greatly appreciated!

Jay
 
Uche Ogbuji

> Thanks Lutz!
>
> I should have looked into Amara's binderytools module earlier. This is
> just the type of tool I was looking for. When I tried testing its
> compatibility with py2exe, I was _almost_ able to compile... Does
> anyone know where the following libraries exist? I thought Amara would
> have these included, but it looks like I need to install another
> module.

We're currently on the 4Suite mailing list chasing down all the magic
required for py2exe. I'm largely a Windows illiterate, but this looks
like what I remember:

http://lists.fourthought.com/pipermail/4suite/2005-March/013450.html

I do want to be sure Amara can be packaged with py2exe, so please let me
know if this helps. You might want to consider continuing the
discussion on the 4Suite list (which I use for Amara discussion as
well). I follow that list far more diligently than c.l.py.


--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com
Use CSS to display XML, part 2 - http://www-128.ibm.com/developerworks/edu/x-dw-x-xmlcss2-i.html
Introducing the Amara XML Toolkit - http://www.xml.com/pub/a/2005/01/19/amara.html
Gems from the Mines: 2002 to 2003 - http://www.xml.com/pub/a/2005/03/02/pyxml.html
Be humble, not imperial (in design) - http://www.adtmag.com/article.asp?id=10286
Querying WordNet as XML - http://www.ibm.com/developerworks/xml/library/x-think29.html
Packaging XSLT lookup tables as EXSLT functions - http://www.ibm.com/developerworks/xml/library/x-tiplook2.html
 
Uche Ogbuji

> Since I've exhausted every option except for Amara, I've decided to
> give it a try. However, this will only work if I can compile Amara and
> 4suite along with my application. I doubt 4suite will be able to be
> compiled, but I'll try it anyway.

Actually, as I mentioned in my last message, we do have some success
reports re: 4Suite + py2exe. See the March archives of the 4Suite list.
I think it took some work from those of the 4Suite developers who are
Windows-savvy, but it did the job.


 
