Greg Aumann
I am trying to write some Python code for a library that reads an
XML-like language from a file into elementtree data structures. Then I
want to be able to read and/or modify the structure and write it out
either as XML or in the original format. I really want the API for the
XML-like language to be the same as the elementtree API, to reduce
confusion and make it easier to learn.
While reading the elementtree documentation I found the
ElementTree.TreeBuilder class, which the documentation says can be used
to create parsers for XML-like languages. So I wrote the code below. The
code works, but I am not sure that this is really the intended way to
use the ElementTree.TreeBuilder class.
Essentially I was trying to implement the following advice from Frederik
Lundh (Wed, Sep 8 2004 12:54 am):
> by the way, it's trivial to build trees from arbitrary SAX-style sources.
> just create an instance of the ElementTree.TreeBuilder class, and call
> the "start", "end", and "data" methods as appropriate.
>
> builder = ElementTree.TreeBuilder()
> builder.start("tag", {})
> builder.data("text")
> builder.end("tag")
> elem = builder.close()
but in another post he wrote (Wed, May 21 2003 2:56 am):
> usage:
>
> from elementtree import ElementTree, HTMLTreeBuilder
>
> # file is either a filename or an open stream
> tree = ElementTree.parse(file, parser=HTMLTreeBuilder.TreeBuilder())
> root = tree.getroot()
>
> or
>
> from elementtree import HTMLTreeBuilder
>
> parser = HTMLTreeBuilder.TreeBuilder()
> parser.feed(data)
> root = parser.close()
This second one makes me think I should have implemented a parser class
using TreeBuilder. Also, when I used "return builder.close()" in the code
below, it didn't return an ElementTree structure but an _ElementInterface.
So my question is really about how I should structure the code so that
using this XML-like format is as similar as possible to using XML itself
in elementtree.
from elementtree import ElementTree
from nltk_lite.corpora.shoebox import ShoeboxFile

class Settings(ShoeboxFile):
    def __init__(self):
        super(Settings, self).__init__()

    def parse(self, encoding=None):
        builder = ElementTree.TreeBuilder()
        for mkr, value in self.fields(encoding, unwrap=False):
            # A leading "+" or "-" on the marker opens or closes a block.
            block = mkr[0]
            if block in ("+", "-"):
                mkr = mkr[1:]
            else:
                block = None
            if block == "+":
                # Block start: open the element and leave it open.
                builder.start(mkr, {})
                builder.data(value)
            elif block == '-':
                # Block end: close the matching element.
                builder.end(mkr)
            else:
                # Ordinary field: a complete element with text content.
                builder.start(mkr, {})
                builder.data(value)
                builder.end(mkr)
        # builder.close() returns the root element, so wrap it in a tree.
        return ElementTree.ElementTree(builder.close())
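
For comparison, here is a rough sketch of the second usage pattern, where
the format gets its own parser object with feed() and close() methods so
that it can be passed to ElementTree.parse(). It is only an illustration
under several assumptions: it uses the standard-library
xml.etree.ElementTree module in place of the older elementtree package,
the class name MarkerParser is made up, and the input format is a
simplified stand-in (one "\marker value" field per line) rather than what
ShoeboxFile.fields() really handles.

```python
import codecs
import io
import xml.etree.ElementTree as ET


class MarkerParser:
    """Hypothetical feed/close parser that wraps a TreeBuilder."""

    def __init__(self, encoding="utf-8"):
        self._builder = ET.TreeBuilder()
        # Incremental decoder copes with characters split across chunks.
        self._decoder = codecs.getincrementaldecoder(encoding)()
        self._pending = ""  # incomplete last line kept between feed() calls

    def feed(self, data):
        # parse() passes bytes when it opens a file by name, and str when
        # it is given a text stream, so accept both.
        if isinstance(data, bytes):
            data = self._decoder.decode(data)
        lines = (self._pending + data).split("\n")
        self._pending = lines.pop()
        for line in lines:
            self._handle(line)

    def close(self):
        if self._pending:
            self._handle(self._pending)
            self._pending = ""
        # Like any TreeBuilder, this returns the root element, not a tree.
        return self._builder.close()

    def _handle(self, line):
        line = line.strip()
        if not line.startswith("\\"):
            return  # this sketch ignores blank and continuation lines
        mkr, _, value = line[1:].partition(" ")
        if mkr.startswith("+"):
            self._builder.start(mkr[1:], {})
            self._builder.data(value)
        elif mkr.startswith("-"):
            self._builder.end(mkr[1:])
        else:
            self._builder.start(mkr, {})
            self._builder.data(value)
            self._builder.end(mkr)


# Same calling convention as the HTMLTreeBuilder example quoted above.
source = io.StringIO("\\+settings\n\\name demo\n\\-settings\n")
tree = ET.parse(source, parser=MarkerParser())
root = tree.getroot()
print(root.tag, root.find("name").text)
```

With this shape, parse() does the wrapping into an ElementTree itself,
while calling close() directly still gives back the bare root element,
which seems to match the behaviour described above for builder.close().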