Replacing XML elements with other elements using lxml

Ultrus

Hello,
I'm attempting to generate a random story using xml as the document,
and lxml as the parser. I want the document to be simplified before
processing it further, and am very close to accomplishing my goal.
Below is what I have so far. Any ideas on how to move forward?

The goal:
Read and edit an XML file, replacing each <random> element with content
picked at random from inside it.

Completed:
[x] read xml
[x] access first random tag
[x] pick random content within random item
[o] need to replace <random> tag with picked contents

xml sample:
<contents>Here is some content.</contents>
<random>
<item><contents>Here is some random content.</contents></item>
<item><contents>Here is some more random content.</contents></item>
</random>
<contents>Here is some content.</contents>

Python code:
from lxml import etree
from io import StringIO
import random

theXml = (
    "<contents>Here is some content.</contents>"
    "<random>"
    "<item><contents>Here is some random content.</contents></item>"
    "<item><contents>Here is some more random content.</contents></item>"
    "</random>"
    "<contents>Here is some content.</contents>"
)

f = StringIO(theXml)
tree = etree.parse(f)
r = tree.xpath('//random')

if len(r) > 0:
    # pick the index of one <item> inside the first <random> element
    randInt = random.randint(0, len(r[0]) - 1)
    randContents = r[0][randInt][0]
    # replace parent random tag with picked content here

Now that I have the <contents> tag randomly chosen, how do I delete the
parent <random> tag and replace it, so the result looks like this:

final xml sample (goal):
<contents>Here is some content.</contents>
<contents>Here is some random content.</contents>
<contents>Here is some content.</contents>

Any idea on how to do this? So close! Thanks for the help in
advance. :)
 
Stefan Behnel

Ultrus said:
xml sample:
<contents>Here is some content.</contents>
<random>
<item><contents>Here is some random content.</contents></item>
<item><contents>Here is some more random content.</contents></item>
</random>
<contents>Here is some content.</contents>

Hmm, this is not well-formed XML, so I assume you stripped the example. The
root element is missing.

Python code:
from lxml import etree
from io import StringIO
import random

theXml = (
    "<contents>Here is some content.</contents>"
    "<random>"
    "<item><contents>Here is some random content.</contents></item>"
    "<item><contents>Here is some more random content.</contents></item>"
    "</random>"
    "<contents>Here is some content.</contents>"
)

f = StringIO(theXml)
tree = etree.parse(f)

^^^^^
This would raise an exception if the above really *was* your input.
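
To illustrate the point about the missing root element, here is a minimal sketch. It assumes lxml's default parser (no recover=True); the wrapper name `theroot` and the helper `parses_cleanly` are made up for the example.

```python
from lxml import etree

# Three top-level elements and no single root, as in the stripped example.
fragment = (
    "<contents>Here is some content.</contents>"
    "<random><item><contents>Here is some random content.</contents></item></random>"
    "<contents>Here is some content.</contents>"
)


def parses_cleanly(xml_text):
    """Return True if lxml's default parser accepts xml_text (hypothetical helper)."""
    try:
        etree.fromstring(xml_text)
        return True
    except etree.XMLSyntaxError:
        return False


fragment_ok = parses_cleanly(fragment)
# Wrapping the same fragment in one root element makes it well-formed.
wrapped_ok = parses_cleanly("<theroot>" + fragment + "</theroot>")
print(fragment_ok, wrapped_ok)
```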

r = tree.xpath('//random')

if len(r) > 0:
    randInt = random.randint(0, len(r[0]) - 1)
    randContents = r[0][randInt][0]
    # replace parent random tag with picked content here

Now that I have the <contents> tag randomly chosen, how do I delete the
parent <random> tag and replace it, so the result looks like this:

final xml sample (goal):
<contents>Here is some content.</contents>
<contents>Here is some random content.</contents>
<contents>Here is some content.</contents>

what about:

r.getparent().replace(r, random.choice(r))

?
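
One possible reading of that one-liner, as a sketch: it assumes `r` is the <random> element itself (xpath() returns a list, so the first match is taken here) and that the wanted replacement is the <contents> child of the chosen <item> rather than the <item> itself. The `theroot` wrapper is added only so the input parses.

```python
import random
from lxml import etree

root = etree.fromstring(
    "<theroot>"
    "<contents>Here is some content.</contents>"
    "<random>"
    "<item><contents>Here is some random content.</contents></item>"
    "<item><contents>Here is some more random content.</contents></item>"
    "</random>"
    "<contents>Here is some content.</contents>"
    "</theroot>"
)

matches = root.xpath("//random")  # a list of matching elements
if matches:
    r = matches[0]
    item = random.choice(r)  # an element behaves like a sequence of its children
    # replace() moves the picked <contents> into the slot held by <random>
    r.getparent().replace(r, item[0])

tags = [child.tag for child in root]
print(etree.tostring(root, pretty_print=True).decode())
```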

Stefan
 
Ultrus

Stefan,
I'm honored by your response.

You are correct about the bad XML. I shortened it for this example
because the real document contains other tags unrelated to this issue.
Based on your feedback, I was able to write the following fully
functional code using some different techniques:

from lxml import etree
from io import StringIO
import random

sourceXml = "\
<theroot>\
<contents>Stefan's fortune cookie:</contents>\
<random>\
<item>\
<random>\
<item>\
<contents>You will always know love.</contents>\
</item>\
<item>\
<contents>You will spend it all in one place.</contents>\
</item>\
</random>\
</item>\
<item>\
<contents>Your life comes with a lifetime warranty.</contents>\
</item>\
</random>\
<contents>The end.</contents>\
</theroot>"

parser = etree.XMLParser(ns_clean=True, recover=True,
                         remove_blank_text=True, remove_comments=True)
tree = etree.parse(StringIO(sourceXml), parser)
xml = tree.getroot()

def reduceRandoms(xml):
    for elem in xml:
        if elem.tag == "random":
            # swap <random> for the first child of a randomly chosen <item>
            elem.getparent().replace(elem, random.choice(elem)[0])
            # the tree changed, so rescan to catch a nested <random>
            reduceRandoms(xml)

reduceRandoms(xml)
for elem in xml:
    print(elem.tag, ":", elem.text)

One challenge that I face now is that I can only replace a parent
element with a single element. This isn't a problem if an <item>
element only has 1 <contents> element, or just 1 <random> element
(this works above). However, if <item> elements have more than one
child element such as a <contents> element, followed by a <random>
element (like children of <theroot>), only the first element is used.

Any thoughts on how to replace+append after the replaced element, or
clear+append multiple elements to the cleared position?

Thanks again :)
 
Ultrus

Ah! I figured it out. I forgot that the tree is treated like a list.
The solution was to replace the <random> element with the first <item>
child, then use Python's insert(i,x) function to insert elements after
the first one.
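
The replace-then-insert idea described above might look roughly like this. It is only a sketch: the sample data and the name `splice_random` are invented for illustration, and it assumes every <item> has at least one child.

```python
import random
from lxml import etree

root = etree.fromstring(
    "<theroot>"
    "<contents>Start.</contents>"
    "<random>"
    "<item><contents>A1</contents><contents>A2</contents><contents>A3</contents></item>"
    "<item><contents>B1</contents><contents>B2</contents></item>"
    "</random>"
    "<contents>The end.</contents>"
    "</theroot>"
)


def splice_random(rand_elem):
    """Replace rand_elem with all children of one randomly chosen <item>."""
    parent = rand_elem.getparent()
    # Snapshot the children first: moving them empties the <item>.
    children = list(random.choice(rand_elem))
    # The first child takes over the slot held by <random> ...
    parent.replace(rand_elem, children[0])
    pos = parent.index(children[0])
    # ... and the remaining children are inserted right after it, in order.
    for offset, child in enumerate(children[1:], start=1):
        parent.insert(pos + offset, child)


splice_random(root.find("random"))
texts = [child.text for child in root]
print(texts)
```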

lxml rocks!
 
Stefan Behnel

Ultrus said:
Ah! I figured it out. I forgot that the tree is treated like a list.
The solution was to replace the <random> element with the first <item>
child, then use Python's insert(i,x) function to insert elements after
the first one.

You could also use slicing, something like:

parent[2:3] = child[1:5]

should work.
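
A sketch of the slicing variant applied to the <random> case. The indices are computed instead of hard-coded, and the sample data is made up for the example. Slice assignment with a step of 1 can swap one element for several.

```python
import random
from lxml import etree

root = etree.fromstring(
    "<theroot>"
    "<contents>Start.</contents>"
    "<random>"
    "<item><contents>A1</contents><contents>A2</contents><contents>A3</contents></item>"
    "<item><contents>B1</contents><contents>B2</contents></item>"
    "</random>"
    "<contents>The end.</contents>"
    "</theroot>"
)

rand_elem = root.find("random")
parent = rand_elem.getparent()
pos = parent.index(rand_elem)
chosen = random.choice(rand_elem)

# Replace the one-element slice holding <random> with all children of the
# chosen <item>; list() takes a snapshot before the children are moved.
parent[pos:pos + 1] = list(chosen)

texts = [child.text for child in root]
print(texts)
```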

lxml rocks!

I know, but it feels good to read it once in a while. :)

Stefan
 
