XML parsing with Python

inder

Hi All,

I am new to XML and need to parse an XML file. After reading and
browsing the web, I could not get much help.

I guess SAX would be better suited to my requirement.

Could someone just provide me some sample Python code, so that I can
execute it and see how the parsing actually happens?

Let's say this is my XML file:
<?xml version="1.0"?>
<library>
<category code="SciFi" room="1">
<book id="ISBN001">
<title>I,Robot</title>
<pages>100</pages>
<author>Isaac Asimov</author>
</book>
<book id="ISBN001" damaged="true">
<title>Blade Runner</title>
<pages>400</pages>
<author>Philip K. Dick</author>
</book>
</category>
<category code="Boring" room="2">
<book id="ISBN003">
<title>Lord Of The Rings</title>
<pages>20000</pages>
<author>Tolkien</author>
</book>
<book id="ISBN004" damaged="true">
<title>XML-Schema Specification</title>
<pages>5000</pages>
<author>W3C</author>
</book>
</category>
<category code="Fantasy">
<book id="ISBN005" damaged="true">
<title>Aladin</title>
<pages>150</pages>
<author>Don't know</author>
</book>
</category>
</library>


--------------------------

I need the output to be the text of the 'title' elements:

I,Robot
Blade Runner
Lord Of The Rings
XML-Schema Specification
Aladin


Your responses are greatly appreciated.


Thanks in advance
 

Stefan Behnel

inder said:
I am new to XML and need to parse an XML file. After reading and
browsing the web, I could not get much help.

I guess SAX would be better suited to my requirement.

That's a common misconception.
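
For comparison, here is a rough sketch of what the SAX route might look like for this task. The class name TitleHandler, the helper sax_titles() and the shortened inline document are illustrative assumptions, not something from the question. Note how much state the handler has to track by hand just to collect the text of one element:

```python
import xml.sax

# Shortened stand-in for the library document (assumption: two books only).
SAX_SAMPLE = b"""<?xml version="1.0"?>
<library>
  <category code="SciFi" room="1">
    <book id="ISBN001"><title>I,Robot</title><pages>100</pages></book>
  </category>
  <category code="Fantasy">
    <book id="ISBN005" damaged="true"><title>Aladin</title><pages>150</pages></book>
  </category>
</library>
"""


class TitleHandler(xml.sax.ContentHandler):
    """Collects the text content of every <title> element (hypothetical name)."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._chunks = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._chunks = []

    def characters(self, content):
        # SAX may deliver the text of one element in several pieces,
        # so the pieces are buffered until the element ends.
        if self._in_title:
            self._chunks.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._chunks))
            self._in_title = False


def sax_titles(xml_bytes):
    """Parse the given XML bytes and return the list of title strings."""
    handler = TitleHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.titles


for title in sax_titles(SAX_SAMPLE):
    print(title)
```

SAX is event-driven and never builds a tree, which is why it often looks attractive for large files, but the bookkeeping above grows quickly once more than one element is of interest.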

Could someone just provide me some sample Python code, so that I can
execute it and see how the parsing actually happens?

Let's say this is my XML file:
<?xml version="1.0"?>
<library>
<category code="SciFi" room="1"> <!-- if you want to test an invalid
document against the schema, you can just cut the mandatory id attribute -->
<book id="ISBN001">
<title>I,Robot</title>
<pages>100</pages>
<author>Isaac Asimov</author>
</book>
<book id="ISBN001" damaged="true">
<title>Blade Runner</title>
<pages>400</pages>
<author>Philip K. Dick</author>
</book>
</category>
<category code="Boring" room="2">
<book id="ISBN003">
<title>Lord Of The Rings</title>
<pages>20000</pages>
<author>Tolkien</author>
</book>
<book id="ISBN004" damaged="true">
<title>XML-Schema Specification</title>
<pages>5000</pages>
<author>W3C</author>
</book>
</category>
<category code="Fantasy">
<book id="ISBN005" damaged="true">
<title>Aladin</title>
<pages>150</pages>
<author>Don't know</author>
</book>
</category>
</library>


--------------------------

I need the output to be the text of the 'title' elements:

I,Robot
Blade Runner
Lord Of The Rings
XML-Schema Specification
Aladin

Use the iterparse() function of the xml.etree.ElementTree module.

http://effbot.org/zone/element-iterparse.htm
http://codespeak.net/lxml/parsing.html#iterparse-and-iterwalk
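
A minimal sketch of the iterparse() approach could look like the following. It is only an illustration: the helper name iter_titles() and the shortened inline copy of the document are assumptions made to keep the example self-contained. In practice a file name or an open file can be passed to iterparse() instead of the BytesIO object.

```python
import io
import xml.etree.ElementTree as ET

# Shortened copy of the library document from the question (assumption:
# only two of the five books are kept to keep the example small).
SAMPLE = b"""<?xml version="1.0"?>
<library>
  <category code="SciFi" room="1">
    <book id="ISBN001">
      <title>I,Robot</title>
      <pages>100</pages>
      <author>Isaac Asimov</author>
    </book>
  </category>
  <category code="Fantasy">
    <book id="ISBN005" damaged="true">
      <title>Aladin</title>
      <pages>150</pages>
      <author>Don't know</author>
    </book>
  </category>
</library>
"""


def iter_titles(source):
    """Yield the text of every <title> element in document order.

    `source` may be a file name or a binary file-like object.
    """
    # "end" events fire once an element has been read completely,
    # so elem.text is guaranteed to be available at that point.
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "title":
            yield elem.text
        elif elem.tag == "book":
            # The whole book has been handled; drop its children
            # so memory use stays flat on large documents.
            elem.clear()


for title in iter_titles(io.BytesIO(SAMPLE)):
    print(title)
```

Run against the full document from the question, the same loop should print the five titles in order. The elem.clear() call is optional for a file this small and only matters when the input is large.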

Stefan
 
