parsing OPML

Chris · Apr 6, 2004

I am trying to write a script that will parse my Bloglines OPML export
(see snippet below) and output an HTML blogroll.

I can get at all the actual blog entries with the code below. The
problem is that I would also like to get at the "folder" level ("Test and
Demo" and "Unfiled" below), which has only a title and no other
attributes, so that I can group the entries by category,
something like this: http://rpc.bloglines.com/blogroll?html=1&id=chrislott
My code only lists the entries themselves...

********************************blogs.opml**
<opml version="1.0">
<head>
<title>Bloglines Subscriptions</title>
<dateCreated>Sun, 4 Apr 2004 20:15:17 GMT</dateCreated>
<ownerEmail>[email protected]</ownerEmail>
</head>
<body>
<outline title="Subscriptions">
<outline title="Test and Demo">
<outline title="del.icio.us/imao/Learning"
htmlUrl="http://del.icio.us/imao/Learning" type="rss"
xmlUrl="http://del.icio.us/rss/imao/Learning"/>

<outline title="Fairbanks, Alaska Weather"
htmlUrl="http://www.rssweather.com/hw3.php?zipcode=99701" type="rss"
xmlUrl="http://rssweather.com/rss.php?hwvUT...ountry=us&amp;county=02090&amp;zone=AKZ222&amp;alt=rss20a"/>
</outline>

<outline title="Unfiled">

<outline title="Boxes and Arrows"
htmlUrl="http://www.boxesandarrows.com/" type="rss"
xmlUrl="http://www.boxesandarrows.com/index.xml"/>
<outline title="CBB Plagiarism Project -"
htmlUrl="http://leeds.bates.edu/cbb/" type="rss"
xmlUrl="http://leeds.bates.edu/cbb/module.php?mod=node&amp;op=feed"/>

</outline>

********************************************my script**

from xml.sax import make_parser
from xml.sax.handler import ContentHandler

class OPMLHandler(ContentHandler):

    def __init__(self):
        ContentHandler.__init__(self)
        self.level = 0  # nesting depth of the current <outline>

    def startElement(self, name, attrs):
        if name == 'outline':
            self.level += 1
            self.title = attrs.get('title', '')
            self.url = attrs.get('xmlUrl', '')

    def endElement(self, name):
        if name == 'outline':
            # title and url belong to the most recently opened outline
            print('%s : %s - %s' % (self.level, self.title, self.url))
            self.level -= 1

parser = make_parser()
curHandler = OPMLHandler()
parser.setContentHandler(curHandler)
parser.parse(open('blogs.opml'))

Richard Morse · Apr 7, 2004

I am trying to write a script that will parse my Bloglines OPML export
(see snippet below) and output an HTML blogroll. [snip]
from xml.sax import make_parser
from xml.sax.handler import ContentHandler

class OPMLHandler(ContentHandler):

    def __init__(self):
        ContentHandler.__init__(self)
        self.level = 0  # nesting depth of the current <outline>

    def startElement(self, name, attrs):
        if name == 'outline':
            self.level += 1
            self.title = attrs.get('title', '')
            self.url = attrs.get('xmlUrl', '')

    def endElement(self, name):
        if name == 'outline':
            # title and url belong to the most recently opened outline
            print('%s : %s - %s' % (self.level, self.title, self.url))
            self.level -= 1

parser = make_parser()
curHandler = OPMLHandler()
parser.setContentHandler(curHandler)
parser.parse(open('blogs.opml'))

It looks to me like this is Python.

This is a Perl newsgroup.

Perhaps you meant to post to a Python newsgroup?

Ricky
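
For what it's worth, the folder grouping asked about above can probably be done with the same SAX approach. The sketch below assumes that any outline element carrying an xmlUrl attribute is a feed and any outline without one is a folder; that rule is a guess based on the snippet, not something the OPML export guarantees. The names BlogrollHandler, render_html and SAMPLE are made up for illustration, the sample is a trimmed, well-formed version of the snippet, and the code is written for Python 3.

```python
import xml.sax
from html import escape
from xml.sax.handler import ContentHandler


class BlogrollHandler(ContentHandler):
    """Collect feeds grouped by their enclosing folder (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.folders = []    # titles of the folders currently open
        self.is_folder = []  # one flag per open <outline>, True for folders
        self.groups = {}     # folder title -> list of (feed title, htmlUrl)

    def startElement(self, name, attrs):
        if name != 'outline':
            return
        title = attrs.get('title', '')
        if attrs.get('xmlUrl'):
            # Assumption: an outline with an xmlUrl is a feed entry.
            folder = self.folders[-1] if self.folders else ''
            entry = (title, attrs.get('htmlUrl', ''))
            self.groups.setdefault(folder, []).append(entry)
            self.is_folder.append(False)
        else:
            # Assumption: an outline without an xmlUrl is a folder.
            self.folders.append(title)
            self.is_folder.append(True)

    def endElement(self, name):
        # Leave the folder only when the closing tag belongs to a folder.
        if name == 'outline' and self.is_folder.pop():
            self.folders.pop()


def render_html(groups):
    """Render the grouped feeds as one heading plus one list per folder."""
    lines = []
    for folder, feeds in groups.items():
        lines.append('<h3>%s</h3>' % escape(folder))
        lines.append('<ul>')
        for title, url in feeds:
            lines.append('<li><a href="%s">%s</a></li>'
                         % (escape(url, quote=True), escape(title)))
        lines.append('</ul>')
    return '\n'.join(lines)


# Trimmed, well-formed sample based on the snippet in the question.
SAMPLE = b"""<opml version="1.0">
<body>
<outline title="Subscriptions">
<outline title="Test and Demo">
<outline title="del.icio.us/imao/Learning"
 htmlUrl="http://del.icio.us/imao/Learning" type="rss"
 xmlUrl="http://del.icio.us/rss/imao/Learning"/>
</outline>
<outline title="Unfiled">
<outline title="Boxes and Arrows"
 htmlUrl="http://www.boxesandarrows.com/" type="rss"
 xmlUrl="http://www.boxesandarrows.com/index.xml"/>
</outline>
</outline>
</body>
</opml>"""

handler = BlogrollHandler()
xml.sax.parseString(SAMPLE, handler)
print(render_html(handler.groups))
```

Folders that hold no feeds directly, such as the top-level "Subscriptions" outline, never get a group of their own here, because a group is only created when the first feed is filed under it. To run this against the real export, replace the parseString call with xml.sax.parse('blogs.opml', handler).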
