Chris
I am trying to write a script that will parse my Bloglines OPML export
(see snippet below) and output an HTML blogroll.
I can get at all the actual blog entries with the code below. The
problem is that I would also like to get at the "folder" level ("Test and
Demo" and "Unfiled" below), which has just a title and no other
attributes, so that I can group the entries by "category",
something like this: http://rpc.bloglines.com/blogroll?html=1&id=chrislott
My code only lists the entries themselves...
********************************blogs.opml**
<opml version="1.0">
  <head>
    <title>Bloglines Subscriptions</title>
    <dateCreated>Sun, 4 Apr 2004 20:15:17 GMT</dateCreated>
    <ownerEmail>[email protected]</ownerEmail>
  </head>
  <body>
    <outline title="Subscriptions">
      <outline title="Test and Demo">
        <outline title="del.icio.us/imao/Learning"
          htmlUrl="http://del.icio.us/imao/Learning" type="rss"
          xmlUrl="http://del.icio.us/rss/imao/Learning"/>
        <outline title="Fairbanks, Alaska Weather"
          htmlUrl="http://www.rssweather.com/hw3.php?zipcode=99701" type="rss"
          xmlUrl="http://rssweather.com/rss.php?hwvUT...ountry=us&amp;county=02090&amp;zone=AKZ222&amp;alt=rss20a"/>
      </outline>
      <outline title="Unfiled">
        <outline title="Boxes and Arrows"
          htmlUrl="http://www.boxesandarrows.com/" type="rss"
          xmlUrl="http://www.boxesandarrows.com/index.xml"/>
        <outline title="CBB Plagiarism Project -"
          htmlUrl="http://leeds.bates.edu/cbb/" type="rss"
          xmlUrl="http://leeds.bates.edu/cbb/module.php?mod=node&amp;op=feed"/>
      </outline>
    </outline>
  </body>
</opml>
********************************************my script**
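from xml.sax import make_parser
from xml.sax.handler import ContentHandler

class OPMLHandler(ContentHandler):
    def __init__(self):
        ContentHandler.__init__(self)
        # Assumed meaning: nesting depth of the current <outline>.
        # The original printed self.level without ever setting it.
        self.level = 0

    def startElement(self, name, attrs):
        if name == 'outline':
            self.level += 1
            self.title = attrs.get('title', '')
            self.url = attrs.get('xmlUrl', '')

    def endElement(self, name):
        if name == 'outline':
            # Prints the most recently opened outline's title and URL.
            print(self.level, ':', self.title, '-', self.url)
            self.level -= 1

parser = make_parser()
curHandler = OPMLHandler()
parser.setContentHandler(curHandler)
parser.parse(open('blogs.opml'))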
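One direction that might work, as a minimal sketch rather than a tested solution for the full export: keep a stack of the outline elements that are currently open, and treat any outline without an xmlUrl attribute as a folder. Each feed is then filed under the innermost open folder. The names GroupingOPMLHandler, group_feeds, render_blogroll and SAMPLE are made up for illustration, the "no xmlUrl means folder" rule is an assumption about how Bloglines writes its export, and the h3/ul markup is only a guess at what the blogroll could look like.

```python
from xml.sax import parseString
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import escape, quoteattr

# Cut-down sample based on the snippet above (hypothetical test input).
SAMPLE = """<opml version="1.0">
<body>
<outline title="Subscriptions">
<outline title="Test and Demo">
<outline title="del.icio.us/imao/Learning" type="rss"
 xmlUrl="http://del.icio.us/rss/imao/Learning"/>
</outline>
<outline title="Unfiled">
<outline title="Boxes and Arrows" type="rss"
 xmlUrl="http://www.boxesandarrows.com/index.xml"/>
</outline>
</outline>
</body>
</opml>"""


class GroupingOPMLHandler(ContentHandler):
    """Collect feeds under the folder outline that contains them."""

    def __init__(self):
        super().__init__()
        self.stack = []   # one entry per open <outline>: folder title, or None for a feed
        self.groups = {}  # folder title -> list of (feed title, xmlUrl)

    def startElement(self, name, attrs):
        if name != 'outline':
            return
        title = attrs.get('title', '')
        url = attrs.get('xmlUrl', '')
        if url:
            # Assumption: an outline with an xmlUrl is a feed entry.
            # File it under the innermost open folder, if any.
            folder = next((t for t in reversed(self.stack) if t is not None), '')
            self.groups.setdefault(folder, []).append((title, url))
            self.stack.append(None)
        else:
            # Assumption: an outline without an xmlUrl is a folder.
            self.groups.setdefault(title, [])
            self.stack.append(title)

    def endElement(self, name):
        if name == 'outline':
            self.stack.pop()


def group_feeds(opml_text):
    """Parse OPML text and return {folder title: [(title, xmlUrl), ...]}."""
    handler = GroupingOPMLHandler()
    parseString(opml_text.encode('utf-8'), handler)
    return handler.groups


def render_blogroll(groups):
    """Render the groups as simple HTML, skipping folders with no direct feeds."""
    lines = []
    for folder, feeds in groups.items():
        if not feeds:
            continue
        lines.append('<h3>%s</h3>' % escape(folder))
        lines.append('<ul>')
        for title, url in feeds:
            lines.append('<li><a href=%s>%s</a></li>' % (quoteattr(url), escape(title)))
        lines.append('</ul>')
    return '\n'.join(lines)


if __name__ == '__main__':
    print(render_blogroll(group_feeds(SAMPLE)))
```

Note that two folders with the same title would be merged into one group here, and the links point at xmlUrl as in the script above; htmlUrl could be collected the same way if the blogroll should link to the sites instead of the feeds.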