RSS and namespaces

Pete · Nov 9, 2006

I've been playing with writing my own RSS reader (in Ruby -- yes, I
know there are lots out there, but it's mostly for my own experience
and amusement). I tailored it to read the XML from the BBC feeds,
but it's been happy with others as well (occasionally needing minor
tweaking because of missing elements).

For interest, though, I downloaded a 'Have Your Say' forum RSS feed,
again from the BBC, and initially all I got in the HTML conversion
was a bunch of little titles, all the same... [That similarity was
expected because they're just the title of the thread.]

When I looked in detail at the source, I saw that, first, the 'description'
element was in CDATA form, which my reader wasn't set up for. I suppose
this makes sense, because some of the descriptions included HTML tags.
(Though looking at the 'XML FAQ' I see that the contents of CDATA are
*not* supposed to be inviolate from translation into entities, so I
gather that if I was using say XSLT for the conversion it wouldn't
work right anyway.)
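
For what it's worth, a reader that goes through an XML parser shouldn't need to care which form the feed uses, since CDATA and entity-escaping are two spellings of the same character data. A minimal sketch, assuming REXML (bundled with Ruby) and two made-up <description> samples; `description_text` is a hypothetical helper name, not part of any library:

```ruby
require "rexml/document"

# Two made-up <description> elements carrying the same HTML payload:
# one wrapped in CDATA, the other entity-escaped.
CDATA_XML   = "<description><![CDATA[<b>Hello</b> & welcome]]></description>"
ESCAPED_XML = "<description>&lt;b&gt;Hello&lt;/b&gt; &amp; welcome</description>"

# Hypothetical helper: join the decoded value of every text child.
# REXML::CData is a subclass of REXML::Text, so both forms are picked up.
def description_text(xml)
  root = REXML::Document.new(xml).root
  root.texts.map(&:value).join
end

puts description_text(CDATA_XML)
puts description_text(ESCAPED_XML)
```

Both calls should hand back the same string of HTML markup, so whatever happens next (escaping it, sanitising it, passing it through) only has to be written once.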

However, the other strangeness was that important items like the
author and creationDate were in the 'jf:' namespace ("JiveSoftware").
This meant that I had to specifically add these tags to my reader
to be able to see this data. (The <item>s didn't have any <pubDate>
element at all.)
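
A rough sketch of the kind of fallback this needs: look for the standard element first, and only then for the namespaced one. It assumes REXML; the namespace URI below is a placeholder (the real one has to be copied from the feed's xmlns:jf declaration), and the sample feed, `first_text` and `parse_items` are made up for illustration:

```ruby
require "rexml/document"

# Placeholder namespace URI, NOT the real Jive one. XPath matching is done
# on the URI, so the "jf" prefix here need not match the feed's prefix.
JF_NS = { "jf" => "http://example.com/jive-placeholder" }

# Made-up feed fragment shaped like the one described above.
SAMPLE_FEED = <<~XML
  <rss version="2.0" xmlns:jf="http://example.com/jive-placeholder">
    <channel>
      <item>
        <title>Thread title</title>
        <jf:author>someone</jf:author>
        <jf:creationDate>Thu, 09 Nov 2006 10:00:00 GMT</jf:creationDate>
      </item>
    </channel>
  </rss>
XML

# Return the text of the first path that matches under item, or nil.
def first_text(item, *paths)
  paths.each do |path|
    node = REXML::XPath.first(item, path, JF_NS)
    return node.text if node
  end
  nil
end

# Prefer the plain RSS elements; fall back to the jf: ones when absent.
def parse_items(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "//item").map do |item|
    {
      title:  first_text(item, "title"),
      author: first_text(item, "author", "jf:author"),
      date:   first_text(item, "pubDate", "jf:creationDate"),
    }
  end
end

parse_items(SAMPLE_FEED).each do |entry|
  puts "#{entry[:title]} | #{entry[:author]} | #{entry[:date]}"
end
```

Keeping the fallback list in one place means each new oddball feed costs one extra path rather than a new code branch.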

Isn't this sort of against the intent of RSS? Is this a case of
proprietary selfishness, or is there some good reason for using
a namespace to handle important generic parts of the content?
I see that some other feeds use 'itunes:' or other spaces, but
they seem to be for more specific data and don't really impact
reading by an app that doesn't know about them.

-- Pete --

Andy Dingley · Nov 10, 2006

Pete said:
When I looked in detail at the source, I saw that, first, the 'description'
element was in CDATA form,

There's a famously excellent article on DiveIntoMark.com that talks
about HTML encoding in RSS and the multiplicity of RSS versioning.

Isn't this sort of against the intent of RSS?

Sort of. It's not against RSS to do this (it's just adding a new
element), but it is against RSS to drop the important and useful
element in the default namespace. The fallacy is to think that they're
the same thing, or to produce a feed with <jf:pubDate> in it and think
that by doing this you're also implementing <pubDate>

Then again, there's never any shortage of Stupid in the RSS world 8-(

You might be interested to read some background on Dublin Core, in
particular the old "dumbing down" principle. DC has de-evolved a bit in
recent years but it used to have a very clear and very good strategy
for ad hoc extensions like this. You were allowed (and encouraged) to
add anything you liked, with the proviso that the qualification
mechanism (how you refine a standard general-purpose property into a
more specific one) will be unreliable for clients who don't understand
the new more-specific form. Provided that the new more-specific form
was still "meaningful" (or at least not misleading) in the general
context, then you were encouraged to add it anyway. The core element is
"creator" and refined forms might be "author", "editor" or "translator"
-- all of whom are still validly regardable as creators. Less capable
clients would only see the "dumbed-down" general form, but it's still
useful to them, for as great a value of "useful" as they're capable of
grasping.
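
The "dumbing down" idea can be sketched in a few lines. This is only an illustration of the principle using the creator example above, not anything taken from the Dublin Core documents; the table, the list of understood elements and the `dumb_down` name are all made up:

```ruby
# Illustrative table only: refined property => the core element it refines.
DUMB_DOWN = {
  "author"     => "creator",
  "editor"     => "creator",
  "translator" => "creator",
}.freeze

# The core elements this hypothetical "less capable" client understands.
KNOWN_CORE = %w[creator title date].freeze

# Reduce a record to core elements: keep what is understood, fold known
# refinements into their general form, and silently drop everything else.
def dumb_down(properties)
  properties.each_with_object({}) do |(name, value), out|
    core = KNOWN_CORE.include?(name) ? name : DUMB_DOWN[name]
    next unless core
    (out[core] ||= []) << value
  end
end

record = {
  "title"      => "Example Title",
  "author"     => "A. Writer",
  "translator" => "B. Translator",
  "mood"       => "puzzled",
}
p dumb_down(record)
```

The client loses the distinction between author and translator, but both still show up as creators, which is the "still useful" part.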

Pete · Nov 10, 2006

There's a famously excellent article on DiveIntoMark.com that talks
about HTML encoding in RSS and the multiplicity of RSS versioning.

Thanks for the pointer. It's clear that there's a lot more disarray
in the field than I anticipated!

And more specifically I see in several places that HTML within a
description is supposed to be entity-escaped. No mention of CDATA
anywhere... Eek. (And then there's the danger of arbitrary HTML...)
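
The cautious option for the arbitrary-HTML problem is probably to treat the decoded description as plain text and escape it again on the way out, so any markup is displayed rather than executed. A minimal sketch, assuming Ruby's standard `CGI.escapeHTML`; `render_description` is a hypothetical name for whatever writes the HTML page:

```ruby
require "cgi/escape"  # provides CGI.escapeHTML

# Hypothetical output helper: wrap a feed description in a paragraph,
# escaping it so embedded tags show up as text instead of being run.
def render_description(description)
  "<p>#{CGI.escapeHTML(description.to_s)}</p>"
end

puts render_description("<b>hi</b>")
puts render_description("<script>alert(1)</script>")
```

This throws away legitimate formatting along with the dangerous bits; letting selected tags through would need a real whitelist-based sanitiser, which is beyond this sketch.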

Sort of. It's not against RSS to do this (it's just adding a new
element), but it is against RSS to drop the important and useful
element in the default namespace. The fallacy is to think that they're
the same thing, or to produce a feed with <jf:pubDate> in it and think
that by doing this you're also implementing <pubDate>

Then again, there's never any shortage of Stupid in the RSS world 8-(

Looks like it. I'm sort of glad that I tend to write my own apps for
things like this -- and take a minimalist approach. Ignore everything
I'm not specifically interested in. I may have to keep tweaking every
time I find a new feed, though...

[Dublin Core snipped]

Skimmed some of the stuff on that. As you say, extending, rather than
replacing, seems the obvious way to go.

Cheers,
-- Pete --
