RSS and Normalization theory

The recent spate of criticism against blogs for being bandwidth hogs brings new realism as to where we are headed in the blogosphere. Microsoft stopped delivering the full text of postings in the MSDN blog feeds, citing a bandwidth crunch (MSDN has 964 blogs as of today). Bloggers were already critical of Microsoft trimming the posts in the feed to a couple of hundred characters. Why should you have to click a link in your RSS aggregator just to read the full post?
There are two ways to solve this problem:
Firstly, the MSDN RSS feed is one gigantic feed covering all the recently updated posts. Dave Winer suggested that Microsoft cancel the aggregated feed and simply offer a feed for every blog. (Blogs.msdn.com already offers individual feeds.)
Secondly, extend the RSS specification and propose a normalization scheme for the data carried in RSS. The concept is very similar to the normalization done in databases.
Here’s a hypothetical case of normalization theory applied to RSS:

  1. The index.rdf contains all the elements except <description>
  2. A separate resource exists for the text content of <description>
  3. A new sub-element is introduced in <item> that references the separate resource holding the contents of <description> (see the sketch after this list)

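As a rough sketch of what such a normalized item might look like: the <norm:contentRef> element, its namespace, the lastModified attribute, and the URLs below are invented for illustration and are not part of any RSS or RDF specification. The Python snippet just parses the hypothetical item and pulls out the reference.

    import xml.etree.ElementTree as ET

    # Namespaces: RSS 1.0 plus a made-up namespace for the proposed extension.
    NS = {"rss": "http://purl.org/rss/1.0/",
          "norm": "http://example.org/rss-normalization/1.0/"}

    # What a normalized <item> in index.rdf might look like: everything except
    # the description stays in the feed, and a new sub-element (here called
    # <norm:contentRef>) points at the separate resource holding the text.
    ITEM_XML = """
    <item xmlns="http://purl.org/rss/1.0/"
          xmlns:norm="http://example.org/rss-normalization/1.0/">
      <title>RSS and Normalization theory</title>
      <link>http://example.org/archives/000123.html</link>
      <norm:contentRef href="http://example.org/content/000123.txt"
                       lastModified="2004-04-12T10:30:00Z"/>
    </item>
    """

    item = ET.fromstring(ITEM_XML)
    ref = item.find("norm:contentRef", NS)
    print(ref.get("href"))          # where the description text lives
    print(ref.get("lastModified"))  # lets the aggregator skip unchanged content
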
How would this work for a feed aggregator (on steroids)? The aggregator downloads the index.rdf as usual and renders its content by breaking down each of the elements. Since the actual content (the entries, in the case of blogs) lives in a separate resource, the aggregator downloads that resource as required. On the next refresh, a local cache of the “seen” content means the resource does not need to be downloaded again, unless the index.rdf marks it as modified.
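A minimal sketch of that refresh loop, again assuming the hypothetical <norm:contentRef> element and lastModified attribute from above; the URLs, element names, and cache layout are illustrative only.

    import urllib.request
    import xml.etree.ElementTree as ET

    NS = {"rss": "http://purl.org/rss/1.0/",
          "norm": "http://example.org/rss-normalization/1.0/"}

    # Local cache keyed by content URL: (lastModified stamp, cached text).
    cache = {}

    def refresh(index_url):
        """One refresh cycle of the hypothetical aggregator."""
        with urllib.request.urlopen(index_url) as resp:
            index = ET.fromstring(resp.read())

        for item in index.iter("{http://purl.org/rss/1.0/}item"):
            title = item.findtext("rss:title", default="", namespaces=NS)
            ref = item.find("norm:contentRef", NS)
            if ref is None:
                continue  # item carries no normalized content reference

            url = ref.get("href")
            stamp = ref.get("lastModified")

            # Fetch the content resource only if it is new, or index.rdf says
            # it changed since the last refresh; otherwise reuse the cache.
            if url not in cache or cache[url][0] != stamp:
                with urllib.request.urlopen(url) as resp:
                    cache[url] = (stamp, resp.read().decode("utf-8"))

            description = cache[url][1]
            print(title, "--", description[:60])

The point of the sketch is the cache check: a second call to refresh() re-downloads only index.rdf, and touches a content resource only when its modification stamp in the index has changed.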
