Improving Movable Type's RSS 2.0 feed

People involved in RSS and Atom development spend too much time bickering about the past and too little trying to move things forward in a constructive way. I'm as guilty of this as anyone -- there's only a small group of cranks who understand this stuff well enough to get angry about it, and I like arguing with most of them.

Now that I'm becoming familiar with Movable Type for reasons I can't talk about yet, I can wade into the long-running controversy over the software's default RSS 2.0 template.

Instead of doing that, though, I'm working on an RSS 2.0 template that I would like Six Apart to adopt in its software. I hope I can find some RSS 2.0 advocates and Movable Type webloggers to look it over and provide feedback. When it's done, I'm going to submit it to Six Apart with anyone else who'd like to sign on.

My first draft of the template differs from the existing RSS 2.0 template in several ways:

  • All namespaced elements are dropped.
  • Channel and item dates are expressed with pubDate, not dc:date.
  • The channel element generator replaces admin:generator.
  • Each item's description element uses the full weblog post containing encoded HTML, not an excerpt with the HTML stripped out.
  • Each item's guid element is now a permalink.

I can explain these decisions in comments to this entry, if there's interest. The big one is replacing namespaced elements with core elements from RSS 2.0, which I believe is a better practice for simple syndication.

Here's example output for the old and new templates.

My template makes use of a Movable Type feature that hasn't made it to the User Manual yet: the MTBlogTimezone tag has a no_colon attribute that can be used to drop the colon from the output, making it possible to create timestamps in the RFC 822 format required by RSS 2.0.
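To show the target format outside Movable Type, here is a minimal Python sketch. It is purely illustrative: the entry time and offset are made up, and it uses the Python standard library rather than anything in Movable Type. It contrasts the colon-bearing offset typical of dc:date with the colon-free offset that pubDate needs.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

# Hypothetical entry time: 1 March 2004, 12:30 at UTC-05:00.
entry_time = datetime(2004, 3, 1, 12, 30, 0,
                      tzinfo=timezone(timedelta(hours=-5)))

# W3C/ISO 8601 style, as typically used with dc:date: offset keeps its colon.
iso_stamp = entry_time.isoformat()

# RFC 822 style, as used by pubDate: English names, offset without a colon.
rfc822_stamp = format_datetime(entry_time)

print(iso_stamp)      # 2004-03-01T12:30:00-05:00
print(rfc822_stamp)   # Mon, 01 Mar 2004 12:30:00 -0500
```

The second form is the one the template has to reproduce, which is why the colon has to go.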

Comments

The date thing: yeah, you could add rfc822="1" and just have _hdlr_date override whatever's passed to it, and call the date function with the 822 format and language="en" no matter what format or language is passed in whenever ($args->{rfc822}), but that seems a bit confusing in that your attribute would be magically overriding other attributes: <MTDate format="%Y" language="de" rfc822="1"> would actually produce a full RFC822 date in English. I'd be more inclined to add two new tags, MTDateRfc822 and MTEntryDateRfc822, instead. But, it's Perl, so there's a million ways to do it with no clear way of saying what's best.
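The overriding behavior described above can be sketched in a few lines. This is a hypothetical Python illustration, not Movable Type's Perl: the function name, the constant, and the dictionary of arguments are all assumptions made for the example.

```python
# Assumed strftime-style pattern for the date part of an RFC 822 timestamp.
RFC822_FORMAT = "%a, %d %b %Y %H:%M:%S"


def effective_date_args(args):
    """Return the (format, language) pair that would actually be used."""
    # Mimic Perl truthiness for the flag: undef, "" and "0" all count as unset.
    if args.get("rfc822") not in (None, "", "0", 0):
        # The flag wins: any format or language passed alongside it is ignored.
        return RFC822_FORMAT, "en"
    return args.get("format"), args.get("language")


surprising = effective_date_args({"format": "%Y", "language": "de", "rfc822": "1"})
ordinary = effective_date_args({"format": "%Y", "language": "de"})
print(surprising)
print(ordinary)
```

The first call silently discards both the "%Y" format and the German language setting, which is the confusion a dedicated tag would avoid.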

I'm trying to get a copy of the final template version Six Apart settled on. When I do, I'll put it up along with example output.

People wonder why I don't offer any RSS feeds now.

And here I figured it was an effort to encourage adoption of Atom. Boy do I look foolish!

Excellent, but already done.

You'd think that because it's been almost a year since Brad Choate published his template, Six Apart would make some moves towards full, valid RSS 2.0 feeds using existing elements, but by the looks of it, excerpted, namespaced RSS 2.0 feeds will be a "feature" of Movable Type 3.0 as well.

I don't get it; since their default RSS2 template is, by this point, absolutely perfectly displayed by any newsreader and is perfectly valid RSS, why should they waste their time or the time of RSS reader programmers shaking potential problems out of a new style? There's an argument, analogous to the one often given for freezing RSS2, to be made declaring their default, flawed-but-extremely-well-supported (sound familiar?) template frozen while allowing people who are dissatisfied with it easy access to change it to their needs, as you have.

And really: "reasons [you] can't talk about yet"? That's not nearly coy enough; obviously, that means there's a Movable Type Quickstart Guide coming, no? ;)

Outside of Movable Type, I'm not aware of any RSS 2.0-producing software that uses dc:date instead of pubDate, admin:generator instead of generator, and four separate namespaces.

It's valid -- I'm not questioning that -- but I think it's better practice in RSS 2.0 to use core elements for those purposes. If namespaces are employed, they should add something new to RSS 2.0.

As for freezing the feed, if my new template is valid RSS 2.0 (and it is), newsreaders should have no problems handling it.

That's not the point. It doesn't matter how other RSS2 producers produce RSS2; it matters how well the RSS2 consumers can understand it. And the old one's been around long enough that every reader can understand what dc:date and admin:generator mean just fine.

if my new template is valid RSS 2.0 (and it is), newsreaders should have no problems handling it.

That reminds me of the old quote: "In theory there's no difference between theory and practice. In practice, there is." I can write HTML that's perfectly valid yet displays incorrectly in every major browser without even raising a sweat.

Why should SixApart even bother putting their customers through a potential headache for a change that is essentially invisible? They know, in practice, that the old one works acceptably, and the spec it's based on is frozen, so it should work forever. That's a pretty good argument for leaving well enough alone.

BTW, that was supposed to be Movable Type Kick Start. Sorry about that, and congratulations.

Prior to the last ten months of threats and insults, I'd guess that if you managed to get the attention of someone at 6A (never an easy task) they probably would have said "whatever" to the funky elements, and only been interested in (and opposed to) the full entry part. Have a search through the MT forums, and you'll find plenty of irate users who've discovered that the damn "syndicate this site" link lets people steal their content. As it is, all it takes is a tiny template edit to get a full content feed (and MT users are much more likely to edit their templates than users of some other blogging programs), and with all the noise they've gotten from people already I doubt you'll sell that part.

Defunkifying, at this point? I'd guess you would have to present a list of the aggregators that are able to use item-level pubDate, but aren't able to use item-level dc:date. I'm looking forward to seeing that list (he said, with an evil glint in his eye and some suspicion about what will show up there and what won't).

If you only use MTEntryBody, then you not only don't syndicate the extended entry, you don't even give any indication that it exists. You can either just throw MTEntryMore in there (though I seem to remember some aggregator that had trouble with multiple CDATA sections in a single element), or some people use it for things they don't want to syndicate, and just use [MTEntryIfExtended]more...[/MTEntryIfExtended]

Hmm, I'd forgotten over the last year, but you do need to use language="en" on the MTEntryDate tags, so that furriners will generate a proper RFC822 date with proper English month and weekday names (why do people include the optional weekday name, when they are generating it in code, for code to read?).

Also, you probably want MTEntryPermalink rather than MTEntryLink: they are the same for individual entry archives, but Permalink includes the anchor if the preferred archive type is Monthly or Weekly while Link doesn't. Dunno how it's managed to stay Link in the default template all this time, but I'm off to report it as a bug against the beta.

Well, then you better get on it if you want it in there. A new template only affects new installs and newly created blogs, and will only get distributed with a new version (other than the few people who need to copy it off the website for some reason), and they sound like they are getting fairly close to shipping (in fact, might be a bit late, now, since you would think they would want to run it through the beta testing). Maybe 3.1 will come along fairly quickly after that, but I'm not counting on it.

Mine may be out of date by now, but here's what I'm running on...
http://grumet.net/weblog/archives/2003/08/21/an_even_nonfunkier_rss_2_template.html

Why not the attribute permalink=true in guid?

According to the RSS 2.0 specification for guid, the "isPermaLink is optional, its default value is true." Dave Winer mentioned this as a bandwidth saver.
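From the consuming side, that default can be sketched as follows. This is a minimal Python illustration with made-up example URLs and a hypothetical helper name, not code from any particular aggregator.

```python
import xml.etree.ElementTree as ET


def guid_is_permalink(guid):
    """Treat a guid as a permalink unless it says isPermaLink="false"."""
    return guid.get("isPermaLink", "true") == "true"


explicit = ET.fromstring(
    '<guid isPermaLink="true">http://example.com/archives/000001.html</guid>')
implicit = ET.fromstring(
    '<guid>http://example.com/archives/000001.html</guid>')
opaque = ET.fromstring(
    '<guid isPermaLink="false">tag:example.com,2004:1</guid>')

results = [guid_is_permalink(g) for g in (explicit, implicit, opaque)]
print(results)  # [True, True, False]
```

A reader that follows the spec treats the first two forms identically, so the attribute only costs bytes.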

This is all fine but please, none of the DW "i tried to help back in '04 but they wouldn't listen" crap. If they don't take your olive branch, move on.

I don't think an RSS 2.0 template should remain frozen if it can be improved (and I doubt there's full support for dc:date and admin:generator, but I'll do some digging).

If Six Apart doesn't want an RSS 2.0 template that's been vetted by RSS 2.0 geeks, that's their prerogative. I bet they'd be glad to have the assistance, since they're more comfortable with Atom.

As for the template, thanks for the suggestions so far. I'm going to tackle them one by one as I examine them.

I've added language="en" to the two date elements and rebuilt the sample XML feed. I didn't think about RFC 822 requiring English day-of-week and month names, but I rechecked the RFC and you're right.

I just received an e-mail from Six Apart; they're receptive to this effort. Please keep that in mind as you decide whether to vociferously object to any of my bad ideas.

Please explain the need for formatting instructions within the description tag?

By formatting instructions, do you mean the encode_html="1" attribute? That makes it possible for HTML produced by the MTEntryBody tag to be placed in a description element.
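As a general illustration of what entity-encoding does (a Python sketch using the standard library, not Movable Type's own implementation, with a made-up entry body), the markup is escaped on the way in and recovered intact by an XML parser on the way out:

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical post body containing markup and a bare ampersand.
entry_body = "<p>Tom & Jerry</p>"

# Escape <, > and & so the HTML can sit inside an XML element as text.
encoded = escape(entry_body, quote=False)
description = "<description>" + encoded + "</description>"
print(description)

# An XML parser hands the original HTML string back to the newsreader.
decoded = ET.fromstring(description).text
print(decoded)
```

Without the encoding step, the bare ampersand alone would make the feed ill-formed XML.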

Come to think of it, you better throw in a patch to Context.pm for an MTEntryRfc822Date. Most people publishing in other languages will fail to ever look at the RSS template and not change from the hardcoded en-us language, but those that do will probably also change the language for the date, and wind up with a completely unusable date.

(Must bite tongue. Won't bite tongue. Must bite tongue.)

How could a patch fix that -- with some kind of rfc822="1" attribute?

Why have both guid and link? I can understand if the guid were not a permalink, but in this template the guid is. As examples, you won't find links in either Dave Winer's or my RSS 2.0 feeds.

As has already been indicated, some people prefer not to syndicate full content - either for bandwidth reasons or in order to draw people to their site. But there is an additional reason IMHO to keep description to the original intent of being a synopsis: bloglines has a preference option allowing the user the option of seeing the full item or the summary.

There are also a number of uncontroversial namespaced elements that add value: slash:comments and wfw:commentRss. With these elements in place, a SharpReader or RssBandit user could follow the comments on your weblog, for example. Additionally, there is the trackback:ping element.

I hadn't thought of dropping link. Wouldn't that lose some portion of the audience that doesn't support guid yet?

Anil Dash of Six Apart contacted me and a few other people last night about what we thought of Brad Choate's RSS 2.0 template. Because it's pretty good and Six Apart was eager to move on this, I recommended adopting it with four minor tweaks:

1. Add the channel element docs:

<docs>http://blogs.law.harvard.edu/tech/rss</docs>

2. Drop these two channel elements:

<managingEditor><MTEntries lastn="1"><$MTEntryAuthorEmail$></MTEntries></managingEditor>
<webMaster><MTEntries lastn="1"><$MTEntryAuthorEmail$></MTEntries></webMaster>

In a multiple-authored weblog, there's no assurance the author of the most recent item is the managing editor or webmaster. Also, it puts a plaintext e-mail address in the feed, which some MT users will hate. The elements are optional, so they can be omitted.

3. On this line:

<guid isPermaLink="true"><$MTEntryPermalink encode_xml="1"$></guid>

You can remove the isPermaLink="true" attribute. It's the default for the guid element.

4. For this item element:

<description><$MTEntryExcerpt encode_xml="1"$></description>

I would use encode_html=1 instead of encode_xml=1.

Brad's template uses excerpted entries instead of entry bodies, which appears to be more popular if this discussion is any indication.

Regarding namespaces, I'm not opposed to including them. Are any of them so well-supported that it's a crime to leave them out of an RSS 2.0 feed?

I would also drop the copyright element (you last needed to state "Copyright {date}" in 1989), and the ttl element unless I'm wrong about hardcoded data there spoiling the idea the way it does in other scheduling elements. Is anything using it, and if so do they use it for anything other than "author's opinion about how often feeds should in general be fetched"?

Rather than MTEntryCategory, which is just the primary category, it should probably be

<MTEntryCategories>
<category><MTCategoryLabel></category>
</MTEntryCategories>

to get all the categories for the entry (poor RSS 1.0 with its no repeats rule ;)).

Along with slash:comments (you can't really include wfw:commentRss since it involves also including a separate individual entry template to produce an RSS feed of the comments), wouldn't it be good to include the core comments element?

<MTEntryIfAllowComments>
<comments><$MTEntryPermalink$>#comments</comments>
<slash:comments><$MTEntryCommentCount$></slash:comments>
</MTEntryIfAllowComments>


Might as well include the Creative Commons license, too

<MTBlogIfCCLicense><creativeCommons:license><$MTBlogCCLicenseURL$></creativeCommons:license></MTBlogIfCCLicense>

As to link:guid, I'd say based on

"An item may represent a "story" -- much like a story in a newspaper or magazine; if so its description is a synopsis of the story, and the link points to the full story. An item may also be complete in itself, if so, the description contains the text (entity-encoded HTML is allowed), and the link and title may be omitted."

that if you only have the excerpt in description, then you should have the permalink in link, if you have the full entry in description you may omit link. It's a fuzzy point, and since everything's optional you may do whatever pleases you, but I would say that the intention is that you only omit permalink-in-link when the description includes the full item, not when content:encoded or xhtml:body includes it. A reader which only knows the spec, not any extensions, could then display whatever's available of title/description with link linked from "[more]" when it's present, and skip the link knowing there's no more when there isn't any link element, while still showing guid as a permalink for the item.

So, if you did push through putting the full item in description, you could then omit link, but I know without question that there would be people who would change your MTEntryBody back to MTEntryExcerpt, and they almost certainly wouldn't know that making that change would then mean that they should also restore the link element.
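The spec-only reader described above can be sketched roughly like this. It is a hypothetical Python illustration: the function name, the "[more]" and "permalink:" labels, and the example items are assumptions, not the behavior of any real aggregator.

```python
import xml.etree.ElementTree as ET


def render_item(item_xml):
    """Render one RSS 2.0 item using only core elements."""
    item = ET.fromstring(item_xml)
    lines = []
    for name in ("title", "description"):
        text = item.findtext(name)
        if text:
            lines.append(text)
    link = item.findtext("link")
    if link:
        # A link signals that the description is only a synopsis.
        lines.append("[more] " + link)
    guid = item.find("guid")
    if (guid is not None and guid.text
            and guid.get("isPermaLink", "true") == "true"):
        # The guid still serves as the item's permalink either way.
        lines.append("permalink: " + guid.text)
    return lines


excerpt_item = """<item>
<title>Example</title>
<description>A short synopsis.</description>
<link>http://example.com/archives/000001.html</link>
<guid>http://example.com/archives/000001.html</guid>
</item>"""

full_item = """<item>
<description>The complete text of the post.</description>
<guid>http://example.com/archives/000002.html</guid>
</item>"""

print(render_item(excerpt_item))
print(render_item(full_item))
```

If someone swaps the full body for an excerpt without restoring link, a reader like this one would show the synopsis with no "[more]" at all.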

Ah, recommendED, settlED, past tense. I'm a day late and a suggestion short again. Never mind, I'll just go back to telling people to change the default template.

I'm guessing they settled on one. I haven't heard yet.

Phil: FeedDemon watches for [ttl], and as I understand it, uses it as a minimum update frequency... if I set [ttl]60[/ttl], then FD will never automatically fetch the feed more than once an hour, no matter what the user's settings.

> since Mark Pilgrim will probably quote me on this five years from now

Yes, well, God forbid someone should try to hold you responsible for your own words, Dave.

Again, I reiterate that you praised my Dublin-Core-enhanced RSS 2.0 template in September of 2002. I used the template on my own site, and Radio (and every other tool on Earth) read it just fine. It was so uncontroversial that we put it in the Feed Validator documentation, from which 6A took it and put it into Movable Type. As Phil has correctly pointed out here and elsewhere, that feed has provided no interoperability problems for anyone.

Oddly enough, the only massive outcry I've ever received about the interoperability of my RSS feed was when I followed Dave's advice (and Dave's example), dropped link from my templates, and started using the guid element as a permalink. I was reviled by virtually everyone -- including Dave, Rogers, and several other prominent people in the RSS camp -- for doing exactly what Dave had done to his own feed months earlier.

People wonder why I don't offer any RSS feeds now. It's too much work to keep up with you, Dave; you keep changing the rules!

I'll try to circulate our RSS 2.0 template for comment as soon as we're a little less busy just working on the product itself. Thanks to everyone who's welcomed our attempt to work on this stuff.

http://www.intertwingly.net/blog/1460.html#c1055933609
