I switched to Atom 1.0 on Workbench two months ago, a move that hasn't been as smooth as I'd like because of one popular aggregator that doesn't support the format.
This site is created using Wordzilla, a LAMP-based weblog publishing tool that I've developed over the last year. Writing code to generate Atom feeds in PHP was extremely simple, since most of the code used to generate RSS feeds could be applied to the task.
Atom uses a different format for date-time values than RSS, so I had to write new date-handling code:
// get the most recent entry's publication date (a MySQL datetime value)
$pubdate = $entry[0]['pubdate'];
// convert it to an Atom RFC 3339 date; T and Z are escaped so date() treats
// them as literals (the Z marks the time as UTC, so this assumes a UTC server clock)
$updated = date('Y-m-d\TH:i:s\Z', strtotime($pubdate));
// add it to the feed
$output .= "<updated>{$updated}</updated>\n";
This produces a properly formatted Atom date element:
<updated>2006-05-27T11:03:17Z</updated>
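The conversion above depends on the server's time zone setting, because strtotime() reads a MySQL datetime as local time. As a rough sketch, PHP's DateTime class can make the zone explicit. The function name mysql_to_atom_date is hypothetical, and treating the stored value as UTC by default is an assumption, not something I know about any particular database:

```php
<?php
// Hypothetical helper: convert a MySQL DATETIME string to an RFC 3339 date.
// Assumes the stored value is in $zone (UTC by default); adjust as needed.
function mysql_to_atom_date(string $mysql_datetime, string $zone = 'UTC'): string
{
    $dt = DateTime::createFromFormat(
        'Y-m-d H:i:s',
        $mysql_datetime,
        new DateTimeZone($zone)
    );
    if ($dt === false) {
        throw new InvalidArgumentException("Unparseable datetime: {$mysql_datetime}");
    }
    // Normalize to UTC so the literal "Z" suffix is accurate.
    $dt->setTimezone(new DateTimeZone('UTC'));
    return $dt->format('Y-m-d\TH:i:s\Z');
}

echo mysql_to_atom_date('2006-05-27 11:03:17'), "\n"; // prints 2006-05-27T11:03:17Z
```

Passing a zone name such as 'America/New_York' as the second argument shifts the result to the matching UTC time.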
One thing I haven't been able to do with Really Simple Syndication is indicate an item's author, because RSS requires that an e-mail address be used for this purpose and spammers snarf up e-mail addresses from syndicated feeds.
Atom supports author elements that can be a username instead:
<author>
<name>rcade</name>
</author>
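Generating that element takes only a few lines. This is a minimal sketch, not the Wordzilla code; atom_author is a made-up helper name, and escaping the name is a precaution I'm assuming is wanted in case a username ever contains "&" or "<":

```php
<?php
// Hypothetical helper: build an Atom <author> element from a display name.
function atom_author(string $name): string
{
    // Escape &, <, > and quotes so the name is safe inside XML text.
    $safe = htmlspecialchars($name, ENT_QUOTES | ENT_XML1, 'UTF-8');
    return "<author>\n<name>{$safe}</name>\n</author>\n";
}

echo atom_author('rcade');
```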
The most significant difference between RSS and Atom is the requirement that Atom text elements specify the type of content that they hold, which can be HTML, XHTML or text.
The content type must be identified with a type attribute:
<content type="html"><![CDATA[I own some Home Depot stock ...]]></content>
My Atom feed offers the text of weblog entries as HTML markup:
// get the entry's description (a MySQL text value)
$description = $e['description'];
// add it to the feed
$output .= "<content type=\"html\"><![CDATA[{$description}]]></content>\n";
Putting this text inside a CDATA block removes the need to convert the characters "<", ">", and "&" to XML entities.
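A CDATA section ends at the first "]]>", so an entry that happens to contain that sequence would break the feed. One common workaround is to split the sequence across two CDATA sections. The sketch below shows the idea; cdata_wrap is a hypothetical name, not a built-in PHP function:

```php
<?php
// Hypothetical helper: wrap text in CDATA, splitting any "]]>" it contains
// across two sections so the sequence cannot close the block early.
function cdata_wrap(string $text): string
{
    $safe = str_replace(']]>', ']]]]><![CDATA[>', $text);
    return "<![CDATA[{$safe}]]>";
}

echo cdata_wrap('a ]]> b'), "\n"; // prints <![CDATA[a ]]]]><![CDATA[> b]]>
```

An XML parser reads the two adjacent sections back as the original text, "a ]]> b".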
When an Atom element omits the type attribute, it's assumed to be text.
The following PHP code creates XML-safe text for entry titles:
// get the entry's title
$title = $e['title'];
// convert the title to XML-safe text
$title = utf8_encode(htmlspecialchars($title));
// add it to the feed
$output .= "<title>$title</title>\n";
The last difference I had to deal with is Atom's requirement that each entry have a title. Because I haven't written titles for all entries on Workbench, I wrote a function that can create a title from the first 25 characters of an entry's description, cut back to the last whole word:
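function get_text_excerpt($text, $max_length = 25) {
    // work on plain text, not markup
    $text = strip_tags($text);
    // short enough already: use it whole
    if (strlen($text) <= $max_length) {
        return $text;
    }
    // take the first $max_length characters
    $subtext = substr($text, 0, $max_length);
    // cut at the last space so the title doesn't end mid-word
    $last_space = strrpos($subtext, " ");
    if ($last_space === false) {
        // no space to cut at: fall back to the hard cut
        return $subtext;
    }
    return substr($subtext, 0, $last_space);
}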
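For a limit counted in words rather than characters, a rough variant might look like this. The name get_word_excerpt and its parameters are made up for illustration and are not part of Wordzilla:

```php
<?php
// Hypothetical variant: keep at most $max_words whitespace-separated words.
function get_word_excerpt(string $text, int $max_words = 25): string
{
    $text = trim(strip_tags($text));
    if ($text === '') {
        return '';
    }
    // Split on runs of whitespace, dropping any empty pieces.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    return implode(' ', array_slice($words, 0, $max_words));
}

echo get_word_excerpt('<p>I own some Home Depot stock and more</p>', 4), "\n"; // prints I own some Home
```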
I switched to Atom whole hog, dropping the RSS feed and redirecting requests to the new Atom feed.
I quickly reinstated the RSS feed because I'm getting 4,000 requests a week from subscribers running Radio UserLand, which doesn't support Atom 1.0. Trying to subscribe in the current version, Radio 8.2.1, results in the error message "Can't evaluate the expression because the name 'version' hasn't been defined."
That's the only popular aggregator I've tested that doesn't support Atom 1.0, though I've read that the OPML Editor's River of News also can't handle these feeds.
I'm not going to support both formats on new programming projects just for Radio, because its users ought to nudge UserLand to upgrade Atom support to version 1.0. I'd like to redirect RSS requests to the Atom feed so that all subscribers are seeing the same thing and sites like Bloglines offer one subscription count. But dropping existing RSS support makes little sense.
Atom's content type requirement is a great improvement to syndication, allowing publishers to specify exactly what they're using a feed to carry. The RSS engine built into Microsoft's next version of Windows produces RSS 2.0 feeds that have an extra type attribute in each description, even though it's not defined in the spec.