I became the first subscriber on Bloglines to the feed for the new White House web site, which launched at 12:00 p.m. as Barack Obama became the 44th president of the United States. As a syndication dork, I was interested to discover that the feed employs Atom as its format:
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>White House.gov Blog Feed</title>
<link href="http://www.whitehouse.gov" />
<updated>2009-01-20T12:05:25Z</updated>
<author><name>EOP</name></author>
<id>urn:uuid:ca4baafc-b6bc-45e5-9144-79c5289d9518</id>
<entry>
<title>A National Day of Renewal and Reconciliation</title>
<link href="http://www.whitehouse.gov/blog/a_national_day_of_renewal_and_reconciliation/" />
<id>urn:uuid:ca4baafc-b6bc-45e5-9144-79c5289d9518</id>
<updated>2009-01-20T17:01:00Z</updated>
<summary>President Barack Obama's first proclamation.</summary>
</entry>
</feed>
The Atom feed passes the Feed Validator, but there are four issues that trigger warning messages:
- Your feed appears to be encoded as "utf-8", but your server is reporting "US-ASCII" [help]
- Missing atom:link with rel="self" [help]
- Two entries with the same id: urn:uuid:ca4baafc-b6bc-45e5-9144-79c5289d9518 (4 occurrences) [help]
- Two entries with the same value for atom:updated: 2009-01-20T17:01:00Z [help]
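Two of these warnings are easy to check for locally. Here is a minimal Python sketch, not the Feed Validator's own logic, that uses the standard library's ElementTree to look for repeated entry ids and a missing rel="self" link. The function names and the SAMPLE feed are hypothetical, made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom namespace in ElementTree notation

# Hypothetical two-entry feed that reuses one id, for illustration only.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
<title>Example</title>
<id>urn:uuid:ca4baafc-b6bc-45e5-9144-79c5289d9518</id>
<entry><title>One</title><id>urn:uuid:ca4baafc-b6bc-45e5-9144-79c5289d9518</id></entry>
<entry><title>Two</title><id>urn:uuid:ca4baafc-b6bc-45e5-9144-79c5289d9518</id></entry>
</feed>"""


def duplicate_ids(feed_xml):
    """Return {id: count} for every atom:id shared by more than one entry."""
    root = ET.fromstring(feed_xml)
    counts = Counter()
    for entry in root.iter(ATOM + "entry"):
        entry_id = (entry.findtext(ATOM + "id") or "").strip()
        if entry_id:  # entries without an id are skipped, not counted
            counts[entry_id] += 1
    return {entry_id: n for entry_id, n in counts.items() if n > 1}


def has_self_link(feed_xml):
    """Return True if the feed element has a child link with rel="self"."""
    root = ET.fromstring(feed_xml)
    return any(link.get("rel") == "self" for link in root.findall(ATOM + "link"))


print(duplicate_ids(SAMPLE))
print(has_self_link(SAMPLE))
```

The encoding warning can't be caught this way, because it depends on the HTTP headers the server sends rather than on the XML itself.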
When he has the time, President Obama can address these issues pretty quickly.
First, the XML declaration should reflect the actual encoding transmitted by the White House server:
<?xml version="1.0" encoding="US-ASCII"?>
Alternatively, the server should report UTF-8 as the encoding when it delivers the feed, so that it matches the XML declaration.
Next, the feed should include a link element with a rel="self" attribute that identifies the feed's own URL:
<link rel="self" href="http://www.whitehouse.gov/feed/blog/" />
Finally, steps should be taken so that each feed entry has a unique ID. I recommend using the tag URI format, which for the White House could produce id elements like this:
<id>tag:whitehouse.gov,2009:1</id>
The final number in the id element should be unique to each entry, such as the index number of a blog entry.
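As a rough sketch of how a publishing tool might mint these ids, the Python function below builds a tag URI from a domain, a date and an entry number. The name tag_id and its parameters are hypothetical, and the date check only covers the three date forms a tag URI allows (year, year-month, or year-month-day).

```python
import re


def tag_id(authority, date, specific):
    """Build a tag URI such as tag:whitehouse.gov,2009:1.

    authority: a domain name owned by the publisher on the given date
    date:      YYYY, YYYY-MM or YYYY-MM-DD
    specific:  anything unique per entry, e.g. a blog entry's index number
    """
    if not re.fullmatch(r"\d{4}(-\d{2}(-\d{2})?)?", date):
        raise ValueError("date must be YYYY, YYYY-MM or YYYY-MM-DD")
    return "tag:{},{}:{}".format(authority, date, specific)


# One id per entry, numbered by a hypothetical entry index.
for index in (1, 2, 3):
    print("<id>{}</id>".format(tag_id("whitehouse.gov", "2009", index)))
```

The important part is that an entry's id never changes once it has been published, even if the entry's URL or title does.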
The new White House site promises more feeds to come, but describes them as RSS feeds:
RSS is an acronym for Really Simple Syndication or Rich Site Summary. It is an XML-based method for distributing the latest news and information from a website that can be easily read by a variety of news readers or aggregators.
Either this is an error -- Atom feeds are not in RSS format, of course -- or Obama's effort towards national reconciliation includes the combatants in the RSS/Atom war.