RSS Best Practices Profile Published

The proposal to endorse and publish the RSS Profile has passed 8-1, with RSS Advisory Board members Christopher Finke, James Holderness, Eric Lunt, Randy Charles Morin, Paul Querna, Jake Savin, Jason Shellen and myself voting in favor and Matthew Bookspan voting against.

The RSS Profile makes it easier for feed publishers and programmers to implement RSS 2.0, offering advice on issues that arise as you develop software that employs the format. For 18 months, the board worked with the RSS community on interoperability issues, receiving help from representatives at Bloglines, FeedBurner, Google, Microsoft, Netscape, Six Apart and Yahoo. The profile tackles the most frequently asked questions posed by developers:

  1. How many enclosures can an item contain?
  2. Are relative URLs OK in item descriptions?
  3. Is it OK to use HTML in elements other than an item's description?

For the answers, read the sections on enclosures, item descriptions and character data, respectively.
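Here's a minimal sketch of how a publisher might check a feed against the first question. The sample feed is hypothetical, and the default limit of one enclosure per item is an assumption made for illustration, so read the profile's section on enclosures for the actual recommendation before relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed used only for this sketch.
ENCLOSURE_SAMPLE = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <link>http://example.com/</link>
    <description>Sample channel.</description>
    <item>
      <title>One enclosure</title>
      <enclosure url="http://example.com/a.mp3" length="1000" type="audio/mpeg"/>
    </item>
    <item>
      <title>Two enclosures</title>
      <enclosure url="http://example.com/b.mp3" length="2000" type="audio/mpeg"/>
      <enclosure url="http://example.com/b.ogg" length="1800" type="audio/ogg"/>
    </item>
  </channel>
</rss>"""


def items_over_enclosure_limit(rss_text, max_enclosures=1):
    """Return (title, count) for each item with more than max_enclosures enclosures.

    max_enclosures=1 is an assumed limit, not a value quoted from the profile.
    """
    root = ET.fromstring(rss_text)
    flagged = []
    for item in root.iter("item"):
        count = len(item.findall("enclosure"))
        if count > max_enclosures:
            flagged.append((item.findtext("title", default="(untitled)"), count))
    return flagged


print(items_over_enclosure_limit(ENCLOSURE_SAMPLE))  # [('Two enclosures', 2)]
```

A check like this only counts elements. It says nothing about how a given aggregator treats the extra enclosures.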

Sam Ruby announced this morning that the Feed Validator now tests for conformance to the profile, offering 11 new checks for improving interoperability.

If you'd like to comment on the profile and the new validator checks, post to the RSS-Public mailing list.

As part of the vote, the following sentence has been added to the About this document section of the RSS specification: "The RSS Profile contains a set of recommendations for how to create RSS documents that work best in the wide and diverse audience of client software that supports the format." No other changes were made and all edits to the specification are logged. This revision of the document has the version number 2.0.10.

With the publication of the profile, the board is eager to work with companies and individual developers on the adoption of its recommendations. The board is also looking for people who can translate the document, which has been released under the Creative Commons Attribution-ShareAlike 2.0 license, into other languages.

Not That There's Anything Wrong With That

Today's tip from computer book author and technology expert Rogers Cadenhead (i.e. me): When signing up for a social networking service such as Facebook, pay careful attention to questions involving gender when setting up your personal profile.

I just discovered, in my own Facebook profile, that I'm interested in men:

[Image: Rogers Cadenhead Facebook profile]

Apparently, when I joined Facebook in May I got confused over a question involving gender, thinking it was asking for my own. Because I had no takers, I didn't realize the mistake for months. I have corrected the error, but I'd like to take a moment to thank Steve Kirks, Frank Paynter, Rick Scully and my other 12 Facebook friends, who accepted me for who I am -- even though I'm not.

Web Traffic Counts: Is Compete Any Good?

I occasionally cite web traffic stats from Alexa and Compete, two services that measure traffic across the entire web. It's probably worth pointing out that I have no idea at all whether they're accurate. Compete publishes a monthly count of site visitors based on data from two million U.S. Internet users, so I can compare its numbers directly to the stats I get from Google Analytics. Since the latter is based on actual traffic, it's a reliable metric.

For the Drudge Retort, Google Analytics reports 337,985 U.S. visitors in September and Compete reports 42,815 people for the same period (12.7 percent of Google's total).

On SportsFilter, Google Analytics reports 263,677 visitors and Compete reports 67,906 people (25.8 percent).

On the soon-to-close Cruel.Com, Google Analytics reports 109,334 visitors and Compete 22,028 people (20.1 percent).

I wouldn't expect these numbers to be the same, because every web stats program has a different methodology for counting eyeballs. But if Compete were measuring my U.S. traffic accurately, I'd expect the percentages to be close. They're all over the place.
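The percentages above are just Compete's count divided by the Google Analytics count. A short script reproduces them from the figures quoted in this post and shows how far apart they are. The function and variable names are mine, chosen for illustration.

```python
# Visitor counts quoted above: (Google Analytics visitors, Compete people).
SITES = {
    "Drudge Retort": (337_985, 42_815),
    "SportsFilter": (263_677, 67_906),
    "Cruel.Com": (109_334, 22_028),
}


def compete_share(analytics_visitors, compete_people):
    """Compete's count as a percentage of the Google Analytics count, to one decimal."""
    return round(100 * compete_people / analytics_visitors, 1)


shares = {site: compete_share(ga, compete) for site, (ga, compete) in SITES.items()}
for site, share in shares.items():
    print(f"{site}: {share}%")

# Gap between the best-covered and worst-covered site, in percentage points.
spread = round(max(shares.values()) - min(shares.values()), 1)
print(f"Spread: {spread} points")
```

This prints 12.7, 25.8 and 20.1 percent, a spread of 13.1 points. If Compete undercounted every site by a consistent factor, the spread would be close to zero.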

Techmeme: We Find the Sites You Already Visit

Tim Bray on Techmeme:

I go there and see the same stories about the RIAA and Paul Graham's latest essay and what Apple might be doing, the same stories that are on Slashdot and Ars Technica and boring old ZDnet too. Plus a smattering of whatever Scoble & Winer & Arrington & Calcanis and their posses are up to.

For all of the attention paid to the Techmeme leaderboard this week, the latest popularity contest for self-fascinated, high-traffic techbloggers, there hasn't been much scrutiny of the manner in which Gabe Rivera creates his site. Techmeme, which publishes a software-generated roundup of tech news based on links stories receive from favored sources, isn't entirely automated. Rivera begins with a "seed list" of hand-chosen sites, as he explained to Wired News earlier this year:

I do use lists of sources to help my system determine which sources to monitor. Essentially, I'm telling it to "find more sites like these." These aren't exhaustive lists, or even close to exhaustive, and therefore not "white lists." ...

The full set of sites it monitors is constructed automatically, and even changes in real time based on linking. A small "seeding" list I construct manually is used to help the system build the complete list.

Rivera's good at making it sound like an egalitarian discovery process is going on, but Techmeme isn't exactly Lewis and Clark heading off into uncharted territory with a blank piece of paper and a pencil. The site's About page breathlessly declares, "At this moment, the next big story in technology may reside on a blog you've never heard of or a news site you don't have time to scan." Or it may reside on Engadget and TechCrunch, sites discovered 42 times on Techmeme in the past week alone.

The Techmeme I want is one that identifies the 100 most-linked sources in technology, then pretends they don't exist. Show me the blogosphere that would exist if Robert Scoble finished journalism school, Mike Arrington remained in the domain name trade, Jason Calacanis became a psychologist and I pursued a career in modern dance.

There's an element of democracy in Technorati rankings and Google PageRank, since they're based on incoming links and the rank of those linkers. Techmeme's leaderboard, on the other hand, is determined by the sites Rivera chooses for his seed list and the stories they link. If he published that list, I expect you'd find the same people and publications who end up on the leaderboard. What goes in one end comes out the other. If you put turkey between two slices of bread, you get a turkey sandwich.

Game Over for Checkers Hall of Fame

Roadside America, a site devoted to the cheesiest tourist attractions in the country, reports the sad news that the International Checker Hall of Fame in Petal, Miss., was destroyed by fire 10 days ago:

On September 29, 2007, a still-unexplained fire started in the tower and quickly engulfed the rest of the Hall. Everything: the giant checkerboards, the library, the statue, was destroyed. "What has been lost is one of the finest checkers collections the world has ever known," said Don Deweber, director of the World of Checkers Museum. "It is almost all irreplaceable."

Mississippi TV station WJTV has video of the house. Curiously, the station touts founder Charles Walker's charitable works without mentioning a word about his checkered past -- he's serving a five-year federal prison sentence for money laundering.

In the TV report, somebody named Scott Waldrop credits checkers legend Marion Tinsley with "some of the first algebraic equations." I have no idea what he means -- algebra has been around for 12 centuries -- but Tinsley was a Florida math professor who had an unbelievable mind for the game. Scientists at the University of Alberta recently announced that they had solved checkers after 18 years, which means no human can ever beat their software playing the game. They analyzed thousands of moves played by Tinsley and found only a few mistakes. Most of the time, he played the game as perfectly as their proof.

FeedBurner, Uncertainty and Doubt

On Scripting News today, Dave Winer writes that he can't trust FeedBurner:

If things were different I might use Feedburner. Especially on weekday mornings it's amazing how much traffic one file, my RSS 2.0 feed, gets. So it occurs to me that I could streamline things simply by offloading that file to Google. Now that they own Feedburner, this is something I might do, if they take a pledge not to break aggregators that depend on the format of my feed not changing. If someday my feed were to change format and break just one person reading it, I would consider that a serious support issue. It's not something I want to take a chance with. Some people trust me in this way. Not so many people as Google, but to me, they're very important. Could I delegate that trust to Google? No, not at this time.

He's laboring under the misconception that FeedBurner has taken sides in the RSS/Atom war. That's not the case at all. Since the day it was launched, FeedBurner has been Syndication Switzerland. The service won't break aggregators that require a specific format to work properly. In fact, it will even convert feeds from one format to the other dynamically so that they work.

For example, the AmphetaDesk aggregator, one of the earliest desktop feed readers for Windows, doesn't support Atom 1.0. When I tried to add an Atom 1.0 feed to AmphetaDesk, it failed with the error "AmphetaDesk could not determine the format."

I added this feed to FeedBurner, producing a new feed. The feed, when loaded in a browser, is in Atom 1.0 format. But when I subscribe to it in AmphetaDesk, it works without error. A feature called SmartFeed, which is available in FeedBurner's Optimize menu, detected that the request was coming from AmphetaDesk and converted the feed accordingly.
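To make the idea concrete, here's a rough sketch of serving a different feed format depending on who's asking. This is not FeedBurner's code, and I don't know how SmartFeed is implemented. The client table, the User-Agent strings, the function names and the bare-bones conversion are all hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical table: User-Agent substrings for clients assumed not to
# understand Atom 1.0. A real service would maintain its own list.
RSS_ONLY_CLIENTS = ("AmphetaDesk",)

# Trimmed-down Atom document for illustration (required elements such as
# id and updated are omitted to keep the sketch short).
SAMPLE_ATOM = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Feed</title>
  <link href="http://example.com/"/>
  <entry>
    <title>First post</title>
    <link href="http://example.com/first"/>
  </entry>
</feed>"""


def wants_rss(user_agent):
    """Guess from the User-Agent header whether the client needs RSS 2.0."""
    return any(name.lower() in user_agent.lower() for name in RSS_ONLY_CLIENTS)


def _link_href(element):
    """Return the href of the first atom:link child, or an empty string."""
    link = element.find(ATOM_NS + "link")
    return link.get("href", "") if link is not None else ""


def atom_to_rss(atom_text):
    """Copy only titles and links from an Atom 1.0 feed into an RSS 2.0 document."""
    feed = ET.fromstring(atom_text)
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    title = feed.findtext(ATOM_NS + "title", default="")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = _link_href(feed)
    # RSS 2.0 requires a channel description; reuse the title as a stand-in.
    ET.SubElement(channel, "description").text = title
    for entry in feed.findall(ATOM_NS + "entry"):
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry.findtext(ATOM_NS + "title", default="")
        ET.SubElement(item, "link").text = _link_href(entry)
    return ET.tostring(rss, encoding="unicode")


def serve_feed(atom_text, user_agent):
    """Return (content type, body), converting only for clients that need RSS."""
    if wants_rss(user_agent):
        return "application/rss+xml", atom_to_rss(atom_text)
    return "application/atom+xml", atom_text


content_type, body = serve_feed(SAMPLE_ATOM, "AmphetaDesk/0.93 (hypothetical)")
print(content_type)
```

The point of the design is that the publisher keeps one source feed and one URL, and the conversion happens per request, so neither side has to care which format the other prefers.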

I can understand being cautious about adopting a third-party web service. I tried BlogRush recently and it was a swift and terrible disaster on par with Eddie Murphy's singing career.

But in this case, FeedBurner's painstaking efforts to make feeds work, regardless of which software or feed format is employed by web publishers and readers, are getting lost in the FUD. And that burns me a little.

RSS Best-Practices Profile Up for a Vote

For the last 18 months, the RSS Advisory Board has been drafting a set of best-practice recommendations for RSS. Working with the developers of browsers such as Microsoft Internet Explorer and Mozilla Firefox, aggregators such as Bloglines and Google Reader, and blogging tools including Movable Type, we've looked for areas where questions about the RSS format have led to differences in how software has been implemented to produce and consume RSS feeds.

The result of our work is the RSS Profile. The lead authors are James Holderness, Randy Charles Morin, Geoffrey Sneddon and myself. The profile isn't a set of rules; it's a set of suggestions drafted by programmers and web publishers who've been working with RSS since the format's first release in 1999. Our goal is for the profile to be the second document programmers consult when they're learning how to implement RSS.

The profile tackles some long-standing issues in RSS implementation, including the proper number of enclosures per item, the meaning of the TTL element and the use of HTML markup in character data.

In addition to recommendations for the RSS elements documented in the specification, the profile includes advice for four common namespace elements: atom:link, content:encoded, dc:creator and slash:comments.
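For programmers who haven't dealt with namespaces in RSS before, here's a small sketch of reading those four elements with Python's standard library. The sample feed and its values are made up. The namespace URIs are the ones these modules are commonly declared with.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map for the four namespaces named above.
NAMESPACES = {
    "atom": "http://www.w3.org/2005/Atom",
    "content": "http://purl.org/rss/1.0/modules/content/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "slash": "http://purl.org/rss/1.0/modules/slash/",
}

# Hypothetical feed used only for this sketch.
NAMESPACED_SAMPLE = """<rss version="2.0"
  xmlns:atom="http://www.w3.org/2005/Atom"
  xmlns:content="http://purl.org/rss/1.0/modules/content/"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:slash="http://purl.org/rss/1.0/modules/slash/">
  <channel>
    <title>Example Feed</title>
    <link>http://example.com/</link>
    <description>Sample channel.</description>
    <atom:link href="http://example.com/feed.xml" rel="self" type="application/rss+xml"/>
    <item>
      <title>First post</title>
      <dc:creator>Jane Doe</dc:creator>
      <slash:comments>3</slash:comments>
      <content:encoded><![CDATA[<p>Full <em>HTML</em> body.</p>]]></content:encoded>
    </item>
  </channel>
</rss>"""

channel = ET.fromstring(NAMESPACED_SAMPLE).find("channel")
item = channel.find("item")

# atom:link carries the feed's own URL in its href attribute.
self_link = channel.find("atom:link", NAMESPACES).get("href")
# dc:creator names the author of the item.
creator = item.findtext("dc:creator", namespaces=NAMESPACES)
# slash:comments holds the comment count as text.
comment_count = int(item.findtext("slash:comments", namespaces=NAMESPACES))
# content:encoded holds the full HTML content of the item.
full_html = item.findtext("content:encoded", namespaces=NAMESPACES)

print(self_link, creator, comment_count, full_html, sep="\n")
```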

Morin and I have proposed that the board endorse and publish the RSS Profile, making it available under a Creative Commons Attribution-ShareAlike 2.0 license so that others can build upon and extend it with their own recommendations.

Additionally, we proposed that the following sentence be added to the About this document section of the specification, as a new fifth paragraph: "The RSS Profile contains a set of recommendations for how to create RSS documents that work best in the wide and diverse audience of client software that supports the format."