PubSubHubbub is a Lot Easier Than It Sounds

I've begun digging into PubSubHubbub (PuSH), the real-time RSS update protocol created by Brad Fitzpatrick and Brett Slatkin of Google and Martin Atkins of Six Apart. I was under the impression that it's harder for RSS publishers to use than the RSSCloud Interface, but that isn't the case. The specification is simple and precisely written, adopting conventions like RFC 2119 that make a spec considerably easier to understand, and it communicates using basic HTTP requests.

I wrote the software that runs the Drudge Retort, so I decided to add PuSH support to it this morning to see how it works. PuSH delegates all the work required for update notification to a server called a hub. Google offers a hub at http://pubsubhubbub.appspot.com/ that's free for use by all feed publishers, so I'm relying on it.

First, I added a link element to the Retort's RSS feed that identifies the feed's update hub:

<atom:link rel="hub" href="http://pubsubhubbub.appspot.com" />

Because this element comes from the Atom namespace, I had to make sure it was declared in the feed's top-level RSS element:

<rss version="2.0"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:atom="http://www.w3.org/2005/Atom"
  xmlns:sitemap="http://www.sitemaps.org/schemas/sitemap/0.9">

The xmlns:atom attribute is the Atom declaration. I was already using an Atom element in the feed, so I didn't need to change this.
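To check that the hub link is discoverable, a reader-side lookup could look something like this Python sketch. The function name and the sample feed are made up for illustration and are not the Retort's code; the only details taken from above are the Atom namespace and the rel="hub" link.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def find_hub_urls(feed_xml):
    """Return the href of every atom:link rel="hub" element in a feed."""
    root = ET.fromstring(feed_xml)
    hubs = []
    # iter() walks the whole tree, so the link is found whether it sits
    # inside an RSS <channel> or at the top of an Atom <feed>.
    for link in root.iter("{%s}link" % ATOM_NS):
        if link.get("rel") == "hub" and link.get("href"):
            hubs.append(link.get("href"))
    return hubs


# Hypothetical sample feed, trimmed to the parts that matter here.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Example feed</title>
    <link>http://example.com/</link>
    <description>Hypothetical sample</description>
    <atom:link rel="hub" href="http://pubsubhubbub.appspot.com" />
  </channel>
</rss>"""

if __name__ == "__main__":
    print(find_hub_urls(SAMPLE_FEED))
```

If the xmlns:atom declaration were missing, the XML parser would reject the feed because the atom: prefix would be unbound, which is why the namespace has to be declared on the top-level element.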

When a new story is posted on the Retort, the PuSH hub must be notified that a change has occurred. This is handled by sending a ping to the hub with the URL of one or more feeds that have been updated.

I've written an open source Weblog Pinger library in PHP, so I upgraded it to support these pings. A PuSH ping employs HTTP requests (REST) instead of XML-RPC, the protocol used by Weblogs.Com and similar services. I wrote a new function, ping_rest(), that can send a ping to any PuSH server.
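For anyone who wants to see the shape of the ping without reading PHP, here is a rough Python equivalent. It is a sketch, not a port of ping_rest(): the function names are hypothetical, the feed URL is a placeholder, and it assumes the form fields described in the PubSubHubbub 0.3 spec (hub.mode=publish plus one hub.url per updated feed).

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_publish_ping(hub_url, feed_urls):
    """Build the form-encoded POST that tells a hub which feeds changed."""
    # One hub.mode field, then one hub.url field per updated feed.
    fields = [("hub.mode", "publish")] + [("hub.url", u) for u in feed_urls]
    body = urlencode(fields).encode("ascii")
    return Request(
        hub_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def send_publish_ping(hub_url, feed_urls, timeout=10):
    """Send the ping and return the HTTP status code.

    The 0.3 spec describes 204 No Content as the success response;
    urlopen raises HTTPError for 4xx and 5xx responses.
    """
    request = build_publish_ping(hub_url, feed_urls)
    with urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Placeholder feed URL; substitute the feed that was just updated.
    status = send_publish_ping(
        "http://pubsubhubbub.appspot.com/",
        ["http://example.com/feed.rss"],
    )
    print(status)
```

Splitting the request builder from the sender keeps the network call in one place, so the encoding can be checked without contacting a hub.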

By the time I was done, I'd spent an hour on the code and a few hours testing it out. So now when I post a new item on the Retort, Google's PuSH server sends the full text of the item to all readers that support the protocol. This is faster and simpler than RSSCloud, which tells readers to request the feed again.
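On the receiving end, the "full text" arrives as a feed document in the body of an HTTP POST from the hub to a callback URL the reader registered. The Python sketch below shows one way a reader might pull the new items out of such a body. It assumes the payload is RSS 2.0 containing only the new or changed items; the function name and the sample payload are invented for illustration, and the HTTP server side is left out.

```python
import xml.etree.ElementTree as ET


def items_from_notification(body):
    """Extract title, link and description from each <item> in the payload."""
    root = ET.fromstring(body)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return items


# Hypothetical notification body, as a hub might deliver it.
SAMPLE_NOTIFICATION = b"""<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>http://example.com/</link>
    <description>Hypothetical sample</description>
    <item>
      <title>New story</title>
      <link>http://example.com/story/1</link>
      <description>Full text of the new story.</description>
    </item>
  </channel>
</rss>"""

if __name__ == "__main__":
    for entry in items_from_notification(SAMPLE_NOTIFICATION):
        print(entry["title"], entry["link"])
```

Because the item text is already in the notification, the reader has nothing further to fetch, which is the difference from RSSCloud noted above.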

To give you an idea of how fast PuSH can be, when I posted a new story on the Retort, it showed up 20 seconds later on FeedBurner, one of the first RSS services to support the protocol.

Comments

Rogers,

I've seen the abbreviation "PuSH" used for PubSubHubbub.

I'm looking for a little programming project, and adding PuSH support to my Blosxom blog sounds like a fun one.

Also like the abbreviation of 'PuSH', makes sense.

If anyone has a PuSH reader (that can receive updates), I'd love to check that out. Going to try to build one at some point...

A couple of requests.

1. Work through the rsscloud.org doc with the same approach you used here. An open mind, don't prejudge. I'd be interested to hear how your experience compares.

2. Review the docs and report any problems you see. There's a mail list or use the comments section on the walkthrough. As you've said, now is the time to fix problems. If you find them later it will be more difficult.

3. Also if possible review the Wordpress implementation and my rsscloud.root. I shipped source and it's in a Frontier tool, an environment you know well.

Rogers, nice review -- it's nice to see that it's pretty easy to integrate PuSH into a site. It's also exciting to think how great it is to watch PuSH slowly gather steam and demonstrate how well it scales as millions and millions of users come online.

Rogers,

Let me see if I'm getting the mechanics:
1. you create a post - i.e. pub(lish)
2. your blogging software sends a PuSH ping to the PuSH Hub you use
3. the PuSH Hub reads the feeds at the URLs in the ping?
4. the PuSH Hub sends a complete text update to "all readers that support the protocol".
5. readers of your blog that use one of these reader services see the new post in seconds.

Is that it? It takes the coordination of 4 applications:
1. publishing software (Pub)
2. hub software service (Hub)
3. a feed aggregator (Bub?)
4. a reader of the feed (Sub)

You coded 1 in 1 hour? Aren't you also the guy that re-coded the Weblogs.com ping server in a weekend in PHP (which was a new language for you)? For $5,000? I think you are... so modest. I know you lost the $5,000 but the gratitude of the service's users is priceless.

If I tried to code #1 it might take a bit longer but the point is clear. The specs are unambiguous and as simple as they need to be to get the job done.

The obvious complexity is the Hub... millions of pings per second and users that will expect 20 second access to the reader. That's a serious responsibility. It WILL take a very solid company to keep that service running at scale. Or a really good start-up.

Oops.

Errata: (lowercase) bub

I coded the first part of the PuSH interaction -- the part necessary for a feed publisher to take advantage of the protocol. That's easier than being a hub or an RSS reader, two parts I hope to code soon.

I'd like to see PubSubHubbub address the NAT traversal problem.

Hi,

I've begun learning PubSubHubbub and am having problems receiving and processing notifications from the hub.

Is it possible to get source code (PHP or similar), or a link to an example, that handles updates from the hub and shows them in a web page?

Regards,
Lukas

Hey, I'm having endless issues with pubsubhubbub.appspot.com. When trying to subscribe, I keep getting a 500 Internal Server error, yet if I test with HURL, it is fine! I'm totally baffled. Anyone else had those issues?
