Black and white photo of 12 snowflakes from the Library of Congress taken by Theodor Horydczak to illustrate that all snowflakes are unique
Winter scenes: Snowflakes by Theodor Horydczak

If you publish an RSS feed, you should do a solid for the developers of RSS readers by including a guid in each item. The guid's job is to be a unique identifier that helps software downloading your feed decide whether it has seen that item before. Here's the guid for an item on the arts and technology blog Laughing Squid:

<guid isPermaLink="false">https://laughingsquid.com/?p=914660</guid>

No other item on Laughing Squid will ever have this guid value. It's a URL that loads a blog post with the title Playful Elephant Pretends to Eat Woman's Hat. If you load the guid's URL https://laughingsquid.com/?p=914660, it redirects to the permanent link of the post. Because the guid is not the permanent link, there's an isPermaLink attribute with a value of false.

Most guid values in RSS feeds are the permanent link of the item, as in this example from the world news site Semafor:

<guid>https://www.semafor.com/article/07/07/2023/us-jobs-data-what-experts-make-of-the-new-numbers</guid>

A drawback of using the permalink is that if any part of the URL changes -- such as the title text or the domain name -- the guid changes and RSS readers will think this is a new item to show the feed's subscribers, when it's actually a repeat.

A guid doesn't have to be a URL. It can be any string that the feed publisher has chosen to be unique. Here's the guid from the RSS Advisory Board's feed for this blog post:

<guid isPermaLink="false">tag:rssboard.org,2006:weblog.217</guid>

Our guid follows the TAG URI scheme, a simple way to assure uniqueness by putting these five components together in this order:

  1. The text "tag"
  2. A domain owned by the feed provider
  3. A year the provider owned that domain
  4. A short name for the feed different from any other feed on the site
  5. The internal ID number of the post

There's different punctuation between each component. The year 2006 was when the board began using the domain rssboard.org. No one else used that domain that year, so any feed reader that stores "tag:rssboard.org,2006:weblog.217" as this item's guid should never encounter that value in any other item on any other feed.

To see how RSS 2.0 feeds are using guid, several thousand feeds were downloaded this evening from an RSS aggregator that publicly shares the OPML subscription lists of its users.

Category Total Percentage
Total number of feeds 4,954 --
Feed using guid 4,777 96.4%
Feeds using non-permalinks in guid 752 15.2%

The term guid means "globally unique identifier," but RSS 2.0 does not require global uniqueness in guids. Because the TAG URI scheme does a good job of serving that purpose, Blogger, Flickr, MetaFilter, SoundCloud and The Register are among the sites using it in their feeds.

Comments

No comments yet.

:
:
:

Popular Pages on This Site