Really Simple Syndication Specification
Editor's Note: This proposed specification provides completely new documentation for the Really Simple Syndication format, describing exactly the same elements and attributes delineated in RSS 2.0 Specification (version 2.0.8), published by the RSS Advisory Board on Aug. 12, 2006. Because this document has not been adopted by the board, current implementers should continue to rely on version 2.0.8. Public comments on this proposed specification are welcomed at RSS-Public. If adopted, this will become version 2.0.9 of the specification.
- 1 Introduction
- 2 Conventions
- 3 Data Types
- 4 Elements
- 4.1 rss
- 4.1.1 channel
- 4.1.1.1 description
- 4.1.1.2 link
- 4.1.1.3 title
- 4.1.1.4 category
- 4.1.1.5 cloud
- 4.1.1.6 copyright
- 4.1.1.7 docs
- 4.1.1.8 generator
- 4.1.1.9 image
- 4.1.1.10 language
- 4.1.1.11 lastBuildDate
- 4.1.1.12 managingEditor
- 4.1.1.13 pubDate
- 4.1.1.14 rating
- 4.1.1.15 skipDays
- 4.1.1.15.1 day
- 4.1.1.16 skipHours
- 4.1.1.16.1 hour
- 4.1.1.17 textInput
- 4.1.1.18 ttl
- 4.1.1.19 webMaster
- 4.1.1.20 item
- 5. License
- 6. Credits
- 7. To Do
1. Introduction
Really Simple Syndication (RSS) is an XML-based document format for the syndication of web content so that it can be republished on other sites or downloaded periodically and presented to users.
RSS elements do not belong to a namespace. All elements in an RSS feed that are not defined in a namespace must be described in this specification. None of the restrictions described in this specification apply to elements or attributes defined in a namespace.
RSS feeds can be tested for validity in the Feed Validator.
2. Conventions
In this specification, the key words may, must, must not, optional, recommended, required, shall, shall not, should and should not are to be interpreted as described in RFC 2119.
An RSS document is commonly described as a feed or newsfeed. Software designed to retrieve and present RSS feeds to users is called an aggregator, newsreader or reader. For clarity, this specification uses the terms feed and aggregator.
3. Data Types
The requirements for RSS element and attribute values are defined in the sections devoted to each element, aside from the following restrictions.
3.1 Dates and Times
All date-time values must conform to the RFC 822 Date and Time Specification with the exception that a four-digit year is permitted and recommended over a two-digit year.
<pubDate>Mon, 10 Oct 2005 14:10:00 GMT</pubDate>
<lastBuildDate>Mon, 10 Oct 2005 09:10:00 EST</lastBuildDate>
<pubDate>Mon, 10 Oct 2005 08:10:00 -0600</pubDate>
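The following Python sketch shows one way an aggregator might read such values. It relies on the standard library's email.utils module, which parses RFC 822 style dates; the function name parse_rss_date and the decision to treat a value without zone information as GMT are assumptions of this example, not requirements of the format.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def parse_rss_date(value):
    """Parse an RFC 822 date-time string; return None if it cannot be parsed."""
    try:
        parsed = parsedate_to_datetime(value.strip())
    except (TypeError, ValueError):
        return None
    # A value without usable zone information comes back naive; treating it
    # as GMT is a choice made for this sketch, not a rule of the format.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


examples = [
    "Mon, 10 Oct 2005 14:10:00 GMT",
    "Mon, 10 Oct 2005 09:10:00 EST",
    "Mon, 10 Oct 2005 08:10:00 -0600",
]
parsed_examples = [parse_rss_date(text) for text in examples]
```

The three example values above name the same instant, so all three entries of parsed_examples compare equal once normalized to GMT.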
3.2 URLs
In all link and url elements, the first non-whitespace characters in a URL must begin with a scheme defined by the IANA Registry of URI Schemes such as "ftp://", "http://", "https://", "mailto:" or "news://". These elements must not contain relative URLs.
Because an aggregator may choose which URI schemes to support, publishers of RSS feeds must not assume that all schemes are available.
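As a rough illustration, an aggregator could classify a link or url value before using it. In this Python sketch the SUPPORTED_SCHEMES set and the classify_url function are hypothetical, and the check does not consult the IANA registry; it only distinguishes relative URLs, schemes this particular aggregator has chosen not to support, and everything else.

```python
from urllib.parse import urlsplit

# Hypothetical set of schemes that one particular aggregator supports.
SUPPORTED_SCHEMES = {"http", "https", "ftp"}


def classify_url(value, supported=SUPPORTED_SCHEMES):
    """Return 'relative', 'unsupported' or 'ok' for a link or url value."""
    # Leading and trailing whitespace is ignored before looking for a scheme.
    scheme = urlsplit(value.strip()).scheme.lower()
    if not scheme:
        return "relative"
    if scheme not in supported:
        return "unsupported"
    return "ok"
```

A publisher-side check would typically reject the "relative" case outright, while an aggregator would simply decline to follow an "unsupported" URL.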
An Internationalized Resource Identifier (IRI) provides a means to identify Internet resources using non-ASCII characters that can't be present in URLs. All link and url elements must be valid URLs, so an IRI that contains non-ASCII characters must be converted to a URL using the procedure described in RFC 3987.
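The conversion can be approximated in Python as follows. This is a simplified sketch rather than a complete implementation of the RFC 3987 procedure: the function name iri_to_url is an assumption, user information and IPv6 literals are not handled, and host names are converted with the IDNA codec in the Python standard library.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched when percent-encoding. "%" is kept so that
# sequences that are already escaped are not escaped a second time.
_SAFE = "/:@!$&'()*+,;=-._~%"


def iri_to_url(iri):
    """Convert an IRI to an ASCII-only URL (simplified; see the lead-in)."""
    parts = urlsplit(iri.strip())
    host = parts.hostname or ""
    if host:
        # Non-ASCII host names are converted label by label with IDNA.
        host = host.encode("idna").decode("ascii")
    netloc = host if parts.port is None else "%s:%d" % (host, parts.port)
    return urlunsplit((
        parts.scheme,
        netloc,
        # Non-ASCII characters are encoded as UTF-8 and percent-escaped.
        quote(parts.path, safe=_SAFE),
        quote(parts.query, safe=_SAFE + "?"),
        quote(parts.fragment, safe=_SAFE + "?"),
    ))
```

A URL that is already ASCII-only passes through this function unchanged apart from host name case.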
3.3 Character Data
For all elements defined in this specification that enclose character data, publishers should format the data as plain text with the exception of an item's description element, which must be suitable for presentation as HTML. These elements must not contain child elements.
Although some publishers employ HTML markup in other elements such as an item's title, using plain text in those elements achieves the widest support in aggregators.
There's no limit on the length of character data that can be contained in an RSS element.
4. Elements
An RSS feed consists of the following elements.
The rss element is the top-level element of an RSS feed. A feed that conforms to this specification must contain a version attribute with the value "2.0".
This element is required and must contain a channel element. The rss element must not contain more than one channel.
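These requirements on the top-level element can be checked mechanically. The Python sketch below uses the standard library's xml.etree.ElementTree module; the function name check_rss_root and the wording of the messages are illustrative only.

```python
import xml.etree.ElementTree as ET


def check_rss_root(document):
    """Return a list of problems found in the top-level rss element."""
    root = ET.fromstring(document)
    problems = []
    if root.tag != "rss":
        problems.append("top-level element is not rss")
        return problems
    if root.get("version") != "2.0":
        problems.append('version attribute is not "2.0"')
    channels = root.findall("channel")
    if len(channels) != 1:
        problems.append("expected exactly one channel, found %d" % len(channels))
    return problems
```

An empty list means the document passed these particular checks; it does not mean the feed as a whole is valid.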
The channel element describes the RSS feed, providing such information as its title and description, and contains items that represent discrete updates to the web content represented by the feed.
The channel may contain each of the following optional elements: category, cloud, copyright, docs, generator, image, language, lastBuildDate, managingEditor, pubDate, rating, skipDays, skipHours, textInput, ttl and webMaster.
The preceding elements must not be present more than once in a channel, with the exception of category.
The channel also may contain zero or more item elements, which should appear after all of the other channel elements defined in this specification. Otherwise, the order of elements within the channel is not significant.
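The channel rules above might be checked along the following lines. In this Python sketch the function name check_channel and the split into errors and warnings are assumptions: violations of "must" rules are reported as errors, while the "should" rule about item placement is reported as a warning. The three required elements are those marked "(required)" in the sections that follow.

```python
import xml.etree.ElementTree as ET
from collections import Counter

REQUIRED = ("description", "link", "title")
ONCE_ONLY = (
    "cloud", "copyright", "docs", "generator", "image", "language",
    "lastBuildDate", "managingEditor", "pubDate", "rating", "skipDays",
    "skipHours", "textInput", "ttl", "webMaster",
)
DEFINED = set(REQUIRED) | set(ONCE_ONLY) | {"category"}


def check_channel(channel):
    """Return (errors, warnings) for a channel Element."""
    tags = [child.tag for child in channel]
    counts = Counter(tags)
    errors = ["missing required element: " + name
              for name in REQUIRED if counts[name] == 0]
    errors += [name + " appears more than once"
               for name in ONCE_ONLY if counts[name] > 1]
    warnings = []
    seen_item = False
    for tag in tags:
        if tag == "item":
            seen_item = True
        elif seen_item and tag in DEFINED:
            # Namespaced elements are not in DEFINED and are ignored here.
            warnings.append(tag + " appears after an item")
    return errors, warnings
```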
The description element holds character data that provides a human-readable characterization or summary of the feed (required).
<description>Current headlines from the Dallas Times-Herald newspaper</description>
The link element identifies the URL of the web site associated with the feed (required).
The title element holds character data that provides the name of the feed (required). If the feed corresponds directly to a web site, the name should match the name of the site.
The category element identifies a category or tag to which the feed belongs (optional).
This element may include a domain attribute that identifies the taxonomy in which the category is placed using a slash-delimited string that identifies a hierarchical position in the taxonomy.
A channel may contain more than one category element.
The cloud element indicates that updates to the feed can be monitored using a web service that implements the RssCloud application programming interface (optional).
The element must have five attributes that describe the service:
- The domain attribute identifies the host name or IP address of the web service that monitors updates to the feed.
- The path attribute provides the web service's path.
- The port attribute identifies the web service's TCP port.
- The protocol attribute must contain the value "xml-rpc" if the service employs XML-RPC or "soap" if it employs SOAP.
- The registerProcedure attribute names the remote procedure to call when requesting notification of updates.
<cloud domain="server.example.com" path="/rpc" port="80" protocol="xml-rpc" registerProcedure="cloud.notify" />
In this example, an aggregator could request notification by calling the "cloud.notify" method of the XML-RPC web service at server.example.com on port 80 using the path "/rpc".
This element is an empty element defined by a single tag and its attributes, unless extended by a namespace.
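A minimal Python sketch of how an aggregator might read the cloud element follows. The function name read_cloud is hypothetical, and the "endpoint" value assumes the service is reached over plain HTTP, which the element itself does not state.

```python
import xml.etree.ElementTree as ET

CLOUD_ATTRIBUTES = ("domain", "path", "port", "protocol", "registerProcedure")


def read_cloud(element):
    """Return the cloud attributes as a dict, or raise ValueError."""
    missing = [name for name in CLOUD_ATTRIBUTES if element.get(name) is None]
    if missing:
        raise ValueError("cloud is missing: " + ", ".join(missing))
    if element.get("protocol") not in ("xml-rpc", "soap"):
        raise ValueError("protocol must be xml-rpc or soap")
    info = {name: element.get(name) for name in CLOUD_ATTRIBUTES}
    info["port"] = int(info["port"])
    # Assumption: the web service is reached over plain HTTP.
    info["endpoint"] = "http://%s:%d%s" % (
        info["domain"], info["port"], info["path"])
    return info


cloud = ET.fromstring(
    '<cloud domain="server.example.com" path="/rpc" port="80" '
    'protocol="xml-rpc" registerProcedure="cloud.notify" />')
cloud_info = read_cloud(cloud)
```

For the example element, cloud_info["endpoint"] is "http://server.example.com:80/rpc" and the procedure to call is "cloud.notify".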
The copyright element declares the human-readable copyright statement that applies to the feed (optional).
<copyright>Copyright 2006 Dallas Times-Herald</copyright>
The absence of the copyright element in a feed does not mean that the feed is in the public domain.
The docs element identifies the URL of the RSS specification implemented by the software that created the feed (optional). The permanent URL for the specification you are reading is http://www.rssboard.org/rss-specification.
The generator element credits the software that created the feed (optional).
<generator>Radio UserLand v8.2.1</generator>
The image element supplies a graphical logo for the feed (optional).
<description>Read the Dallas Times-Herald</description>
The image's title element holds character data that provides a human-readable description of the image (required). This should be the same text as the channel's title element and is suitable for use as the alt attribute of the img tag in an HTML rendering.
The image's url element identifies the URL of the image (required). The image must be in the GIF, JPEG or PNG format.
The image's description element holds character data that provides a human-readable characterization of the site linked to the image (optional). The description is suitable for use as the title attribute of the a tag in an HTML rendering.
The image's height element contains the height, in pixels, of the image (optional). The image must be no taller than 400 pixels. If this element is omitted, the image is assumed to be 31 pixels tall.
The image's width element contains the width, in pixels, of the image (optional). The image must be no wider than 144 pixels. If this element is omitted, the image is assumed to be 88 pixels wide.
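The default and maximum dimensions can be applied as in this Python sketch. The function name image_size is illustrative, and raising ValueError for an oversized image is one possible policy; an aggregator could equally choose to scale the image down.

```python
import xml.etree.ElementTree as ET


def image_size(image):
    """Return (width, height) in pixels for an image Element."""
    def read(name, default, maximum):
        text = image.findtext(name)
        # An omitted element falls back to the default size.
        value = default if text is None else int(text.strip())
        if value > maximum:
            raise ValueError(
                "%s %d exceeds the limit of %d" % (name, value, maximum))
        return value

    return read("width", 88, 144), read("height", 31, 400)
```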
The channel's language element identifies the natural language employed in the feed (optional).
The language must be identified using one of the RSS language codes or a language code permitted by the World Wide Web Consortium for use in HTML. The U.S. Library of Congress publishes the current list of ISO 639 language codes adopted by HTML.
The channel's lastBuildDate element indicates the last date and time the content of the feed was updated (optional).
<lastBuildDate>Sun, 29 Jan 2006 17:17:44 GMT</lastBuildDate>
The channel's managingEditor element provides the e-mail address of the person to contact regarding the editorial content of the feed (optional).
<managingEditor>firstname.lastname@example.org (Jim Lehrer)</managingEditor>
The channel's pubDate element indicates the publication date and time of the feed's content (optional).
<pubDate>Sun, 29 Jan 2006 05:00:00 GMT</pubDate>
The channel's rating element supplies an advisory label for the content in a feed, formatted according to the specification for the Platform for Internet Content Selection (PICS) (optional).
<rating>(PICS-1.1 "http://www.rsac.org/ratingsv01.html" l by "email@example.com" on "2006.01.29T10:09-0800" r (n 0 s 0 v 0 l 0))</rating>
The channel's skipDays element identifies days of the week during which the feed is not updated (optional). On these days, the feed should not be requested by an aggregator. This element contains up to seven day elements identifying the days to skip.
The day element identifies a weekday in Greenwich Mean Time (GMT) (required). Seven values are permitted -- "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday" or "Sunday" -- and must not be duplicated.
The channel's skipHours element identifies the hours of the day during which the feed is not updated (optional). During these hours, the feed should not be requested by an aggregator. This element contains individual hour elements identifying the hours to skip.
The hour element identifies an hour of the day in Greenwich Mean Time (GMT) (required). The hour must be expressed as an integer representing the number of hours since 00:00:00 GMT. Values from 0 to 24 are permitted, with either 0 or 24 representing midnight. An hour must not be duplicated.
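An aggregator could combine skipDays and skipHours into a single check before requesting a feed. In this Python sketch the function name should_skip is an assumption, and the caller is expected to pass a timezone-aware datetime; an hour value of 24 is folded into 0 because both represent midnight.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Indexed by datetime.weekday(), where Monday is 0.
DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")


def should_skip(channel, now):
    """Return True if the channel asks aggregators not to fetch at `now`.

    `now` must be a timezone-aware datetime; it is converted to GMT first.
    """
    now = now.astimezone(timezone.utc)
    skip_days = {day.text.strip()
                 for day in channel.findall("skipDays/day") if day.text}
    # 24 % 24 == 0, so both spellings of midnight map to the same hour.
    skip_hours = {int(hour.text) % 24
                  for hour in channel.findall("skipHours/hour") if hour.text}
    return DAY_NAMES[now.weekday()] in skip_days or now.hour in skip_hours
```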
The textInput element defines a form to submit a text query to the feed's publisher over the Common Gateway Interface (CGI) (optional).
<description>Your aggregator supports the textInput element. What software are you using?</description>
The input form's description element holds character data that provides a human-readable label explaining the form's purpose (required).
The input form's link element identifies the URL of the CGI script that handles the query (required).
The input form's name element provides the name of the form component that contains the query (required). The name must begin with a letter and contain only these characters: the letters A to Z in either case, numeric digits, colons (":"), hyphens ("-"), periods (".") and underscores ("_").
The input form's title element labels the button used to submit the query (required).
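The character rule for the name element translates directly into a regular expression. The Python sketch below is one such translation; NAME_PATTERN and is_valid_input_name are names chosen for this example.

```python
import re

# A letter first, then letters, digits, colons, hyphens, periods or
# underscores in any order.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9:._-]*")


def is_valid_input_name(name):
    """Return True if `name` satisfies the textInput name rule."""
    return NAME_PATTERN.fullmatch(name) is not None
```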
The channel's ttl element represents the feed's time to live (ttl): the maximum number of minutes to cache the data before an aggregator requests it again (optional).
By convention, most aggregators check an RSS feed for updates once an hour. Aggregators may check a feed for updates more frequently than its ttl. When an aggregator's cached data for a feed is older than the feed's ttl, the aggregator should request the feed again rather than rely on cached data.
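The caching rule can be expressed as a small comparison. In this Python sketch the function name needs_refresh is hypothetical, and falling back to 60 minutes when a feed has no ttl reflects the hourly convention mentioned above rather than a requirement.

```python
from datetime import datetime, timedelta, timezone

# Fallback used when a feed has no ttl element: the hourly convention,
# which is an assumption of this sketch and not a rule of the format.
DEFAULT_TTL_MINUTES = 60


def needs_refresh(fetched_at, now, ttl_text=None):
    """Return True when cached feed data is older than the feed's ttl."""
    ttl = DEFAULT_TTL_MINUTES if ttl_text is None else int(ttl_text)
    return now - fetched_at > timedelta(minutes=ttl)
```

Because aggregators may check more often than the ttl, a False result permits reliance on the cache but does not forbid an earlier request.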
The channel's webMaster element provides the e-mail address of the person to contact about technical issues regarding the feed (optional).
An item element represents distinct content published over the feed such as a news article, weblog entry or some other form of discrete update. A channel may contain any number of items (or no items at all).
An item may contain the following child elements: author, category, comments, description, enclosure, guid, link, pubDate, source and title. All of these elements are optional but an item must contain either a title or description.
<title>Seventh Heaven! Ryan Hurls Another No Hitter</title>
<description>Texas Rangers pitcher Nolan Ryan hurled the seventh no-hitter of his legendary career on Arlington Appreciation Night, defeating the Toronto Blue Jays 3-0. The 44-year-old struck out 16 batters before a crowd of 33,439.</description>
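The requirement that an item contain either a title or a description is simple to test. This Python sketch uses a hypothetical function name, item_has_required_content, and checks only for the presence of the elements, not for their content.

```python
import xml.etree.ElementTree as ET


def item_has_required_content(item):
    """Return True if the item has a title or a description child."""
    return (item.find("title") is not None
            or item.find("description") is not None)
```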
An item's author element provides the e-mail address of the person who wrote the item (optional). A feed published by an individual should omit this element and use the managingEditor or webMaster channel elements to provide contact information.
<author>firstname.lastname@example.org (Joe Bob Briggs)</author>
An item's category element identifies a category or tag to which the item belongs (optional). The requirements are the same as the channel's category element.
An item may contain more than one category element.
An item's comments element identifies the URL of a web page that contains comments received in response to the item (optional).
An item's description element holds character data that contains the item's full content or a summary of its contents, a decision entirely at the discretion of the publisher. This element is optional if the item contains a title element.
<description>I'm headed for France. I wasn't gonna go this year, but then last week "Valley Girl" came out and I said to myself, Joe Bob, you gotta get out of the country for a while.</description>
The description must be suitable for presentation as HTML. HTML markup must be encoded as character data either by employing the HTML entities &lt; ("<") and &gt; (">") or a CDATA section.
Escaped markup created with character entities:
<description>I'm headed for France. I wasn't gonna go this year, but then last week &lt;a href="http://www.imdb.com/title/tt0086525/"&gt;Valley Girl&lt;/a&gt; came out and I said to myself, Joe Bob, you gotta get out of the country for a while.</description>
Escaped markup created with a CDATA section:
<description><![CDATA[I'm headed for France. I wasn't gonna go this year, but then last week <a href="http://www.imdb.com/title/tt0086525/">Valley Girl</a> came out and I said to myself, Joe Bob, you gotta get out of the country for a while.]]></description>
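Publishing software rarely escapes markup by hand. The Python sketch below shows the entity approach using the standard library's xml.sax.saxutils.escape function; the helper name description_element is an assumption, and building XML by string formatting is done here only to keep the example short.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def description_element(html):
    """Serialize HTML as the character data of a description element."""
    # escape() replaces "&", "<" and ">" with their entities.
    return "<description>%s</description>" % escape(html)


html = ('last week <a href="http://www.imdb.com/title/tt0086525/">'
        'Valley Girl</a> came out')
xml_text = description_element(html)
# Parsing the element again yields the original HTML as character data.
round_trip = ET.fromstring(xml_text).text
```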
The description should not contain relative URLs. When a relative URL is present, an aggregator may attempt to resolve it to a full URL using the channel's link as the base.
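An aggregator that chooses to resolve relative URLs could do so roughly as follows. This Python sketch is deliberately crude: it rewrites only double-quoted href and src attributes with a regular expression, where a real aggregator would more likely use an HTML parser. The function name resolve_relative_urls is hypothetical.

```python
import re
from urllib.parse import urljoin

# Matches double-quoted href and src attribute values only.
_URL_ATTRIBUTE = re.compile(r'\b(href|src)="([^"]*)"')


def resolve_relative_urls(html, channel_link):
    """Rewrite href and src values relative to the channel's link."""
    def replace(match):
        name, value = match.group(1), match.group(2)
        # urljoin leaves values that are already full URLs unchanged.
        return '%s="%s"' % (name, urljoin(channel_link, value))

    return _URL_ATTRIBUTE.sub(replace, html)
```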
An item's enclosure element associates a media object such as an audio or video file with the item (optional). The element must have three attributes:
- The length attribute indicates the size of the file in bytes
- The type attribute identifies the file's MIME media type
- The url attribute identifies the URL of the file
<enclosure length="24986239" type="audio/mpeg" url="http://dallas.example.com/joebob_050689.mp3" />
The enclosure element is an empty element defined by a single tag and its attributes, unless extended by a namespace.
For the widest support among aggregators, publishers should not include more than one enclosure in an item.
Because some popular RSS implementations support multiple enclosures, aggregators should expect to encounter feeds where more than one enclosure is present in an item. Aggregators may use their discretion to handle all of the enclosures or just the first enclosure present within an item.
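Either policy for multiple enclosures is straightforward to implement. In this Python sketch the function name read_enclosures and the first_only flag are assumptions, and enclosures that lack a required attribute or carry a non-numeric length are skipped, which is one possible way of handling malformed input.

```python
import xml.etree.ElementTree as ET


def read_enclosures(item, first_only=False):
    """Return the item's enclosures as a list of dicts."""
    enclosures = []
    for element in item.findall("enclosure"):
        length = element.get("length")
        mime_type = element.get("type")
        url = element.get("url")
        # Skip enclosures that lack an attribute or have a bad length.
        if None in (length, mime_type, url) or not length.isdigit():
            continue
        enclosures.append(
            {"length": int(length), "type": mime_type, "url": url})
    return enclosures[:1] if first_only else enclosures
```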
An item's guid element provides a string that uniquely identifies the item (optional). The guid may include an isPermaLink attribute.
The guid enables an aggregator to detect when an item has been received previously and does not need to be presented to a user again. If the guid's isPermaLink attribute is omitted or has the value "true", the guid must be the permanent URL of the web page associated with the item.
If the guid's isPermaLink attribute has the value "false", the guid may employ any syntax the feed's publisher has devised for ensuring the uniqueness of the string, such as the Tag URI scheme described in RFC 4151.
A publisher should provide a guid with each item. If the guid is a URL, it must conform with the specification's URL requirements.
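The following Python sketch shows how an aggregator might use the guid to avoid presenting an item twice. The function names read_guid and new_items are hypothetical, as is the decision to always pass through items that have no guid.

```python
import xml.etree.ElementTree as ET


def read_guid(item):
    """Return (guid, is_permalink), or (None, False) if there is no guid."""
    element = item.find("guid")
    if element is None or not (element.text or "").strip():
        return None, False
    # An omitted isPermaLink attribute is treated the same as "true".
    is_permalink = element.get("isPermaLink", "true") == "true"
    return element.text.strip(), is_permalink


def new_items(items, seen):
    """Yield items whose guid is not yet in `seen`, recording each guid."""
    for item in items:
        guid, _ = read_guid(item)
        if guid is not None:
            if guid in seen:
                continue
            seen.add(guid)
        # Items without a guid cannot be matched and are always yielded.
        yield item
```

The seen set would normally be stored between requests so that items remain suppressed across refreshes of the feed.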
An item's link element identifies the URL of a web page associated with the item (optional).
An item's pubDate element indicates the publication date and time of the item (optional).
If the publication date occurs in the future, aggregators may ignore the item until the date and time have passed.
<pubDate>Fri, 06 May 1983 09:00:00 CST</pubDate>
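An aggregator that chooses to hold back future-dated items could compare the parsed pubDate with the current time. In this Python sketch the function name is_future_dated is an assumption, the caller supplies a timezone-aware current time, and a pubDate without zone information is treated as GMT.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_future_dated(pub_date_text, now):
    """Return True if the item's pubDate is later than `now`."""
    published = parsedate_to_datetime(pub_date_text.strip())
    if published.tzinfo is None:
        # Assumption of this sketch: no zone information means GMT.
        published = published.replace(tzinfo=timezone.utc)
    return published > now
```

For the example value, 09:00:00 CST corresponds to 15:00:00 GMT, so the item counts as future-dated for any current time earlier than that.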
An item's source element indicates that the item has been republished from another RSS feed (optional). The element must have a url attribute that identifies the URL of the source feed.
The value of the source is the title of the source feed.
<source url="http://la.example.com/rss.xml">Los Angeles Herald-Examiner</source>
<title>Joe Bob Goes to the Drive-In</title>
6. Credits
The RSS format was created by Dan Libby and Dave Winer.
Rogers Cadenhead, James Holderness, Randy Charles Morin, Sam Ruby and Greg Smith contributed to this document.
7. To Do
The following known issues have not been resolved as of this draft of the specification.
- Correct the channel and item category elements -- the value of the element should be a slash-delimited place in a taxonomy, not the value of the domain attribute.
This section will be removed upon the final publication of this document.