RSS aggregators work by polling a web server at regular intervals to fetch the RSS file. While the use of Last-Modified and ETag headers can greatly reduce the overhead of this polling, it may be possible to use a 'publish-subscribe' model to reduce it further.
This thought is inspired by a posting that my RSS Aggregator notified me about...
I've seen it estimated that a conditional HTTP GET on an RSS file takes about 200 bytes of bandwidth. That's not very much at all: even with a thousand clients polling once per hour, the total bandwidth cost in a month will be about 137MB. It's still worth looking at alternatives, though, to see whether there is a more efficient way of being notified when a weblog is updated.
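As a rough check on that figure, here is a minimal sketch of the arithmetic. It assumes a 30-day month and takes the 200-byte estimate at face value; the function and constant names are my own, purely for illustration. The result is 144,000,000 bytes, which is about 137MB if a megabyte is counted as 2^20 bytes.

```python
BYTES_PER_POLL = 200   # estimated cost of one conditional GET (figure from the text)
CLIENTS = 1000         # a thousand clients
POLLS_PER_DAY = 24     # polling once per hour
DAYS_PER_MONTH = 30    # assumption: a 30-day month


def monthly_polling_bytes(clients=CLIENTS, polls_per_day=POLLS_PER_DAY,
                          days=DAYS_PER_MONTH, bytes_per_poll=BYTES_PER_POLL):
    """Total bytes spent on polling in one month."""
    return clients * polls_per_day * days * bytes_per_poll


total = monthly_polling_bytes()
print(total)                    # 144000000
print(round(total / 2**20, 1))  # 137.3 (megabytes of 2**20 bytes)
```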
This thought is a proposed specification for a very simple web service that would allow clients to use a publish-subscribe model for receiving notification of changes.
This is a web service protocol based on the REST architecture. It's a server-to-server protocol for distributing notification of a change in a resource around the network.
A subscriber server sends HTTP POST messages to a publisher server, which adds the subscriber to a list of interested parties. When a publisher wishes to indicate that a resource has changed, it sends an HTTP POST to each subscribed server.
The subscriber server is intended to provide a way of cascading the notification of changes out to a client base. There are several ways this could be done:
The subscriber server could also provide value-added services such as routing some notifications to certain devices (e.g. football updates go to SMS).
It may be necessary to add a server callback mechanism à la Jabber to reduce the possibility of abuse. It's intended that extra information be added to the <resourceChange> elements by placing new elements into a different namespace. This could be used to eliminate the need for a resource to be retrieved once it has changed; instead, the message itself would contain the information.
To notify a subscriber of a change, the publisher POSTs the following XML to the subscriber's notification URL:
<notification version="1.0" xmlns="http://www.owlfish.com/thoughts/cnws-2003-04-08.html">
  <resourceChange>
    <resourceID>http://news.bbc.co.uk/</resourceID>
    <changeTime>Mon, 07 Apr 2003 22:59:33 BST</changeTime>
    <title>Title of the change, e.g. new post title</title>
    <description>A description of the change, e.g. excerpt from a post.</description>
  </resourceChange>
</notification>
The response should be an XML document:
<notification version="1.0" xmlns="http://www.owlfish.com/thoughts/cnws-2003-04-08.html">
  <result>
    <status>true</status>
  </result>
</notification>
If the status in the response is false, the publisher removes this server from the list. If an HTTP error occurs, the server is likewise removed from the list. If the host cannot be contacted, an ageing process begins: retries should be performed over a period of time until a configurable limit is reached, at which point the host is removed from the list. The publisher should keep retrying for at least 24 hours, with at least one retry at the end of that period.
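The handling of delivery outcomes described above could be sketched roughly as follows. This is only an illustration under my own assumptions: the outcome labels ('true', 'false', 'http_error', 'unreachable'), the Subscriber class and the handle_delivery function are hypothetical names, and the retry window defaults to the 24-hour minimum.

```python
from dataclasses import dataclass
from typing import Optional

MIN_RETRY_WINDOW = 24 * 60 * 60  # seconds; the 24-hour minimum from the text


@dataclass
class Subscriber:
    url: str
    first_failure: Optional[float] = None  # time the host first became unreachable


def handle_delivery(sub, outcome, now, retry_window=MIN_RETRY_WINDOW):
    """Decide what to do with a subscriber after one delivery attempt.

    outcome is one of 'true', 'false', 'http_error' or 'unreachable'
    (hypothetical labels). Returns 'keep', 'retry' or 'remove'.
    """
    if outcome == "true":
        sub.first_failure = None      # a successful delivery resets the ageing
        return "keep"
    if outcome in ("false", "http_error"):
        return "remove"               # refused or failed: drop immediately
    # Host unreachable: start the ageing process, or continue it.
    if sub.first_failure is None:
        sub.first_failure = now
    if now - sub.first_failure >= retry_window:
        return "remove"               # the retry at the end of the window also failed
    return "retry"
```

A scheduler would call handle_delivery after each attempt and re-queue any subscriber for which it returns 'retry'; how often to retry within the window is left open here, as long as one attempt happens at the end of it.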
The title and description are optional. If either is present, it must be markup-free plain text in the same encoding as the rest of the XML message (e.g. UTF-8). The title can be up to 255 characters in length; the description can be of any length.
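To make the message format concrete, here is a minimal sketch of how a publisher might build the notification using Python's standard xml.etree.ElementTree module. The function name build_notification is my own invention, and the sketch only enforces the 255-character title limit; it does not check that the text is free of markup (ElementTree simply escapes any special characters).

```python
import xml.etree.ElementTree as ET

NS = "http://www.owlfish.com/thoughts/cnws-2003-04-08.html"


def build_notification(resource_id, change_time, title=None, description=None):
    """Return the notification message as UTF-8 encoded bytes."""
    if title is not None and len(title) > 255:
        raise ValueError("title is limited to 255 characters")
    ET.register_namespace("", NS)  # serialise NS as the default namespace
    root = ET.Element(f"{{{NS}}}notification", {"version": "1.0"})
    change = ET.SubElement(root, f"{{{NS}}}resourceChange")
    ET.SubElement(change, f"{{{NS}}}resourceID").text = resource_id
    ET.SubElement(change, f"{{{NS}}}changeTime").text = change_time
    # title and description are optional, so only emit them when given
    if title is not None:
        ET.SubElement(change, f"{{{NS}}}title").text = title
    if description is not None:
        ET.SubElement(change, f"{{{NS}}}description").text = description
    return ET.tostring(root, encoding="utf-8")


message = build_notification("http://news.bbc.co.uk/",
                             "Mon, 07 Apr 2003 22:59:33 BST",
                             title="New post title")
```

The resulting bytes would be sent as the body of the HTTP POST to each subscriber's notification URL.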
To subscribe to a resource, the subscriber POSTs the following XML to the publisher's URL:
<notification version="1.0" xmlns="http://www.owlfish.com/thoughts/cnws-2003-04-08.html">
  <subscribe>
    <resourceID>http://news.bbc.co.uk/</resourceID>
    <notificationURL>http://test.server.com:8888/</notificationURL>
  </subscribe>
</notification>
The response should be an XML document:
<notification version="1.0" xmlns="http://www.owlfish.com/thoughts/cnws-2003-04-08.html">
  <result>
    <status>true</status>
    <title>Title of the resource, e.g. weblog title</title>
    <description>Description of the resource, e.g. weblog description</description>
  </result>
</notification>
If the response is an error (an HTTP error or a false status), then the subscription has been refused by the publisher. The title and description are optional. If either is present, it must be markup-free plain text in the same encoding as the rest of the XML message (e.g. UTF-8). The title can be up to 255 characters in length; the description can be of any length.
If the subscriber is already on the list of interested parties for this resource, then the status should be true.
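On the publisher side, the list of interested parties could be kept in something as simple as the following sketch. The SubscriptionList class and its method names are hypothetical; the point is only that using a set makes a repeated subscription harmless, so the status can be true both times. This sketch never refuses a subscription, which a real publisher might want to do.

```python
from collections import defaultdict


class SubscriptionList:
    """In-memory list of interested parties, keyed by resource ID."""

    def __init__(self):
        self._by_resource = defaultdict(set)

    def subscribe(self, resource_id, notification_url):
        """Record the subscriber and return the value for <status>."""
        # set.add() does nothing if the URL is already present, so a
        # repeated subscription is accepted without creating a duplicate.
        self._by_resource[resource_id].add(notification_url)
        return True

    def subscribers(self, resource_id):
        """Notification URLs currently subscribed to a resource."""
        return sorted(self._by_resource.get(resource_id, ()))


subs = SubscriptionList()
subs.subscribe("http://news.bbc.co.uk/", "http://test.server.com:8888/")
subs.subscribe("http://news.bbc.co.uk/", "http://test.server.com:8888/")
print(subs.subscribers("http://news.bbc.co.uk/"))
# ['http://test.server.com:8888/']
```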
To unsubscribe from a resource notification list, no action needs to be taken. When the next notification is received for a resource that is no longer of interest, simply return 'false' as the status.
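A subscriber's handling of an incoming notification, including this implicit unsubscribe, might look something like the sketch below. The function name notification_response and the idea of keeping the resources of interest in a Python set are my own assumptions for illustration.

```python
import xml.etree.ElementTree as ET

NS = "http://www.owlfish.com/thoughts/cnws-2003-04-08.html"


def notification_response(request_xml, interested):
    """Build the response body for one incoming notification.

    interested is the set of resource IDs the subscriber still wants.
    """
    root = ET.fromstring(request_xml)
    resource_id = root.findtext(f"{{{NS}}}resourceChange/{{{NS}}}resourceID")
    # Returning 'false' tells the publisher to drop this subscription.
    status = "true" if resource_id in interested else "false"
    return (f'<notification version="1.0" xmlns="{NS}">'
            f"<result><status>{status}</status></result></notification>")
```

Removing a resource ID from the set is then all that is needed to unsubscribe: the next notification for it is answered with 'false'.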
A server should normally only need to confirm (by re-subscribing) if there has been a service outage. When the service comes back on-line it should confirm all subscriptions if either:
If a publisher has not issued a notification for a given resource within 30 days, the subscriber should try to re-confirm the subscription.
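The 30-day rule could be checked with something like this sketch, which assumes the subscriber records the time of the last notification for each resource. The names are hypothetical, and I have read "within 30 days" as meaning that strictly more than 30 days have passed.

```python
from datetime import datetime, timedelta

RECONFIRM_AFTER = timedelta(days=30)


def due_for_reconfirmation(last_notified, now):
    """Resource IDs whose last notification is more than 30 days old.

    last_notified maps each resource ID to the datetime of the most
    recent notification received for it.
    """
    return sorted(resource for resource, seen in last_notified.items()
                  if now - seen > RECONFIRM_AFTER)


now = datetime(2003, 5, 8)
last_notified = {
    "http://news.bbc.co.uk/": datetime(2003, 4, 7),      # 31 days ago
    "http://www.owlfish.com/": datetime(2003, 4, 20),    # 18 days ago
}
print(due_for_reconfirmation(last_notified, now))
# ['http://news.bbc.co.uk/']
```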
Is it worth pursuing this any further? Would it be useful, would it help? Should I write a reference implementation and try and test it out? Feedback to the address at the bottom of the page please!
Copyright 2021 Colin Stewart. Email: colin at owlfish.com