June 11, 2005

WhatWG and the event-source design

The WhatWG is a set of individuals and representatives of 'browser manufacturers' that are defining the evolution of HTML markup to help Web application developers. They call themselves 'unofficial', but most people realize that rough consensus and working code can create more momentum than 'official' channels.

They have recently described an extension to HTML that allows a Web browser to receive a stream of events from a server - the event-source tag. This is a wonderful concept and very similar to work done by KnowNow, mod-pubsub and others over the past few years. From my experience with these other approaches, I know that truly amazing things will happen when this becomes commonplace and trivial to use.

I had looked at the WhatWG when it first started, but soon stopped following their development. From reading the current draft of their description of this HTML extension, I can foresee several technical issues and now I wish I had stuck with monitoring their development in order to contribute to this section. Hopefully it's not too late.

There are several technical issues and design approaches that are interesting, but I can't find much background or discussion of these alternatives, so I may be covering old ground with this post.

The different areas I am interested in are:
- connecting elements, event streams and event handlers
- format and definition of event streams

1) connecting elements, event streams and event handlers
From the section "9.1.1. The event-source element" the approach is to introduce a new element. I'd like to consider whether more than one event-source element is allowed, and also consider whether simply introducing new attributes on existing elements is easier and feasible.
If a document is allowed to have more than one event-source element then the client app will have to deal with either multiple connections or combining multiple event-sources on one connection while delivering events to the corresponding event-source element event handler. If multiple connections are supported, then the client application could saturate the capability of the client machine and could even be considered a 'poorly behaved' client on the shared network.

An alternative to a new element would be to add new attributes, for example:
<p event-src="/my/stocks/amzn/" onMessage="handleQuotes()" />

I suggest using onMessage rather than onEvent to distinguish between network based messages and application based events (e.g. connection-opened, connection-closed-by-client, connection-closed-by-server, etc). I realize that some people feel web developers would want these to look identical, but many years of experience across the software industry have shown that they are simply different beasts and making that explicit actually helps the developer. The onEvent handler could be used for connection events on the event-source, separate from the actual messages within that stream.
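
As a rough illustration of that split, here is a minimal sketch in Python rather than browser script. The class name EventSourceBinding and its methods are hypothetical and not part of the WhatWG draft; the only point is that messages from the stream and connection events go to two different callbacks.

```python
class EventSourceBinding:
    """Hypothetical client-side binding for one event-src attribute."""

    def __init__(self, src, on_message=None, on_event=None):
        self.src = src
        self.on_message = on_message  # receives each message read from the stream
        self.on_event = on_event      # receives connection lifecycle event names

    def connection_changed(self, name):
        """Report a lifecycle event such as 'connection-opened'."""
        if self.on_event is not None:
            self.on_event(name)

    def message_received(self, payload):
        """Deliver one message that arrived over the network."""
        if self.on_message is not None:
            self.on_message(payload)
```

A page that only cares about quotes would pass on_message alone, while a page that wants to show a 'disconnected' indicator would also pass on_event.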


2) format and definition of event streams
Section "9.1.3 Processing model" defines how connections should be handled, and it's great to see this detail, especially around failure modes, closed connections, etc.

This section indicates that the client should re-open connections after a small delay if they were closed after a successful response. This section should consider persistent connections (i.e. keep-alive) and distinguish reply-level response codes from connection-level operations. For example, rather than saying "HTTP 200 OK responses with the right MIME type, however, should, when closed, be reopened after a small delay.", it may be better to say something like "The retrieval of an event-source that completes successfully (it has the correct content-type and an HTTP 200 OK response status code) should be tried again after a short delay." In addition, the client should continue to obey the appropriate cache-control response headers - this allows the server to dynamically influence the interval at which the client retrieves future events (beyond a static value placed in other attributes on the event-source element). This would be useful in the HTTP 204 No Content situation described in this section as well.
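
A minimal sketch of that retry rule, under stated assumptions: the function name retry_delay, the default and minimum delays, and the choice to read only the max-age directive are all mine, not the draft's. It also assumes that a 204 response is retried like a successful 200, and that any other status is simply not retried.

```python
import re

DEFAULT_RETRY_SECONDS = 5.0  # assumed client default, not taken from the draft
MIN_RETRY_SECONDS = 1.0      # assumed floor so max-age=0 cannot cause a tight loop
EVENT_STREAM_TYPE = "application/x-dom-event-stream"


def retry_delay(status, headers, default=DEFAULT_RETRY_SECONDS):
    """Return seconds to wait before retrieving again, or None to stop.

    `headers` is a dict mapping lower-cased response header names to values.
    """
    content_type = headers.get("content-type", "").split(";")[0].strip().lower()
    if status == 200 and content_type != EVENT_STREAM_TYPE:
        return None  # successful reply, but not an event stream
    if status not in (200, 204):
        return None  # this sketch does not retry other statuses
    # Let the server influence the interval through Cache-Control: max-age.
    match = re.search(r"max-age\s*=\s*(\d+)", headers.get("cache-control", ""), re.I)
    if match:
        return max(MIN_RETRY_SECONDS, float(match.group(1)))
    return default
```

The decision depends only on the reply (status, content-type, cache-control), not on whether the underlying connection was kept alive or closed, which is the distinction argued for above.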


Section "9.1.4. The event stream format" defines a new MIME type and new syntax for the browser to process. This section also has the most detail of the interaction between events and the application logic. I don't want to sound too negative, but this is the section that needs the most help. Several years ago, when I was at KnowNow, I spent a lot of time implementing browser based libraries and applications in JavaScript to do almost exactly what this definition of event-source describes. Since then, I've spent a lot of engineering time designing and building servers for large scale subscriptions and notifications at various companies. Doing that work taught me a lot about the balance between flexibility, ease of coding and efficiency.

There are three aspects of this event stream that I'm concerned about - the meaning of the event stream resource itself, the framing of individual messages within that event stream and how to handle re-processing the event stream in different situations (errors, page refresh by the user, etc).

The proposal I have is to consider the src= attribute on the event-source element as referencing a 'collection of messages'. As to the framing of individual messages, when retrieving this collection of messages via HTTP, I suggest using the multipart/mixed or multipart/digest MIME type. Individual parts can then have their own content-type and developers can decide which suits their needs - a simple name/value pair approach like form data (not my favorite), javascript object definitions like JSON, etc. See RFC 1341 for details.

I realize that one use-case or scenario is for mobile devices, where message size is a concern. I think it may be possible to follow the pattern of multipart/mixed but create a terse syntax that keeps the same capabilities as multipart/mixed with respect to compression (transfer-encoding, gzip, etc), formats (content-type) and localization (character-encoding).

For each message, the WhatWG event-source definition introduces specific names used for controlling routing to event handlers but it seems trivially easy to define an approach that isn't specific to this new syntax. Specifically, the Event and Target names could be replaced.

My proposal is to define each message in the event stream to be similar to a request, except that no possibility of a response from the client exists. This provides for each message to have a URI that indicates its target and a method that indicates the event type (post/put/delete).

For example, rather than an HTTP response of:

200 OK
Content-type: application/x-dom-event-stream
Content-length: NNNN
\n
Event: stock change\n
data: YHOO\n
data: -2\n
data: 10\n


Use a more generic event stream like

200 OK
Content-type: multipart/mixed; boundary=msg_boundary
\n
\n
--msg_boundary
POST /event-handler/stocks HTTP/1.0
Content-type: text/javascript-object
Content-length: NNNN
\n
{symbol: "YHOO", delta: -2, value: 10}
--msg_boundary
PUT /event-handler/stocks/YHOO HTTP/1.0
Content-type: text/javascript-object
Content-length: NNNN
\n
{delta: -2, value: 10}
--msg_boundary


There may be 'issues' with adding the request-line in a multipart/mixed response, but defining a 'multipart/message' type might work for that. And if people don't like HTTP/1.0 in the request-line, we could define a 'simple notification protocol' as SNP/1.0.
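
To make the framing concrete, here is a rough sketch of how a client might split such a stream into request-like messages. It is written in Python for brevity; the names Message and parse_messages are hypothetical. It assumes each part is a request-line, then headers, then a blank line, then the body, and it splits naively on the boundary string, so it is not a full MIME parser.

```python
from typing import NamedTuple


class Message(NamedTuple):
    method: str    # event type, e.g. POST / PUT / DELETE
    uri: str       # target that the message should be routed to
    headers: dict  # lower-cased header names mapped to values
    body: str


def parse_messages(stream, boundary):
    """Split a multipart-style body into request-like messages."""
    messages = []
    for part in stream.replace("\r\n", "\n").split("--" + boundary):
        part = part.strip("\n")
        if not part or part == "--":  # preamble or closing delimiter
            continue
        head, _, body = part.partition("\n\n")
        lines = head.split("\n")
        method, uri, _version = lines[0].split(" ", 2)
        headers = {}
        for line in lines[1:]:
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        messages.append(Message(method, uri, headers, body))
    return messages
```

Each Message can then be routed on its method and uri, for example sending every PUT under /event-handler/stocks/ to the same handler, without the stream format knowing anything about DOM events.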

The last thing I'll mention has to do with re-processing the event stream. Re-processing an HTML page from start to end generally works. However, re-processing an event stream from start to end is generally not going to work well. The situations to consider are when the user manually refreshes the page and when the retrieval of the event stream completes successfully. When the user manually refreshes the page, the src= attribute will be the same as when the page was previously retrieved, but should the events be the same as previously retrieved, or should they be messages from that moment forward? And when the retrieval of messages succeeds and the client waits a few moments before retrieving the next set from the very same URI, should that next set be the same messages again?

This is a tricky area and I'm sure there are several ways to approach this. I'll write more later...

2 comments:

Unknown said...

Hey Mike,

I'm not sure if you realized that Ian Hickson responded on the whatwg mailing list in February: http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2007-February/009673.html

You may want to follow up. :)

Mike Dierken said...

Hi - yes, I know Ian responded on a list and I responded to that, but I haven't finished the thread... I got busy with my new startup company and haven't been able to spend any time on this area. (Other than build a simple messaging extension to Firefox for fun)