June 30, 2005

Government Open Code Collaborative

Wow - this is very cool. Sometimes the .gov can do something good for the people. Rhode Island GovTracker Services

"As a step towards resolving this dissonance, the Rhode Island Office of the Secretary of the State has released GovTracker Services to provide RESTful access to public information. We hope that this set of web services is adopted by the developer community and is a step towards an era of community-developed applications that make it as easy for citizens to interact with their government as it is for them to interact with the rest of the networked world."

June 28, 2005

Things fall apart, the centre cannot hold

Oh great, this just really cheers me up.

From Mark Cuban's Blog Maverick
Is this the start of a “Sarbanes Oxley” type environment for technology companies ? Will companies have to save and document everything they do in the marketing and promoting of their technologies ? Will they, or rather, should they video all presentations and record all phone calls ?

How else can we know that we are protected against unwarranted law suits that are used as competitive weapons to slow new technologies ?

I dont know how it will all turn out. Its probably not as bad as our worst nightmares, but there is the risk that it just might be.

This is the third freedom-infringing red flag I've seen just today. First it was the seizing of private property, next it was the criminalizing of some software development, and now it's the looming big chill of bureaucracy. Time for some Yeats:

Turning and turning in the widening gyre
The falcon cannot hear the falconer;
Things fall apart; the centre cannot hold;
Mere anarchy is loosed upon the world,
The blood-dimmed tide is loosed, and everywhere
The ceremony of innocence is drowned;
The best lack all conviction, while the worst
Are full of passionate intensity.
Surely some revelation is at hand;
Surely the Second Coming is at hand.
The Second Coming! Hardly are those words out
When a vast image out of "Spiritus Mundi"
Troubles my sight: somewhere in sands of the desert
A shape with lion body and the head of a man,
A gaze blank and pitiless as the sun,
Is moving its slow thighs, while all about it
Reel shadows of the indignant desert birds.
The darkness drops again; but now I know
That twenty centuries of stony sleep
Were vexed to nightmare by a rocking cradle,
And what rough beast, its hour come round at last,
Slouches towards Bethlehem to be born?

-- William Butler Yeats, "The Second Coming"

The curse of the missing clause

Interesting comment from Ben Hammersley's Dangerous Precedent

While developers in the US are being hamstrung by their courts, and their counterparts in Europe are about to have software patents kick the chair out from under them, the developers in the warm and cheap places are getting busy. If you really care that your software was written in the US, then the Grokster case is quite a big deal. If not, you just shrug and move on. The rest of the world’s a big place. They make software there too.

ActiveGrid - Rapid Development

How original. ActiveGrid - unified XML interface to enterprise information. Now where did I put those slides from '99 back at DataChannel? Well, maybe somebody will make it work this time...

The ActiveGrid Application Builder represents all data sources as XML Web Services, while still using native connectivity to access the data sources. From a developer's perspective, the metadata for all of the data sources is the XML Schema standard. Methods and services invoked for the data sources appear to the developer to be Web Services, even if they are actually databases or legacy systems. Unifying all data sources into a standard development methodology increases developer productivity and simplifies integration tasks.

June 14, 2005

Steve: Developing on the Edge

This is an interesting little paper re-examining Java frameworks for SOAP - there's this second wave of Web Service developers that are starting from scratch and doing the remote-object thing... this post tries to steer them toward a more direct and manageable approach:
Developing on the Edge:
I like to view JAX-RPC as the EJB of Web Services. EJB was designed to hide all the detail of persistence and distribution from those little developers, who didnt need to worry about such things. JAX-RPC was also designed to hide all the detail of serialization and distribution from those developers. Everyone knows that EJB is wrong, Ed and I are just pointing out that the emperor's clothes are equally sparse when it comes to XML messaging.

June 11, 2005

WhatWG and the event-source design

The WhatWG is a set of individuals and representatives of 'browser manufacturers' that are defining the evolution of HTML markup to help Web application developers. They call themselves 'unofficial', but most people realize that rough consensus and working code can create more momentum than 'official' channels.

They have recently described an extension to HTML that allows a Web browser to receive a stream of events from a server - the event-source tag. This is a wonderful concept and very similar to work done by KnowNow, mod-pubsub and others over the past few years. From my experience with these other approaches, I know that truly amazing things will happen when this becomes commonplace and trivial to use.

I had looked at the WhatWG when it first started, but soon stopped following their development. From reading the current draft of their description of this HTML extension, I can foresee several technical issues and now I wish I had stuck with monitoring their development in order to contribute to this section. Hopefully it's not too late.

There are several technical issues and design approaches that are interesting, but I can't find much background or discussion of these alternatives, so I may be covering old ground with this post.

The different areas I am interested in are:
- connecting elements, event streams and event handlers
- format and definition of event streams

1) connecting elements, event streams and event handlers
From the section "9.1.1. The event-source element" the approach is to introduce a new element. I'd like to consider whether more than one event-source element is allowed, and also consider whether simply introducing new attributes on existing elements is easier and feasible.
If a document is allowed to have more than one event-source element then the client app will have to deal with either multiple connections or combining multiple event-sources on one connection while delivering events to the corresponding event-source element event handler. If multiple connections are supported, then the client application could saturate the capability of the client machine and could even be considered a 'poorly behaved' client on the shared network.

An alternative to a new element would be to add new attributes, for example:
<p event-src="/my/stocks/amzn/" onMessage="handleQuotes()" />

I suggest using onMessage rather than onEvent to distinguish between network based messages and application based events (e.g. connection-opened, connection-closed-by-client, connection-closed-by-server, etc). I realize that some people feel web developers would want these to look identical, but many years of experience across the software industry have shown that they are simply different beasts and making that explicit actually helps the developer. The onEvent handler could be used for connection events on the event-source outside the actual messages within that stream.
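
A minimal sketch of that split, written in Python rather than browser script purely for illustration. The class name EventSourceBinding and its methods are hypothetical and not part of the WhatWG draft; the point is only that stream payloads and connection lifecycle notifications arrive through two separate callbacks.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EventSourceBinding:
    """Hypothetical model of an element carrying event-src, onMessage and onEvent."""
    src: str
    on_message: Callable[[str], None]  # payloads delivered over the stream
    on_event: Callable[[str], None]    # connection lifecycle notifications

    def message_received(self, payload: str) -> None:
        # Network-based message: goes to the onMessage handler only.
        self.on_message(payload)

    def connection_changed(self, state: str) -> None:
        # Application-level event about the connection: goes to onEvent only.
        self.on_event(state)


messages: List[str] = []
lifecycle: List[str] = []

binding = EventSourceBinding(
    src="/my/stocks/amzn/",
    on_message=messages.append,
    on_event=lifecycle.append,
)

binding.connection_changed("connection-opened")
binding.message_received("AMZN +1")
binding.connection_changed("connection-closed-by-server")
```

After this runs, the quote handler has seen only the stock message, and the lifecycle handler has seen only the open and close notifications.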

2) format and definition of event streams
Section "9.1.3 Processing model" defines how connections should be handled, and it's great to see this detail, especially around failure modes, closed connections, etc.

This section indicates that the client should re-open connections after a small delay if they were closed after a successful response. This section should consider persistent connections (i.e. keep-alive) and distinguish reply-level response codes from connection-level operations. For example, rather than saying "HTTP 200 OK responses with the right MIME type, however, should, when closed, be reopened after a small delay.", it may be better to say something like "The retrieval of an event-source that completes successfully (it has the correct content-type and an HTTP 200 OK response status code) should be tried again after a short delay." In addition, the client should continue to obey the appropriate cache-control response headers - this allows the server to dynamically influence the interval at which the client retrieves future events (beyond a static value placed in other attributes on the event-source element). This would be useful in the HTTP 204 No Content situation described in this section as well.
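
As a rough sketch of that rule (not the draft's actual algorithm), the function below decides how long a client waits before retrieving the event source again. The function name, the 5 second default and the choice to stop on any other status are assumptions made for this example; only the MIME type comes from the draft.

```python
import re
from typing import Mapping, Optional

EVENT_STREAM_TYPE = "application/x-dom-event-stream"
DEFAULT_DELAY_SECONDS = 5.0  # assumed value for the "short delay"


def next_retrieval_delay(status: int, headers: Mapping[str, str],
                         default: float = DEFAULT_DELAY_SECONDS) -> Optional[float]:
    """Return seconds to wait before the next retrieval, or None to not retry."""
    lowered = {name.lower(): value for name, value in headers.items()}
    if status == 200:
        # A successful retrieval must also carry the expected content type.
        content_type = lowered.get("content-type", "").split(";")[0].strip().lower()
        if content_type != EVENT_STREAM_TYPE:
            return None
    elif status != 204:
        # Other statuses are outside this sketch; treated here as "do not retry".
        return None
    # Let the server influence the interval through Cache-Control: max-age.
    match = re.search(r"max-age\s*=\s*(\d+)", lowered.get("cache-control", ""),
                      re.IGNORECASE)
    if match:
        return float(match.group(1))
    return default
```

With this shape, a server that has nothing new to report can answer 204 No Content with a larger max-age and slow its clients down without any change to the page markup.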

Section "9.1.4. The event stream format" defines a new MIME type and new syntax for the browser to process. This section also has the most detail of the interaction between events and the application logic. I don't want to sound too negative, but this is the section that needs the most help. Several years ago, when I was at KnowNow, I spent a lot of time implementing browser based libraries and applications in JavaScript to do almost exactly what this definition of event-source describes. Since then, I've spent a lot of engineering time designing and building servers for large scale subscriptions and notifications at various companies. Doing that work taught me a lot about the balance between flexibility, ease of coding and efficiency.

There are three aspects of this event stream that I'm concerned about - the meaning of the event stream resource itself, the framing of individual messages within that event stream and how to handle re-processing the event stream in different situations (errors, page refresh by the user, etc).

The proposal I have is to consider the src= attribute on the event-source element as referencing a 'collection of messages'. As to the framing of individual messages, when retrieving this collection of messages via HTTP, I suggest using the multipart/mixed or multipart/digest MIME type. Individual parts can then have their own content-type and developers can decide which suits their needs - a simple name/value pair approach like form data (not my favorite), javascript object definitions like JSON, etc. See RFC 1341, or its successor RFC 2046, for details.
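
To illustrate how little new machinery this needs, here is a sketch that reads a multipart/mixed collection with Python's standard email package and decodes each part according to its own content-type. The sample payload, the boundary name and the use of application/json and application/x-www-form-urlencoded are assumptions for the example, not something the draft specifies.

```python
import json
from email import message_from_string
from urllib.parse import parse_qsl

# Hypothetical collection of two messages, each with its own content type.
RAW = (
    "Content-Type: multipart/mixed; boundary=msg_boundary\n"
    "\n"
    "--msg_boundary\n"
    "Content-Type: application/json\n"
    "\n"
    '{"symbol": "YHOO", "delta": -2, "value": 10}\n'
    "--msg_boundary\n"
    "Content-Type: application/x-www-form-urlencoded\n"
    "\n"
    "symbol=YHOO&delta=-2&value=10\n"
    "--msg_boundary--\n"
)


def decode_parts(raw):
    """Return a list of (content_type, decoded_body) for each part."""
    message = message_from_string(raw)
    decoded = []
    for part in message.get_payload():
        content_type = part.get_content_type()
        body = part.get_payload().strip()
        if content_type == "application/json":
            decoded.append((content_type, json.loads(body)))
        elif content_type == "application/x-www-form-urlencoded":
            decoded.append((content_type, dict(parse_qsl(body))))
        else:
            # Unknown types are passed through untouched.
            decoded.append((content_type, body))
    return decoded


parts = decode_parts(RAW)
```

Note that the JSON part keeps its numbers as numbers, while the form-data part yields only strings, which is one reason the name/value approach is the weaker option.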

I realize that one use-case or scenario is for mobile devices and message size is a concern. I think it may be possible to follow the pattern of multipart/mixed but create a terse syntax that follows the same capabilities of multipart/mixed with respect to compression (transfer-encoding, gzip, etc), formats (content-type) and localization (character-encoding).

For each message, the WhatWG event-source definition introduces specific names used for controlling routing to event handlers but it seems trivially easy to define an approach that isn't specific to this new syntax. Specifically, the Event and Target names could be replaced.

My proposal is to define each message in the event stream to be similar to a request, except that no possibility of a response from the client exists. This provides for each message to have a URI that indicates its target and a method that indicates the event type (post/put/delete).

For example, rather than an HTTP response of:

200 OK
Content-type: application/x-dom-event-stream
Content-length: NNNN
Event: stock change\n
data: YHOO\n
data: -2\n
data: 10\n

Use a more generic event stream like

200 OK
Content-type: multipart/mixed; boundary=msg_boundary

--msg_boundary
POST /event-handler/stocks HTTP/1.0
Content-type: text/javascript-object
Content-length: NNNN

{symbol: "YHOO", delta: -2, value: 10}
--msg_boundary
PUT /event-handler/stocks/YHOO HTTP/1.0
Content-type: text/javascript-object
Content-length: NNNN

{delta: -2, value: 10}
--msg_boundary--

There may be 'issues' with adding the request-line in a multipart/mixed response, but defining a 'multipart/message' might work for that. And if people don't like HTTP/1.0 in the request-line, we could define a 'simple notification protocol' as SNP/1.0
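
A sketch of how a client might consume one of these request-like messages: parse the request-line, then route on method and URI. The Router class and parse_notification function are hypothetical names, and the body is assumed to be strict JSON (quoted keys) so that a standard parser can read it.

```python
import json


def parse_notification(text):
    """Split one request-like message into (method, uri, headers, body)."""
    head, _, body = text.partition("\n\n")
    lines = head.split("\n")
    method, uri, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, uri, headers, body


class Router:
    """Maps (method, uri) pairs to handlers; nothing is ever sent back to the server."""

    def __init__(self):
        self._handlers = {}

    def register(self, method, uri, handler):
        self._handlers[(method.upper(), uri)] = handler

    def deliver(self, text):
        method, uri, _headers, body = parse_notification(text)
        handler = self._handlers.get((method.upper(), uri))
        if handler is None:
            return False  # no handler registered for this target
        handler(json.loads(body))
        return True


received = []
router = Router()
router.register("POST", "/event-handler/stocks", received.append)

delivered = router.deliver(
    "POST /event-handler/stocks HTTP/1.0\n"
    "Content-type: application/json\n"
    "\n"
    '{"symbol": "YHOO", "delta": -2, "value": 10}'
)
```

Because the routing key is just a method and a URI, the same dispatch code would work whether the request-line ends in HTTP/1.0 or SNP/1.0.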

The last thing I'll mention has to do with re-processing the event stream. Re-processing an HTML page from start to end generally works. However, re-processing an event stream from start to end is generally not going to work well. The situations to consider are when the user manually refreshes the page and when the retrieval of the event stream completes successfully. When the user manually refreshes the page, the src= attribute will be the same as when the page was previously retrieved, but should the events be the same as previously retrieved, or should they be messages from that moment forward? And when the retrieval of messages succeeds and the client waits a few moments before retrieving the next set from the very same URI, should that next set be the same messages again?

This is a tricky area and I'm sure there are several ways to approach this. I'll write more later...

WhatWG and HTML 5 <event-source>

From the Ajaxian Blog: HTML 5 vs. XHTML 2:

Event Sources: <event-source src='/some/path' onevent='process(event)'/> rather than a lot of JavaScript and iframes.

To specify an event source in an HTML document authors use a new (empty) element event-source, with an attribute src="" that takes a URI (or IRI) to open as a stream and, if the data found at that URI is of the appropriate type, treat as an event source.

Whoa! I expected cool things from WhatWG, but 'event-source' with a URI as the source? I haven't seen anyone mention that since the KnowNow days... dang, I hope this really makes it out to the public!

June 08, 2005

Not: Ecommerce sites use personal info to charge you more

This is a somewhat misleading snippet claiming that Ecommerce sites use personal info to charge you more. I don't know about other sites, but I'm pretty sure Amazon has no such pricing capability. I'm an engineering manager responsible for the software services that manage all offers and prices at Amazon. Products often have multiple sellers, each offering the item at a different price - these show up in the 'buy box' and the featured offer is rotated periodically. This might explain why a price changed. Sellers also change their prices all the time. And when one seller goes out of stock, the next seller's offer is featured and their price is used. Many things can cause the price of a product to change.

Whatever happened to investigative journalism?

June 02, 2005

Adam on Ajax

A little bit of a retrospective on dynamic pages and Ajax from Bosworth - Ajax reconsidered.

Dig this - network events into the browser... who'd a thunk it? (Other than the half dozen companies doing it over the past five years.)

Secondly, the browser isn't a good listener to external events. If you want to build an application, for example, to show you instantly when someone bids or a price changes, it is hard. You can poll, but poll too frequently and the application starts to feel sluggish and it isn't easy to do this. What you really want is an event driven model where in addition to the events like typing the page can describe events like an XMPP message or a VOIP request or a data-changed post for an ATOM feed.