November 25, 2004

Jeff Barr says Comment Spammers are Scum!

I was shocked to discover that hidden evildoers are injecting chaos into the blogosphere - horrors! From Jeff Barr's blog:

Comment Spammers are Scum! Every time something new and cool is invented, jerks like them have to pollute it and ruin it for everyone else.


There is a general principle here that applies to every single information system built - concentrations of a scarce resource will attract scavengers (or innovators, depending on your viewpoint). In the digital world, information is not scarce and data transfer has a low cost. The scarce resource is time - in this case, people's attention. People pay attention to their email inbox - which is essentially a collection of unsolicited messages. People also pay attention to interesting news - collections of weblog entries and their comments.

Who's next on the list? Let's look at the current popular Web applications:

  • Flickr - shared, tagged images. The tags are used to generate collections of interesting images. People pay attention to interesting things. Where attention gathers, so do attention scavengers. For example, I like to look at images of Hawaii - how soon until I see travel agent ads?

  • del.icio.us - shared, tagged links. The tags are used to generate collections of interesting links. You can just feel the scavengers circling.

  • amazon.com - shared product reviews. People buy things. They read reviews. Spam ensues.

  • search based alerts - see Google Search Alerts, pubsub.com, searchalert.net and many more. Just like email, but this time the spammers don't need email addresses; they only need the popular keywords.



Tag based categorization schemes are beautiful and simple, but resemble a namespace land-grab. What can we do about it? It seems the tried and true 'democracy in action' works well for Amazon and might be applied to other systems. Basically this means votes. For example, with flickr images, do I really believe anyone's claim that their image represents Hawaii? If several other people suggest it might be an image related to Hawaii, I'd pay more attention. If several other people suggest instead that it is porn or spam, my software agents can skip over it. If flickr let people add tags to other people's images, it would help introduce items to collections that the image provider didn't think of. Allowing negative votes would help remove inappropriate items from a collection. Knocking items out of a collection may sound bad, but with 2000 images of Hawaii on a fledgling service like flickr, I don't think we'll run short of information anytime soon.
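
To make the voting idea concrete, here is a minimal sketch in Python. It is not how flickr or Amazon actually work; the `TagVotes` class, its method names and the `min_score` threshold are all made up for illustration. Each user gets one vote per (item, tag) pair, and an item only shows up in a tag's collection once enough other people agree.

```python
from collections import defaultdict


class TagVotes:
    """Hypothetical tally of up/down votes on (item, tag) pairs."""

    def __init__(self):
        # (item_id, tag) -> {user: +1 or -1}; a user's later vote replaces the earlier one
        self._votes = defaultdict(dict)

    def vote(self, user, item_id, tag, value):
        if value not in (1, -1):
            raise ValueError("value must be +1 or -1")
        self._votes[(item_id, tag)][user] = value

    def score(self, item_id, tag):
        # Net score: positive votes minus negative votes
        return sum(self._votes.get((item_id, tag), {}).values())

    def collection(self, tag, min_score=2):
        """Items whose net score for `tag` reaches min_score, best first."""
        scored = [(self.score(item, t), item) for (item, t) in self._votes if t == tag]
        ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
        return [item for s, item in ranked if s >= min_score]
```

A software agent building the 'hawaii' collection would call `collection("hawaii")` and never see the items that other people have voted down.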

Transactions and Activity Services

Interesting bit of history of the WS-Coordination framework from Mark Little's WebLog about the J2EE Activity Service.

This defines an infrastructure to support a wide range of extended transaction models. The architecture is based on the insight that the various extended transaction models can be supported by providing a general purpose event signaling mechanism that can be programmed to enable activities (application specific units of computations) to coordinate each other in a manner prescribed by the extended transaction model under consideration.


Web Services are not Distributed Objects

I'm not sure if I posted this article by Werner (Web Services are not Distributed Objects) or not, but this sentence makes me cringe:

The REST principles are relevant for HTTP bindings, and for the Web server's parsing of resource names, but they are useless in the context of TCP or message-queue bindings where the HTTP verbs do not apply.


The principles of REST apply to any large-scale networked application. I wonder if he has modified his position on this.

It's Just Messages (Sort of)

It's interesting that TimB is detecting the surge of message-ness in everything Internet lately. Obviously he's been involved in many software systems and knows lots about messaging, so when his radar picks up on something it usually means massive momentum is underway in that area.
Of course, lots of people have been saying this, as well as actually building software and designing systems with just this in mind - my blog is sub-titled 'Messages Bouncing Around' for a reason.

I enjoy building software and designing systems around message based approaches, but I also enjoy talking about the 'big picture' of messaging - so forgive me for the following ramblings.

Undeniably messages are everywhere, but when Tim Bray talks about 'a feed is just a little bundle of messages' I have to mention that a document (or document fragment) does not equal a message. I always associate an operation with the blob of data - for example, a feed could mean 'please remove from your site entries that match exactly the following items...'. Or it could mean 'please add to your site the following items...'. And so on.
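
Here is a rough Python sketch of that distinction. The `Message` type and the `apply_message` function are hypothetical names of mine, not anything from Tim's post or from a feed format. The point is only that the same blob of items produces a different result depending on the operation attached to it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """A document plus the operation the sender wants performed with it."""
    operation: str  # "add" or "remove" in this sketch
    items: tuple    # the document part: the feed entries themselves


def apply_message(site_entries, message):
    """Return the site's entries after applying the message."""
    if message.operation == "add":
        return site_entries | set(message.items)
    if message.operation == "remove":
        return site_entries - set(message.items)
    raise ValueError(f"unknown operation: {message.operation}")
```

Given a site holding entries `{"a", "b"}` and a feed `("b", "c")`, the 'add' message leaves the site with three entries and the 'remove' message leaves it with one. The document alone doesn't tell you which.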

Tim goes on to think about what happens when everything is a message and everything is asynchronous. Look to the Web for an example of a system where everything is a message. As for asynchronous, that means different things to different people. One aspect of asynchronous is 'unsolicited requests'. Email is a set of unsolicited messages (demanding your attention, which is a scarce resource). Requests into a Web server are unsolicited - they happen at any time, usually not of your choosing.

Some scenarios where people usually desire 'asynchronous' messaging on the Web are long running operations, disconnected use and events to the desktop (or palmtop or phone, etc).

Long running operations and disconnected use are somewhat similar. The need is for a reply to be delivered in the face of a network connection failing. This requires correlation between a request and a reply. The Web uses the network connection to implicitly do that correlation - but there are hooks to grow into other approaches; they just haven't been widely used or standardized (see the '202 Accepted' HTTP response).
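
One way the '202 Accepted' hook could be used for explicit correlation is sketched below in Python. This is an in-memory stand-in, not a real server: the `JobServer` class, the `/jobs/<id>` paths and the use of a `Location` header to carry the correlation identifier are assumptions of mine, since HTTP doesn't standardize this pattern. The request is acknowledged with 202 and an identifier, and the reply is picked up later using that identifier, on any connection.

```python
import uuid
from http import HTTPStatus


class JobServer:
    """Hypothetical in-memory stand-in for a server doing long running work."""

    def __init__(self):
        self._pending = {}  # job id -> request payload still being worked on
        self._done = {}     # job id -> finished result

    def submit(self, payload):
        # Acknowledge immediately; the job id is the explicit correlation token
        job_id = uuid.uuid4().hex
        self._pending[job_id] = payload
        return HTTPStatus.ACCEPTED, {"Location": f"/jobs/{job_id}"}

    def run_pending(self):
        # Stand-in for the slow work: upper-case each payload
        for job_id, payload in list(self._pending.items()):
            self._done[job_id] = payload.upper()
            del self._pending[job_id]

    def poll(self, location):
        # The client may ask again later, even after its first connection died
        job_id = location.rsplit("/", 1)[-1]
        if job_id in self._done:
            return HTTPStatus.OK, self._done[job_id]
        if job_id in self._pending:
            return HTTPStatus.ACCEPTED, None
        return HTTPStatus.NOT_FOUND, None
```

Because the correlation lives in the identifier rather than in the socket, a dropped connection or a disconnected laptop only delays the reply instead of losing it.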

Delivering messages to the client means allowing unsolicited requests to be sent to the client machine. This means putting a Web server on the desktop. An example of this is Google's Desktop Search. Once Google opens this up through the firewall, you will see Blogger notifications and Google Search Alert notifications streaming to your desktop via HTTP and the Google Web server they've sneakily installed. People trust them and will continue to allow Google to expand their footprint on the desktop. The sky's the limit when you own the ground people walk on.

Unfortunately, allowing unsolicited requests has its downside. Web servers get this in terms of denial-of-service attacks, and people get this in terms of e-mail spam. Authentication will delay this a little bit, but there will always be a constant battle between letting information find you and hiding from spammers.

November 21, 2004

Generation Tech

This is awesome and hilarious.

Boing Boing: Tech-support generation spends Thanksgiving patching for parents

Forget the generational tags you’ve already heard, like Gen X and Gen Y. We are the Tech-Support Generation. Our job is to troubleshoot the complex but imperfect technology that befuddle mom and dad, veterans of the rotary phone, the record player and the black-and-white cabinet television set. Next week, on our annual pilgrimage home, we’ll turn our Web-trained minds and joystick-conditioned fingers to the task of rescuing our parents from bleeding-edge technology on the blink.

November 15, 2004

Amazon Offer Listing Service

Now this is a sweet looking job description at Amazon! If you act now, you could even have the chance to work with me!

Amazon.com Jobs: Offer Listing Service Team

Amazon.com has immediate openings for talented and enthusiastic senior software engineers to design, create and support our Offer Listing Services. Our team owns the business critical systems that help sellers manage and maintain everything for sale worldwide on the Amazon.com selling platform. As a senior software engineer with the Offer Listing team, you will work with software engineers, program managers, and other team members to create highly scalable, reliable and distributed applications and services for various aspects of Amazon's global selling platform. You will be responsible for real-time operational support of the team's functional areas as well as brainstorming features and ideas. You will design, document and implement scalable, flexible features using C++, Java, XML, SQL and CORBA in a Unix environment. The ability to document technical approaches completely, correctly and concisely is required. Candidates must be innovative, creative, flexible, self-directed, and understand how to design and write high-performance, reliable and maintainable code.

November 10, 2004

Amazon Theater

I love the creative approach to 'marketing' at Amazon. Rather than spend bucks on Super Bowl TV ads, it's much cooler to get excellent directors to make some movies. Just for fun.

This holiday season Amazon.com brings you Amazon Theater, an exclusive collection of five short films. This week's film, ''Portrait,'' is a comedy about a domineering fashionista (Minnie Driver) who learns a lesson about inner beauty--the hard way, of course. To see the film, click one of the links below or visit the ''Portrait'' page, where you'll also find download options and products featured in the film.

Portrait -- High Bandwidth | Low Bandwidth (Requires Windows Media Player)


Hmm, I wonder what the 'mms:' protocol is. Hmm, available only in Windows Media Player?

November 05, 2004

Amazon Simple Queue - Dequeue Operation

So how do these supposedly 'RESTful' Amazon Web services map to HTTP?
For example, what does the 'Dequeue' operation (what an awful name) use as an HTTP method? I hope they don't use the GET method, but I could find no documentation. Time to check into what kind of design reviews I can crash.

Amazon Web Services (AWS) SDK - Dequeue Operation

November 03, 2004

Amazon Queue service

Well, isn't this interesting:

You can use the Amazon Simple Queue Service to better manage
messages between components of your distributed
applications. SQS allows you to decouple components and make
them run independently. Any component of a distributed
application can store any type of data in a reliable queue at
Amazon.com.


Currently free, but once officially released it'll cost something, though at a reasonable price.

November 01, 2004

Best Song Right Now

This is my favorite song right now:
Alison Krauss & Union Station : The Lucky One.

Amazon has a direct MP3 download here.

Perfect information

Today was Amazon's quarterly company meeting in Seattle. I don't go to every one, but thought I should attend this one since a team member was due to be given a Just-Do-It award. These awards are for people that go above and beyond the normal call of duty to create a solution to some problem without being asked and without asking permission - just do it. It's a really cool testament to the pervasive culture at Amazon of competency, self-reliance and taking responsibility for our company. Anyway, somebody on my team built a database viewing tool that's very easy to use to browse tables in one of our back-end services. Engineers use this when investigating production issues and many other groups have taken to using it to investigate their production issues as well. Since this browsing of SQL tables results in URLs, these links tend to make their way into email discussions, trouble tickets, bug reports, etc. The power of REST at work... gotta love it!

One of the intriguing aspects of this quarter's meeting was Bezos talking about 'perfect information' and 'transparent business'. If I understand it correctly, perfect information has to do with the inescapable flow of information and how a business should align itself with this natural tendency rather than try to fight it. Comparison shopping is an example where consumers want to look at competitive prices and no amount of hiding the truth is going to help - somebody somewhere is going to invent a simple utility that makes it easy. There are already wireless gadgets that help you look up UPC codes on comparison shopping sites - even checking availability at Amazon - to help you make an informed buying decision. The great thing about Amazon is that we recognize that this is a Good Thing and also that we are lined up and positioned to benefit from this, while many other retailers aren't.

I didn't get the chance to ask whether 'transparent business' meant that Amazonians should open up and start blogging, but I'll just take the hint and Just Do It.

Another interesting thing coming out of this employee meeting was the kind of cool stuff Amazon is going to do over the next two months on the retail sites around the world. We've always had a creative approach to marketing. For example, rather than spend a huge chunk of cash on legacy marketing tactics - TV ads - Amazon plows that money back toward the customer via Free Shipping. Everybody I know that shops at Amazon recognizes that this is a more effective means of 'advertising' than advertising. This year Amazon has put some money into other creative forms of connecting with consumers over the Internet - not at all a 'sales/marketing' type of thing. I think people will enjoy it and hopefully there will be a lot of discussion around the Web in the weeks to come. Take a look at the site next Monday & see if it's up there yet.