January 19, 2008

Stonebraker on MapReduce

I've been too busy lately to post on all the exciting things happening in the database world: more people getting into column-oriented storage, Amazon's SimpleDB service, consumer database Web services like blist and LongJump, Sun buying MySQL. But I couldn't pass up commenting on Joe Gregorio's post on Stonebraker on MapReduce.
It seems everybody is panning Stonebraker's evaluation of MapReduce as incorrectly comparing it to a DBMS. One commenter even said (ironic comment of the year) that Michael Stonebraker should learn what a DBMS is.

The point, which I'm sure someone has already made, can be found by looking at the summary of their findings. Here are a few:
- A sub-optimal implementation, in that it uses brute force instead of indexing
- Missing most of the features that are routinely included in current DBMS
- Incompatible with all of the tools DBMS users have come to depend on

These read like a quote from The Innovator's Dilemma. People enjoy the benefits of MapReduce for large-scale data processing because of these points. The lack of support for these DBMS features and tools is the reason it scales like a motherfucker. That is the feature people want. And it does it 10x better and cheaper than anything else. Those other things simply don't matter to them. It's a new audience and a new market.