For one part of our system, I'm interested simply in faster inserts into the index. Having more indices on each table would help in other areas too, but I haven't started working on those yet. The table in question manages a many-to-one mapping of tokens (text) to internal database row IDs (integers); essentially, it's a simple key/value lookup table. Currently a single machine does 150 inserts per second, which is fairly abysmal for a table with only 200M records.
This system uses MySQL with the InnoDB storage engine, which indexes using B-trees. Since the lookups are simple key/value lookups there is really no need for a B-tree - a hash index would work - but unfortunately InnoDB does not support hash indexes. We may migrate to Tokyo Tyrant and Tokyo Cabinet, which support hash indexing and also seem to have better concurrency support (many non-blocking concurrent requests). The keys inserted are almost sequential, and from what I've read that should help insert performance. The bottom-up building of a B-tree can also be optimized by detecting that the key causing a page split is greater than all the values in the page being split, and splitting at the high end rather than the middle - I think Oracle does this.
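To make that split heuristic concrete, here's a minimal Python sketch. It's a toy, leaf-only structure - the page capacity, flat-list pages, and function name are my own illustration, not how InnoDB or Oracle actually implement page splits:

```python
import bisect

PAGE_CAPACITY = 4  # tiny for illustration; real pages hold hundreds of keys

def insert(leaf, key):
    """Insert key into a sorted leaf page, splitting it if full.
    Returns (page, new_right_page_or_None)."""
    if len(leaf) < PAGE_CAPACITY:
        bisect.insort(leaf, key)
        return leaf, None
    if key > leaf[-1]:
        # Ascending-key case: the new key is greater than everything in
        # the full page, so start a fresh page instead of moving half the
        # entries over. Pages stay nearly full under sequential inserts.
        return leaf, [key]
    # General case: split 50/50, then insert into the appropriate half.
    mid = PAGE_CAPACITY // 2
    left, right = leaf[:mid], leaf[mid:]
    bisect.insort(left if key < right[0] else right, key)
    return left, right
```

With ascending keys, insert([10, 20, 30, 40], 50) returns ([10, 20, 30, 40], [50]) rather than two half-empty pages, which is exactly the effect that optimization is after.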
While looking for hash-based indexing for MySQL I found TokuDB, which uses the sexy phrase 'fractal tree' for their indexing. Their site has a few introductory whitepapers, but for a deeper understanding I wanted to get to the theoretical foundation of their technology.
I soon found that there are several meanings of 'fractal tree' in data indexing:
- fpB+-tree: Fractal Prefetching B+-Trees, which embed cache-optimized trees within disk-optimized trees. Unrelated to TokuDB.
- Fractal Trees (TM) from Tokutek, which use 'cache-oblivious' algorithms to improve B-tree performance (a sketch of the layout idea these build on follows the links):
- http://supertech.csail.mit.edu/cacheObliviousBTree.html
- http://www.cs.sunysb.edu/~bender/pub/sicomp05-BenderDeFa.ps
- http://www.cs.sunysb.edu/~bender/pub/FOCS03-co-searching.ps
- http://www.cs.sunysb.edu/~bender/pub/locality-full.ps
- http://www.cs.sunysb.edu/~bender/pub/BenderHu-TODS07.pdf
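To give a flavor of what 'cache-oblivious' means in those papers, here's a toy sketch of the recursive van Emde Boas layout the cache-oblivious search trees above are built on: cut a complete binary tree at half its height, store the top subtree first, then each bottom subtree, recursively. The (depth, index) node representation is my own illustration, not code from the papers:

```python
def veb_order(depth, index, height):
    """List the nodes of a complete binary tree of the given height,
    rooted at node (depth, index), in van Emde Boas layout order.
    Nodes are identified as (depth, index-within-that-depth)."""
    if height == 1:
        return [(depth, index)]
    top_h = height // 2                     # cut the tree at half height
    bottom_h = height - top_h
    order = veb_order(depth, index, top_h)  # top subtree first
    fanout = 2 ** top_h                     # bottom subtrees below it
    for i in range(fanout):
        order += veb_order(depth + top_h, index * fanout + i, bottom_h)
    return order

# Whole subtrees land in contiguous runs, so a block transfer of any
# size fetches a subtree's worth of nodes - without the layout ever
# knowing the block size, which is the 'cache-oblivious' trick.
print(veb_order(0, 0, 4))
```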
Although I haven't finished reading all the papers, what I like about this approach is that it models algorithm performance as the cost of transferring blocks of data between layers of storage, and that it considers multiple layers. The most important transfer cost is the one between disk and memory, but including layers like the OS-managed disk cache is also valuable: it matches the multi-layer caching that nearly all large-scale systems use. We are big fans of measuring algorithm performance and selecting the right tool for the right job.
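As a back-of-the-envelope illustration of that cost model - the disk-access-machine view, where cost is counted in block transfers between disk and memory - here's a toy comparison using the standard asymptotic insert bounds: roughly log_B(N) transfers per insert for a B-tree versus an amortized log2(N)/B for buffered, fractal-tree-style inserts. The 100-keys-per-block figure is made up, and constants are ignored:

```python
import math

N = 200_000_000  # the 200M-row table from above
B = 100          # hypothetical keys per disk block

# B-tree: a random insert touches one node per level, ~log_B(N) transfers.
btree = math.log(N, B)

# Fractal-tree / COLA-style: inserts are buffered and flushed in bulk,
# amortizing to ~log2(N) / B transfers per insert.
fractal = math.log2(N) / B

print(f"B-tree:  ~{btree:.2f} block transfers per insert")
print(f"Fractal: ~{fractal:.3f} block transfers per insert")
```

Even as a crude estimate, that's an order-of-magnitude gap in favor of buffered inserts, which is where the fast-insert claims come from.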
As an aside, one of the comments on the MySQL Performance Blog pointed me to Infobright, a column-oriented store, which might be useful in some of our analytics systems for ad-hoc reporting.