
Authors

Yang Lu

Li Jin

Implementation

Hypertable uses multi-dimensional tables; here we use them simply as relational database tables. The rowkey of each Hypertable row is constructed by concatenating the primary key columns of the corresponding TPC-C table, giving a one-to-one mapping from TPC-C rows to Hypertable rows. There is therefore no data denormalization in the implementation. Table partitioning is handled automatically: by default, Hypertable splits a range (a contiguous fraction of a table) when it grows too large (256 MB by default). We did not implement transactions in our driver since Hypertable does not support them. We also did not apply any optimizations or short-cuts.
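For illustration, here is a minimal Python sketch of the rowkey scheme described above; the helper name and the '.' delimiter are our assumptions, not necessarily the exact code in the driver:

    # Hypothetical sketch of the rowkey construction described above.
    # The '.' delimiter is an assumption; any separator that cannot
    # appear inside the key values would do.
    def make_rowkey(*pk_values):
        """Concatenate a composite TPC-C primary key into one rowkey."""
        return ".".join(str(v) for v in pk_values)

    # CUSTOMER is keyed on (C_W_ID, C_D_ID, C_ID), so each TPC-C customer
    # maps to exactly one Hypertable row.
    rowkey = make_rowkey(1, 2, 1547)   # -> "1.2.1547"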

Driver Dependencies

Known Issues

The vast majority of our execution time is spent in Hypertable API calls. For instance, it takes 5-6 seconds to fetch 3,000 customers from the CUSTOMER table (given the w_id and d_id). We posted the throughput we observed (around 10k bytes/second for a single thread) on the Hypertable User Google Group, but the only answer we have received so far is "it seems slow". Even for NewOrder, where every query looks up rows directly by rowkey, the throughput is only 5-6 tps.
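For reference, the slow customer fetch above boils down to a single rowkey-prefix query. A sketch of what it looks like through Hypertable's Thrift Python client, assuming the rowkey scheme from the Implementation section; treat the port, namespace layout, and result fields as illustrative, as they may differ by Hypertable version:

    # Illustrative sketch of the slow query via Hypertable's Thrift client.
    from hypertable.thriftclient import ThriftClient

    client = ThriftClient("localhost", 38080)   # default ThriftBroker port
    ns = client.namespace_open("/")             # assumes tables in the root namespace
    # All customers in warehouse 1, district 2: a prefix scan over
    # rowkeys of the form "w_id.d_id.c_id" using HQL's =^ (starts-with).
    result = client.hql_query(ns, 'SELECT * FROM CUSTOMER WHERE ROW =^ "1.2.";')
    print("fetched %d cells" % len(result.cells))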

Future Work

  • We are using the HQL interface (a query language similar to SQL) to issue queries to Hypertable. A possible improvement would be to use the scanner interface instead (see the sketch after this list), although the scanner is mostly suggested for full table scans at this moment.
  • We are using a merge join in our code. Another join algorithm might perform better.
  • Data denormalization. We currently do not do any.
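A minimal sketch of the scanner-based alternative mentioned in the first item above, assuming the Thrift-generated Python bindings (ScanSpec, RowInterval) and the same rowkey prefix as earlier; method and type names may differ across Hypertable versions:

    # Sketch: replace the HQL prefix query with a direct scanner.
    from hypertable.thriftclient import ThriftClient
    from hyperthrift.gen.ttypes import ScanSpec, RowInterval

    client = ThriftClient("localhost", 38080)
    ns = client.namespace_open("/")

    # Every rowkey with prefix "1.2." falls in ["1.2.", "1.2/"),
    # since '/' is the ASCII character immediately after '.'.
    interval = RowInterval(start_row="1.2.", start_inclusive=True,
                           end_row="1.2/", end_inclusive=False)
    scanner = client.scanner_open(ns, "CUSTOMER",
                                  ScanSpec(row_intervals=[interval]))
    try:
        while True:
            cells = client.scanner_get_cells(scanner)
            if not cells:          # empty batch signals end of scan
                break
            # ... process the batch of cells ...
    finally:
        client.scanner_close(scanner)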