
Tokyo Cabinet Driver

apavlo edited this page May 30, 2011 · 1 revision

Authors

Marcelo Martins

Implementation

This driver implements the TPC-C benchmark for Tokyo Cabinet (TC) using the table-database type and remote, concurrent connections via Tokyo Tyrant. The table-database type is used, despite the existence of faster data structures, because it fits the relational data model of the TPC-C benchmark well and makes queries easy to implement. This choice does not come for free: relying on a table data structure instead of a tree or hash table incurs a large performance loss.

Partitioning is done manually based on the warehouse ID parameter. Since ITEM records do not contain a warehouse ID, the ITEM table must be replicated to all servers. This is easy to do and does not affect queries, since the ITEM table is not updated in this benchmark.
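The partitioning rule described above can be sketched in a few lines. This is only an illustration: the modulo scheme and the names `partition_for`, `targets_for`, and `REPLICATED_TABLES` are assumptions made for the example, not code taken from the driver.

```python
# Hypothetical sketch of warehouse-based partitioning; not the driver's code.

# Tables without a warehouse ID column are copied to every server.
REPLICATED_TABLES = {"ITEM"}


def partition_for(w_id, num_partitions):
    """Map a warehouse ID to a partition index (modulo scheme, assumed)."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return w_id % num_partitions


def targets_for(table, w_id, num_partitions):
    """Return the partition indexes that should receive a record at load time."""
    if table in REPLICATED_TABLES:
        # Replicated table: the warehouse ID is ignored.
        return list(range(num_partitions))
    return [partition_for(w_id, num_partitions)]


if __name__ == "__main__":
    print(targets_for("ITEM", None, 3))
    print(targets_for("CUSTOMER", 4, 3))
```

Because ITEM is read-only during the benchmark, any replica can answer an ITEM lookup, so a transaction can read it from the same server that holds its warehouse.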

No normalization was used, although I should have considered it since TC's performance is abysmal. I leave this as future work. Finally, some of the queries do not necessarily return all the columns dictated by the TPC-C benchmark specification, but only those that are of interest to other queries of the same transaction or that are returned at the end of a transaction.

Transaction internal operations were reorganized in two ways:

  • To reduce round trips to the server, queries that depend on the same SELECT results run consecutively, so no redundant server requests are issued.
  • Updates are performed last to reduce inconsistencies. TC does not support multi-partition transactions, and implementing them in application code is too cumbersome.
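The two reorganizations above can be illustrated with a small in-memory stand-in. The class `DeferredWrites` and its methods are hypothetical names for this sketch; a plain dict plays the role of a remote table, and the `requests` counter only simulates round trips. Note that deferring writes narrows the window for inconsistencies but does not make the transaction atomic.

```python
# Hypothetical sketch: reuse SELECT results and apply updates last.

class DeferredWrites:
    """Cache reads and queue updates so that writes are sent at the end."""

    def __init__(self, store):
        self.store = store      # dict standing in for a remote table
        self.read_cache = {}    # key -> record already fetched
        self.pending = []       # queued (key, columns) updates
        self.requests = 0       # simulated round trips

    def get(self, key):
        """Fetch a record once; later reads of the same key hit the cache."""
        if key not in self.read_cache:
            self.requests += 1
            self.read_cache[key] = dict(self.store[key])
        return self.read_cache[key]

    def update(self, key, **columns):
        """Queue an update instead of sending it immediately."""
        self.pending.append((key, columns))

    def flush(self):
        """Send all queued updates, in order, at the end of the transaction."""
        for key, columns in self.pending:
            self.requests += 1
            self.store[key].update(columns)
        self.pending = []


if __name__ == "__main__":
    table = {"w1": {"W_YTD": 10}}
    tx = DeferredWrites(table)
    ytd = tx.get("w1")["W_YTD"]
    tx.get("w1")                      # served from the cache, no new request
    tx.update("w1", W_YTD=ytd + 5)
    tx.flush()
    print(table, tx.requests)
```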

Driver Dependencies

The following packages must be installed before the driver can be used:

  • Tokyo Cabinet
  • Tokyo Tyrant
  • pyrant

Known Issues

If you run the code as it is, you will find that TC's performance on the TPC-C benchmark is disappointing. There are a few reasons for this, all related to the current implementation. First, the table-database type implies persistence, which incurs distributed I/O overhead. Many of the promises of swift DB access are shattered by a slow data type with persistence. This might be one of the reasons why Kyoto Cabinet (KC) dropped support for table databases.

Second, the third-party library (pyrant) used to access Tyrant servers seems to be inefficient. Its query support is limited: there is no way to limit the number of returned records, so queries that need only one record return far more data over the network, adding delay to transactions. The Stock transaction is even worse: one has to issue two separate queries, one for each table, and then merge the results. Also, updates always require a SELECT first.
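The client-side merge required by the Stock transaction might look roughly like the sketch below. It assumes the two queries have already returned lists of dicts with string values, keyed by the TPC-C column names `OL_I_ID`, `S_I_ID`, and `S_QUANTITY`; the function name `count_low_stock` is made up for the example, and no pyrant calls are shown.

```python
# Hypothetical sketch: join two separately fetched result sets in the client.

def count_low_stock(order_lines, stock_rows, threshold):
    """Count distinct recently ordered items whose stock is below threshold."""
    # Item IDs that appear in the ORDER_LINE result set.
    recent_items = {ol["OL_I_ID"] for ol in order_lines}
    # Keep stock rows for those items only; values arrive as strings.
    low = {
        s["S_I_ID"]
        for s in stock_rows
        if s["S_I_ID"] in recent_items and int(s["S_QUANTITY"]) < threshold
    }
    return len(low)


if __name__ == "__main__":
    order_lines = [{"OL_I_ID": "1"}, {"OL_I_ID": "2"}, {"OL_I_ID": "2"}]
    stock_rows = [
        {"S_I_ID": "1", "S_QUANTITY": "5"},
        {"S_I_ID": "2", "S_QUANTITY": "50"},
        {"S_I_ID": "3", "S_QUANTITY": "1"},
    ]
    print(count_low_stock(order_lines, stock_rows, 10))
```

The merge itself is cheap; the cost described above comes from shipping both full result sets over the network before the client can discard most of the rows.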

Finally, there are non-deterministic, temporary locks that prevent connections to the DB ports from resuming. As a result, most of the transactions show non-deterministic delays, especially the PAYMENT and STOCK transactions. I believe these are due to a lack of polish in the pyrant library, or maybe to the way I am using it. As evidence, I tried connecting to TC directly using a memcached library and there were no delays.

Future Work

  • Like dbm, each database in TC and KC is a single table. Consequently, and contrary to other NoSQL databases, one must open a network connection to each table, not to each database. Since we have parallel, independent connections to the same database, I could easily parallelize independent queries belonging to the same transaction and make it run faster.

  • Drop pyrant and write a custom Python library. Better yet, forget about table databases. TC has support for in-memory DBs; I should have used them from the beginning. Writing my own table encapsulation function to support multiple columns as values is not difficult at all. I would have done so had I known that pyrant and table DBs are so slow.

  • I should have used some sort of ???. The only good thing I noticed is that loading times for TC are better than for all other DBs (at least for those groups that have reported results up to now).
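A table encapsulation function of the kind mentioned in the list above could be as simple as packing all columns of a record into one value. The sketch below uses JSON as the encoding and a dict as a stand-in for an in-memory key-value database; the names `make_key`, `pack_row`, and `unpack_row` and the `TABLE:part:part` key layout are assumptions made for illustration.

```python
# Hypothetical sketch: store a multi-column record as one key-value pair.
import json


def make_key(table, *key_parts):
    """Build a byte-string key such as b"CUSTOMER:1:2:42" (layout assumed)."""
    return ":".join([table] + [str(p) for p in key_parts]).encode("utf-8")


def pack_row(columns):
    """Serialize a dict of column values into a single byte-string value."""
    return json.dumps(columns, sort_keys=True, separators=(",", ":")).encode("utf-8")


def unpack_row(value):
    """Restore the dict of column values from a stored byte string."""
    return json.loads(value.decode("utf-8"))


if __name__ == "__main__":
    db = {}  # stands in for an in-memory key-value database
    key = make_key("WAREHOUSE", 1)
    db[key] = pack_row({"W_NAME": "north", "W_YTD": 300000.0})

    row = unpack_row(db[key])
    row["W_YTD"] += 10.0          # read-modify-write of one column
    db[key] = pack_row(row)
    print(unpack_row(db[key]))
```

The trade-off is that lookups work only by primary key: any query on a non-key column would need a secondary index maintained by the application.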