From 5b73aaab5c3859bfcfefcad3bc80438ed912dd27 Mon Sep 17 00:00:00 2001
From: Sergey Bronnikov
Date: Thu, 16 Sep 2021 11:37:33 +0300
Subject: [PATCH] Doc: use custom sharding key to calculate bucket id

Describe functionality and current limitations (#212 and #213) with
custom sharding key in CHANGELOG and README.

Closes #166
---
 CHANGELOG.md |  4 ++++
 README.md    | 56 ++++++++++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 56 insertions(+), 4 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 68c09aced..be1752bc1 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -24,6 +24,10 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
 * `crud.len()` function to calculate the number of tuples
   in the space for memtx engine and calculate the maximum
   approximate number of tuples in the space for vinyl engine.
+* CRUD operations calculate bucket id automatically using the sharding
+  key specified with DDL schema or in the `_ddl_sharding_key` space.
+  NOTE: CRUD methods delete(), get() and update() require the sharding key
+  to be a part of the primary key.
 
 ## [0.8.0] - 02-07-21
 
diff --git a/README.md b/README.md
index 29ae1145b..59637b0be 100644
--- a/README.md
+++ b/README.md
@@ -54,10 +54,58 @@ crud.unflatten_rows(res.rows, res.metadata)
 
 * A space should have a format.
 * By default, `bucket_id` is computed as `vshard.router.bucket_id_strcrc32(key)`,
-  where `key` is the primary key value.
-  Custom bucket ID can be specified as `opts.bucket_id` for each operation.
-  For operations that accepts tuple/object bucket ID can be specified as
-  tuple/object field as well as `opts.bucket_id` value.
+  where `key` is the value of the sharding key. The sharding key can be set with a
+  [DDL schema](https://github.com/tarantool/ddl#input-data-format) (see the
+  `sharding_key` option) or in the `_ddl_sharding_key` space. By default, the
+  sharding key is the primary key. A custom bucket ID can be specified as
+  `opts.bucket_id` for each operation. Operations that accept a tuple/object can
+  take the bucket ID from a tuple/object field as well as from `opts.bucket_id`.
+  NOTE: CRUD methods delete(), get() and update() require the sharding key
+  to be a part of the primary key.
+
+**Sharding key**
+
+*Sharding key* is a set of tuple field values used to calculate the *bucket ID*.
+*Bucket ID* determines which replicaset stores certain data. The function used
+to calculate the bucket ID is called the *sharding function*.
+
+By default, CRUD calculates the bucket ID from the primary key; this happens
+automatically and requires no action from the user. A user can also calculate
+the bucket ID externally and pass it as an option to a CRUD method (see below).
+
+Users who don't want to use the primary key as the sharding key may set a custom
+sharding key definition as a part of the DDL schema or insert it manually into
+the `_ddl_sharding_key` space, which the DDL module also uses (for both cases,
+consult the DDL module documentation). As soon as a sharding key for a certain
+space is available in the `_ddl_sharding_key` space, CRUD will use it for bucket
+ID calculation automatically.
+
+The table below describes which operations support a custom sharding key:
+
+| CRUD method                      | Added sharding key support |
+| -------------------------------- | -------------------------- |
+| `get()`                          | Yes                        |
+| `insert()` / `insert_object()`   | Yes                        |
+| `delete()`                       | Yes                        |
+| `replace()` / `replace_object()` | Yes                        |
+| `upsert()` / `upsert_object()`   | Yes                        |
+| `select()` / `pairs()`           | Yes                        |
+| `update()`                       | Yes                        |
+| `upsert()` / `upsert_object()`   | Yes                        |
+| `replace()` / `replace_object()` | Yes                        |
+| `min()` / `max()`                | No (not required)          |
+| `cut_rows()` / `cut_objects()`   | No (not required)          |
+| `truncate()`                     | No (not required)          |
+| `len()`                          | No (not required)          |
+
+Current limitations of using a custom sharding key:
+
+- Sharding keys are not updated automatically when the schema is
+updated on storages, see [#212](https://github.com/tarantool/crud/issues/212).
+However, it is possible to update them manually with
+`sharding_key.update_sharding_keys_cache()`.
+- CRUD select may lead to map-reduce in some cases, see
+[#213](https://github.com/tarantool/crud/issues/213).
 
 ### Insert