From f4b922205112ec5f6ef05804b0d264dc90d625cb Mon Sep 17 00:00:00 2001
From: Sergey Bronnikov
Date: Thu, 16 Sep 2021 11:37:33 +0300
Subject: [PATCH] Doc: use custom sharding key to calculate bucket id

Describe functionality and current limitations (#212, #213 and #219)
with custom sharding key in CHANGELOG and README.

Closes #166

Reviewed-by: Oleg Babin
Reviewed-by: Alexander Turenko
Co-authored-by: Georgy Moiseev
---
 CHANGELOG.md |  5 +++++
 README.md    | 59 +++++++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 59 insertions(+), 5 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index f781e3098..eb9cad0e7 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -9,6 +9,11 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.

 ### Added

+* CRUD operations calculate bucket id automatically using sharding
+  key specified with DDL schema or in `_ddl_sharding_key` space.
+  NOTE: CRUD methods delete(), get() and update() require that the sharding
+  key be a part of the primary key.
+
 ### Changed

 ### Fixed
diff --git a/README.md b/README.md
index 29ae1145b..bd52ec740 100644
--- a/README.md
+++ b/README.md
@@ -53,11 +53,60 @@ crud.unflatten_rows(res.rows, res.metadata)
 **Notes:**

 * A space should have a format.
-* By default, `bucket_id` is computed as `vshard.router.bucket_id_strcrc32(key)`,
-  where `key` is the primary key value.
-  Custom bucket ID can be specified as `opts.bucket_id` for each operation.
-  For operations that accepts tuple/object bucket ID can be specified as
-  tuple/object field as well as `opts.bucket_id` value.
+
+**Sharding key and bucket id calculation**
+
+*Sharding key* is a set of tuple field values used to calculate *bucket ID*.
+*Sharding key definition* is a set of tuple field names that describes which
+tuple fields should be a part of the sharding key. *Bucket ID* determines which
+replicaset stores certain data. The function used for bucket ID calculation is
+named *sharding function*.
+
+By default CRUD calculates bucket ID using the primary key and the function
+`vshard.router.bucket_id_strcrc32(key)`. It happens automatically and doesn't
+require any action from the user. Users can also calculate bucket ID
+themselves and pass it as an option to CRUD methods that accept a tuple or
+object (see option `bucket_id` below).
+
+Starting from 0.10.0, users who don't want to use the primary key as a
+sharding key may set a custom sharding key definition as a part of the [DDL
+schema](https://github.com/tarantool/ddl#input-data-format) or insert it
+manually into the space `_ddl_sharding_key` (for both cases, consult the DDL
+module documentation). As soon as a sharding key for a certain space is
+available in the `_ddl_sharding_key` space, CRUD will use it for bucket ID
+calculation automatically. Note that CRUD methods `delete()`, `get()` and
+`update()` require that the sharding key be a part of the primary key.
+
+The table below describes which operations support a custom sharding key:
+
+| CRUD method                      | Sharding key support       |
+| -------------------------------- | -------------------------- |
+| `get()`                          | Yes                        |
+| `insert()` / `insert_object()`   | Yes                        |
+| `delete()`                       | Yes                        |
+| `replace()` / `replace_object()` | Yes                        |
+| `upsert()` / `upsert_object()`   | Yes                        |
+| `select()` / `pairs()`           | Yes                        |
+| `update()`                       | Yes                        |
+| `upsert()` / `upsert_object()`   | Yes                        |
+| `replace()` / `replace_object()` | Yes                        |
+| `min()` / `max()`                | No (not required)          |
+| `cut_rows()` / `cut_objects()`   | No (not required)          |
+| `truncate()`                     | No (not required)          |
+| `len()`                          | No (not required)          |
+
+Current limitations for using custom sharding key:
+
+- It's not possible to update sharding keys automatically when the schema is
+updated on storages, see [#212](https://github.com/tarantool/crud/issues/212).
+However, it is possible to do it manually with
+`require('crud.sharding_key').update_sharding_keys_cache()`.
+- CRUD select may lead to map reduce in some cases, see
+[#213](https://github.com/tarantool/crud/issues/213).
+- No support of JSON path for sharding key, see
+[#219](https://github.com/tarantool/crud/issues/219).
+- `primary_index_fieldno_map` is not cached, see
+[#243](https://github.com/tarantool/crud/issues/243).

 ### Insert