Use custom sharding key to calculate bucket id
Previously there were two different ways to obtain bucket id in CRUD:

- calculate bucket id automatically using primary key (default)
- pass it from outside explicitly in options on CRUD operation call

Users of the DDL module [1] may specify a sharding key (a list of tuple
field names), but it was not possible to use the DDL sharding key for
bucket id calculation. Now CRUD can use that custom sharding key to
calculate bucket id. This happens automatically when a DDL schema with
a non-empty sharding_key [1] is used or when the space
_ddl_sharding_key contains a tuple with the space name and its
sharding key.
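
The calculation described above can be sketched in plain Lua. This is
only an illustration, not the code from this commit: the space format,
the field names, the helper names and the hash function are all
hypothetical. In particular, toy_bucket_id() is a stand-in for
vshard.router.bucket_id_strcrc32() and produces different values.

```lua
-- Hypothetical space format and sharding key, for illustration only.
local format = {
    {name = 'id', type = 'unsigned'},
    {name = 'bucket_id', type = 'unsigned'},
    {name = 'name', type = 'string'},
    {name = 'age', type = 'unsigned'},
}
local sharding_key = {'name', 'age'}

-- Map field names to their positions in a tuple.
local function field_positions(space_format)
    local positions = {}
    for pos, field in ipairs(space_format) do
        positions[field.name] = pos
    end
    return positions
end

-- Pick the sharding key values out of a tuple, in sharding key order.
local function extract_sharding_key(tuple, space_format, key_fields)
    local positions = field_positions(space_format)
    local key = {}
    for _, name in ipairs(key_fields) do
        local pos = positions[name]
        if pos == nil then
            error(('field %q is not in the space format'):format(name))
        end
        key[#key + 1] = tuple[pos]
    end
    return key
end

-- Stand-in hash (assumption): NOT vshard's strcrc32, only a
-- deterministic placeholder that maps a key to 1..bucket_count.
local function toy_bucket_id(key, bucket_count)
    local hash = 0
    for _, value in ipairs(key) do
        local s = tostring(value)
        for i = 1, #s do
            hash = (hash * 31 + s:byte(i)) % 2147483647
        end
    end
    return hash % bucket_count + 1
end

-- bucket_id (second field) is not known yet, so it is left as nil.
local tuple = {1, nil, 'Alice', 30}
local key = extract_sharding_key(tuple, format, sharding_key)
local bucket_id = toy_bucket_id(key, 3000)
```

With the sharding key {'name', 'age'}, the bucket id depends only on
those two fields, so tuples with different primary keys but the same
name and age land in the same bucket.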

The table below describes which operations support a custom sharding key:

| CRUD method                  | Added sharding key support |
| ---------------------------- | -------------------------- |
| get()                        | Yes                        |
| insert() / insert_object()   | Yes                        |
| delete()                     | Yes                        |
| replace() / replace_object() | Yes                        |
| upsert() / upsert_object()   | Yes                        |
| select() / pairs()           | Yes                        |
| update()                     | Yes                        |
| min() / max()                | No (not required)          |
| cut_rows() / cut_objects()   | No (not required)          |
| truncate()                   | No (not required)          |
| len()                        | No (not required)          |

Limitations:

- Sharding keys are not updated automatically when the schema is updated
  on storages, see [2]. However, it is possible to update them manually with
  sharding_key.update_sharding_keys_cache().
- CRUD select may lead to map-reduce in some cases, see [3].

1. https://github.com/tarantool/ddl
2. #212
3. #213

Closes #166
ligurio committed Sep 17, 2021
1 parent aa9e34c commit a699510
Showing 2 changed files with 8 additions and 4 deletions.
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -24,6 +24,8 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
* `crud.len()` function to calculate the number of tuples
in the space for memtx engine and calculate the maximum
approximate number of tuples in the space for vinyl engine.
+* CRUD operations calculates bucket id automatically using sharding
+key specified with DDL schema or in `_ddl_sharding_key` space.

## [0.8.0] - 02-07-21

10 changes: 6 additions & 4 deletions README.md
@@ -54,10 +54,12 @@ crud.unflatten_rows(res.rows, res.metadata)

* A space should have a format.
* By default, `bucket_id` is computed as `vshard.router.bucket_id_strcrc32(key)`,
-where `key` is the primary key value.
-Custom bucket ID can be specified as `opts.bucket_id` for each operation.
-For operations that accepts tuple/object bucket ID can be specified as
-tuple/object field as well as `opts.bucket_id` value.
+where `key` is a value of sharding key. Sharding key can be set with [DDL
+schema](https://github.com/tarantool/ddl#input-data-format) (see
+`sharding_key` option) or in a space `_ddl_sharding_key`. By default sharding
+key is a primary key. Custom bucket ID can be specified as `opts.bucket_id`
+for each operation. For operations that accepts tuple/object bucket ID can be
+specified as tuple/object field as well as `opts.bucket_id` value.

### Insert
