doc: describe schema reload implementation

This patch adds a small document describing how the space and sharding schemas work in the module. It covers what schemas are and how we store, use and reload them in requests. Closes #253
1 parent d053623 · commit 2ce9fe6 · 1 changed file, 155 additions, 0 deletions
# Database schema information design document

Two types of schema are used in ``crud`` requests: the ``net.box`` space
schema and the ``ddl`` sharding schema. If a change occurs in one of
them, router instances should reload the schema and reevaluate
the request using the updated one. This document clarifies how schemas
are obtained, used and reloaded.

## Space schema

Related links: [#98](https://github.com/tarantool/crud/issues/98),
[PR#111](https://github.com/tarantool/crud/pull/111).

### How schema is stored

Every ``crud`` router is a [``vshard``](https://www.tarantool.io/en/doc/latest/reference/reference_rock/vshard/)
router, and the same applies to storages. In ``vshard`` clusters, spaces
are created on storages. Thus, each storage holds its schema: the space
list, space formats and indexes.

Every router has a [``net.box``](https://www.tarantool.io/en/doc/latest/reference/reference_lua/net_box/)
connection to each storage it could interact with. (They can be
retrieved with a [``vshard.router.routeall``](https://www.tarantool.io/en/doc/latest/reference/reference_rock/vshard/vshard_router/#lua-function.vshard.router.routeall)
call.) Each ``net.box`` connection holds the space schema of the
instance it is connected to, so a router can access the space schema
through the ``net.box`` connection object.
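
To illustrate, a router can peek into the remote schema through these
connection objects. A minimal sketch, assuming a configured ``vshard``
router and a hypothetical ``customers`` space (the ``master.conn``
layout follows ``vshard`` replicaset objects):

```lua
-- Sketch: inspecting the space schema carried by router connections.
-- Assumes a configured vshard router; the space name is made up.
local vshard = require('vshard')

for _, replicaset in pairs(vshard.router.routeall()) do
    local conn = replicaset.master.conn   -- net.box connection
    local space = conn.space.customers    -- hypothetical space name
    if space ~= nil then
        -- the connection object mirrors the remote instance schema
        print(('replicaset %s has space %s (id %d)'):format(
            replicaset.uuid, space.name, space.id))
    end
end
```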

### When schema is used

The space schema is used to flatten maps in `crud.*_object*` requests.
Together with the sharding info, the space schema is used to calculate
the ``bucket_id`` that determines the target replicaset for a request.
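
As an illustration of the flattening step, here is a minimal sketch (not
the actual ``crud`` code; the format shape mirrors ``space:format()``
output):

```lua
-- Sketch: flatten a map into a tuple according to a space format.
-- Not the actual crud implementation.
local function flatten(object, format)
    local tuple = {}
    for i, field in ipairs(format) do
        local value = object[field.name]
        if value == nil and not field.is_nullable then
            return nil, ('field %q is missing'):format(field.name)
        end
        tuple[i] = value
    end
    return tuple
end

local format = {
    {name = 'id', type = 'unsigned'},
    {name = 'bucket_id', type = 'unsigned', is_nullable = true},
    {name = 'name', type = 'string'},
}
local tuple = flatten({id = 1, name = 'Alice'}, format)
-- tuple: {1, nil, 'Alice'} (a real implementation would use a NULL
-- placeholder for missing nullable fields)
```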

### How schema is reloaded

Currently there is no automatic reload of the space schema in
``net.box`` itself (see [tarantool/tarantool#6169](https://github.com/tarantool/tarantool/issues/6169)).
``crud`` uses the connection ``reload_schema`` handle (a ``ping``
wrapper, see [PR#111 comment](https://github.com/tarantool/crud/pull/111#issuecomment-765811556))
to reload the schema of each replicaset. If there are several requests
to reload the schema, only one reload fiber is started and every request
waits for its completion.
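
The deduplicated reload can be sketched with a fiber condition variable
(an illustration of the pattern, not the actual ``crud`` code;
``reload_schema`` stands for the ``ping``-wrapper handle mentioned
above):

```lua
-- Sketch of the "single reload fiber" pattern (not the actual crud code).
local fiber = require('fiber')

local reload_in_progress = false
local reload_done = fiber.cond()

local function reload_all_schemas(replicasets)
    if reload_in_progress then
        reload_done:wait()  -- piggyback on the reload already running
        return
    end
    reload_in_progress = true
    local ok, err = pcall(function()
        for _, replicaset in pairs(replicasets) do
            -- the ping-wrapper handle mentioned above
            replicaset.master.conn:reload_schema()
        end
    end)
    reload_in_progress = false
    reload_done:broadcast()  -- wake up all waiting requests
    if not ok then
        error(err)
    end
end
```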

### When schema is reloaded and operation is retried

The basic logic is as follows: if something fails and a space schema
mismatch could be the reason, reload the schema and retry. If that does
not help after ``N`` retries (currently ``N`` is ``1``), pass the error
to the user.

A retry with reload is triggered if:
- the space is not found;
- the ``bucket_id`` calculation fails for any reason;
- the operation fails on a storage while the hash check is enabled and
  the hashes mismatch.

Let's talk a bit more about the last case. To enable the hash check, the
user should pass the `add_space_schema_hash` option to a request. This
option is always enabled for `crud.*_object*` requests. If the hashes
mismatch, the router space schema is inconsistent with the storage space
schema, so we reload it. For ``*_many`` methods, reload and retry happen
only if all tuples have failed with a hash mismatch; otherwise, the
errors are passed to the user.

Retries are counted per wrapped function, so in fact there could be more
than ``N`` reloads for a single request. For example, if a user calls
`insert_object`, the object flattening fails (and succeeds after a
reload), and then the storage insert fails with a hash mismatch, the
reload happens twice and there are two retries (though these are two
different blocks of code retried independently). Similarly, if a user
calls `insert_object_many`, each tuple flattening is wrapped
independently (though it is hard to imagine a real case of multiple
flattening retries: the user would need to pass a batch of tables with
different structures and change the schema several times to match the
current one; in reality it is always a success after a single retry or
a fast fail).
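
The per-function wrapping described above can be sketched as follows (an
illustration, not the actual ``crud`` code; ``reload`` is a stand-in for
the schema reload):

```lua
-- Sketch of the retry-with-reload wrapper (not the actual crud code).
local RELOAD_RETRIES = 1  -- the document's N

-- func is expected to follow the `res, err` return convention.
local function with_schema_reload(func, reload, ...)
    local retries = 0
    while true do
        local res, err = func(...)
        if err == nil then
            return res
        end
        if retries >= RELOAD_RETRIES then
            return nil, err  -- give up and pass the error to the user
        end
        retries = retries + 1
        reload()  -- refresh the schema and try again
    end
end
```

Each wrapped step (flattening, the storage call) gets its own retry
budget, which is why a single request can trigger more than ``N``
reloads in total.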

### Alternative approaches

One of the alternatives considered was to ping a storage instance on
each request to refresh the schema (see [PR#111 comment](https://github.com/tarantool/crud/pull/111#discussion_r562757016)).

## Sharding schema

Related links: [#166](https://github.com/tarantool/crud/issues/166),
[PR#181](https://github.com/tarantool/crud/pull/181),
[#237](https://github.com/tarantool/crud/issues/237),
[PR#239](https://github.com/tarantool/crud/pull/239),
[#212](https://github.com/tarantool/crud/issues/212),
[PR#268](https://github.com/tarantool/crud/pull/268).

### How schema is stored

Again, a ``crud`` cluster is a [``vshard``](https://www.tarantool.io/en/doc/latest/reference/reference_rock/vshard/)
cluster, so data is sharded based on some key. To extract the key from
a tuple and compute a [``bucket_id``](https://www.tarantool.io/en/doc/latest/reference/reference_rock/vshard/vshard_architecture/),
the [``ddl``](https://github.com/tarantool/ddl) module is used. The
``ddl`` schema describes the sharding key (how to extract the data
required to compute ``bucket_id`` from a tuple, based on the space
schema) and the sharding function (how to compute ``bucket_id`` based on
that data). This information is stored on storages in the
``_ddl_sharding_key`` and ``_ddl_sharding_func`` spaces. The ``crud``
module fetches the sharding schema from these spaces, so you are not
obliged to use the ``ddl`` module itself and can set up the ``_ddl_*``
spaces manually if you want. The ``crud`` module uses plain Lua tables
to store the sharding info on routers and storages.
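
For reference, a sharding schema declared through the ``ddl`` module
could look like this (a sketch with made-up space and field names; see
the ``ddl`` module documentation for the authoritative format):

```lua
-- Sketch: declaring a sharding key and function with the ddl module.
-- Space and field names are made up for illustration.
local ddl = require('ddl')

local ok, err = ddl.set_schema({
    spaces = {
        customers = {
            engine = 'memtx',
            is_local = false,
            temporary = false,
            format = {
                {name = 'id', type = 'unsigned', is_nullable = false},
                {name = 'bucket_id', type = 'unsigned', is_nullable = false},
                {name = 'name', type = 'string', is_nullable = false},
            },
            indexes = {{
                name = 'primary',
                type = 'TREE',
                unique = true,
                parts = {{path = 'id', type = 'unsigned', is_nullable = false}},
            }, {
                name = 'bucket_id',
                type = 'TREE',
                unique = false,
                parts = {{path = 'bucket_id', type = 'unsigned', is_nullable = false}},
            }},
            -- this is what ends up in _ddl_sharding_key/_ddl_sharding_func
            sharding_key = {'id'},
            sharding_func = 'vshard.router.bucket_id_strcrc32',
        },
    },
})
```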

### When schema is used

The sharding schema is used to compute ``bucket_id``. Thus, it is used
each time we execute a non-map-reduce request (``insert_*``,
``replace_*``, ``update_*``, ``delete_*``, ``upsert_*`` and ``get_*``
requests, most ``select`` and ``count`` requests, some ``pairs``
requests), provided ``bucket_id`` is not specified explicitly and the
``force_map_call`` option is not set. If no sharding schema is
specified, the defaults are used: the sharding key is the primary key
and the sharding function is ``vshard.router.bucket_id_strcrc32``.
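
For example, with the defaults the ``bucket_id`` for a tuple is computed
from its primary key value (a sketch, assuming a configured ``vshard``
router):

```lua
-- Sketch: default bucket_id computation.
-- With no sharding schema, the sharding key is the primary key and the
-- sharding function is vshard.router.bucket_id_strcrc32.
local vshard = require('vshard')

-- For a tuple whose primary key value is 1:
local bucket_id = vshard.router.bucket_id_strcrc32(1)
```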

### How schema is reloaded

The storage sharding schema (internal ``crud`` Lua tables) is updated on
initialization (the first time the info is requested) and each time
someone changes the ``_ddl_sharding_key`` or ``_ddl_sharding_func``
data: ``on_replace`` triggers are used for that.
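
The trigger-based invalidation can be sketched as follows (an
illustration, not the actual ``crud`` code; the cache variable is made
up):

```lua
-- Sketch: invalidating the storage sharding cache with an on_replace
-- trigger. Must run on a storage where _ddl_sharding_key exists.
local sharding_key_cache = {}

local function on_ddl_sharding_key_replace(old_tuple, new_tuple)
    -- any change to _ddl_sharding_key invalidates the cached info;
    -- it will be rebuilt the next time it is requested
    sharding_key_cache = {}
end

box.space._ddl_sharding_key:on_replace(on_ddl_sharding_key_replace)
```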

Routers fetch the sharding schema if the cache is not initialized yet or
each time a reload is requested. A reload can be requested by ``crud``
itself (see below) or by the user (through the
``require('crud.common.sharding_key').update_cache()`` and
``require('crud.common.sharding_func').update_cache()`` handles).
The latter were deprecated after automatic reload was introduced.

It is worth noting that a reload/initialization affects both the
sharding key and the sharding function caches at the same time, but the
fetch itself is triggered only if a cache requested by some function is
not initialized yet. (I.e., if the function cache is up to date and the
key cache is not, both will still be reloaded.) If there are several
requests to reload the info, only one reload fiber is started and every
request waits for its completion.

### When schema is reloaded and operation is retried

Each request that uses the sharding info passes the router's sharding
hashes along with the request. If they mismatch with the storage ones,
the storage returns a specific error and the request is retried. If that
does not help after ``N`` retries (currently ``N`` is ``1``), the error
is passed to the user. For ``*_many`` methods, reload and retry happen
only if all tuples have failed with a hash mismatch; otherwise, the
errors are passed to the user.
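
The storage-side check can be sketched as follows (an illustration only;
the function name and the error shape are made up):

```lua
-- Sketch: comparing router-provided sharding hashes with storage ones.
-- Not the actual crud code; the error shape is made up for illustration.
local function check_sharding_hashes(space_name, router_hashes, storage_hashes)
    if router_hashes.sharding_key_hash ~= storage_hashes.sharding_key_hash
    or router_hashes.sharding_func_hash ~= storage_hashes.sharding_func_hash
    then
        -- a specific error tells the router to reload its cache and retry
        return nil, {
            class_name = 'ShardingHashMismatchError',
            err = ('sharding info for space %q is outdated'):format(space_name),
        }
    end
    return true
end
```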

### Alternative approaches

There were different implementations of working with the hash: we tried
to compute it on the fly instead of storing pre-computed values, but the
pre-computed approach with triggers was better in terms of performance.
There was also an option to ping the storage and verify the sharding
info relevance with a separate call before sending a request, but it was
discarded.