Skip to content

Commit

Permalink
doc: describe schema reload implementation
Browse files Browse the repository at this point in the history
This patch adds a small document describing how space and sharding
schemas works in the module. It covers what schemas are, how we store,
use and reload them in requests.

Closes #253
  • Loading branch information
DifferentialOrange committed Aug 16, 2022
1 parent a577434 commit f707f65
Showing 1 changed file with 227 additions and 0 deletions.
227 changes: 227 additions & 0 deletions doc/dev/schema.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,227 @@
# Database schema information design document

Two types of schema are used in ``crud`` requests: ``net.box`` spaces
schema and ``ddl`` sharding schema. If a change had occurred in one of
those, router instances should reload the schema and reevaluate
a request using an updated one. This document clarifies how schema
is obtained, used and reloaded.

## Table of Contents
<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->

- [Space schema](#space-schema)
- [How schema is stored](#how-schema-is-stored)
- [When schema is used](#when-schema-is-used)
- [How schema is reloaded](#how-schema-is-reloaded)
- [When schema is reloaded and operation is retried](#when-schema-is-reloaded-and-operation-is-retried)
- [Alternative approaches](#alternative-approaches)
- [Sharding schema](#sharding-schema)
- [How schema is stored](#how-schema-is-stored-1)
- [When schema is used](#when-schema-is-used-1)
- [How schema is reloaded](#how-schema-is-reloaded-1)
- [When schema is reloaded and operation is retried](#when-schema-is-reloaded-and-operation-is-retried-1)
- [Alternative approaches](#alternative-approaches-1)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

## Space schema

Related links: [#98](https://github.com/tarantool/crud/issues/98),
[PR#111](https://github.com/tarantool/crud/pull/111).

### How schema is stored

Every ``crud`` router is a [``vshard``](https://www.tarantool.io/en/doc/latest/reference/reference_rock/vshard/)
router, the same applies to storages. In ``vshard`` clusters, spaces are
created on storages. Thus, each storage has a schema (space list,
space formats and indexes) on it.

Every router has a [``net.box``](https://www.tarantool.io/en/doc/latest/reference/reference_lua/net_box/)
connection to each storage it could interact with. (They can be
retrieved with [``vshard.router.routeall``](https://www.tarantool.io/en/doc/latest/reference/reference_rock/vshard/vshard_router/#lua-function.vshard.router.routeall)
call.) Each ``net.box`` connection has space schema for an instance it
is connected to. Router can access space schema by using ``net.box``
connection object contents.

### When schema is used

Space schema is used
- to flatten objects on `crud.*_object*` requests;
- to resolve update operation fields if updating
by field name is not supported natively;
- to calculate ``bucket_id`` to choose replicaset for a request
(together with sharding info);
- in `metadata` field of request response, so it could be used later
for `crud.unflatten_rows`.

### How schema is reloaded

``net.box`` schema reload works as follows. For each request (we use ``call``s
to execute our procedures) from client (router) to server (storage) we
receive a schema version together with response data. If schema versions mismatch,
client reloads schema. The request is not retried. ``net.box`` reloads
schema before returning synchronous ``call`` result to a user, so the next
request in a fiber will use updated schema. (See [tarantool/tarantool/6169](https://github.com/tarantool/tarantool/issues/6169).)

``crud`` cannot implicitly rely on ``net.box`` reloads: ``crud`` requests
use ``net.box`` spaces schema to build ``net.box`` ``call`` request data,
so if something is not relevant anymore, a part of this request data need
to be recalculated. ``crud`` uses connection ``reload_schema`` handle
(see [PR#111 comment](https://github.com/tarantool/crud/pull/111#issuecomment-765811556))
to ping the storage and updates `net.box` schema if it is outdated. ``crud``
reloads each replicaset schema. If there are several requests to reload info,
the only one reload fiber is started and every request waits for its completion.

### When schema is reloaded and operation is retried

The basic logic is as follows: if something had failed and space schema
mismatch could be the reason, reload the schema and retry. If it didn't
help after ``N`` retries (now ``N`` is ``1``), pass the error to the
user.

Retry with reload is triggered
- before a network request if an action that depends on the schema has failed:
- space not found;
- object flatten on `crud.*_object*` request has failed;
- ``bucket_id`` calculation has failed due to any reason;
- if updating by field name is not supported natively and id resolve
has failed;
- network operation had failed on storage, hash check is on and hashes mismatch.

Let's talk a bit more about the last one. To enable hash check, the user
should pass `add_space_schema_hash` option to a request. This option is
always enabled for `crud.*_object*` requests. If hashes mismatch, it
means the router schema of this space is inconsistent with the storage
space schema, so we reload it. For ``*_many`` methods, reload and retry
happens only if all tuples had failed with hash mismatch; otherwise,
errors are passed to a user.

Retries with reload are processed by wrappers that wraps "pure" crud operations
and utilities. Retries are counted per function, so in fact there could be more
reloads than ``N`` for a single request. For example, `crud.*_object*` code looks
like this:
```lua
local function reload_schema_wrapper(func, ...)
for i = 1, N do
local status, res, err, reload_needed = pcall(func, ...)
if err ~= nil and reload_needed then
process_reload()
-- ...
end
end
end

function crud.insert_object(args)
local flatten_obj, err = reload_schema_wrapper(flatten_obj_func, space_name, obj)
-- ...
return reload_schema_wrapper(call_insert_func, space_name, flatten_obj, opts)
end
```

For `crud.*_object_many`, each tuple flatten is retried separately (though it
is hard to imagine a real case of multiple successful flatten retries):
```lua
function crud.insert_object_many(args)
for _, obj in ipairs(obj_array) do
local flatten_obj, err = reload_schema_wrapper(flatten_obj_func, space_name, obj)
-- ...
end
-- ...
return reload_schema_wrapper(call_insert_many_func, space_name, flatten_obj_array, opts)
end
```

### Alternative approaches

One of the alternatives considered was to ping a storage instance on
each request to refresh schema (see [PR#111 comment](https://github.com/tarantool/crud/pull/111#discussion_r562757016)),
but it was declined due to performance degradation.


## Sharding schema

Related links: [#166](https://github.com/tarantool/crud/issues/166),
[PR#181](https://github.com/tarantool/crud/pull/181),
[#237](https://github.com/tarantool/crud/issues/237),
[PR#239](https://github.com/tarantool/crud/pull/239),
[#212](https://github.com/tarantool/crud/issues/212),
[PR#268](https://github.com/tarantool/crud/pull/268).

### How schema is stored

Again, ``crud`` cluster is a [``vshard``](https://www.tarantool.io/en/doc/latest/reference/reference_rock/vshard/)
cluster. Thus, data is sharded based on some key. To extract the key
from a tuple and compute a [``bucket_id``](https://www.tarantool.io/en/doc/latest/reference/reference_rock/vshard/vshard_architecture/),
we use [``ddl``](https://github.com/tarantool/ddl) module data.
``ddl`` module schema describes sharding key (how to extract data
required to compute ``bucket_id`` from a tuple based on space schema)
and sharding function (how to compute ``bucket_id`` based on the data).
This information is stored on storages in some tables and Tarantool
spaces: ``_ddl_sharding_key`` and ``_ddl_sharding_func``. ``crud``
module uses `_ddl_sharding_key` and `_ddl_sharding_func` spaces to fetch
sharding schema: thus, you don't obliged to use ``ddl`` module and can
setup only ``_ddl_*`` spaces manually, if you want. ``crud`` module uses
plain Lua tables to store sharding info on routers and storages.

### When schema is used

Sharding schema is used
- to compute ``bucket_id``. Thus, it is used each time we need to execute
a non-map-reduce request[^1] and `bucket_id` is not specified by user.
If there is no sharding schema specified, we use defaults: sharding key
is primary key, sharding function is ``vshard.router.bucket_id_strcrc32``.

[^1]: It includes all ``insert_*``, ``replace_*``, ``update_*``,
``delete_*``, ``upsert_*``, ``get_*`` requests. ``*_many`` requests
also use sharding schema to find a storage for each tuple. ``select``,
``count`` and ``pairs`` requests use sharding info if user conditions
have equal condition for a sharding key. (User still can force map-reduce
with `force_map_call`. In this case sharding schema won't be used.)

### How schema is reloaded

Storage sharding schema (internal ``crud`` Lua tables) is updated on
initialization (first time when info is requested) and each time someone
changes ``_ddl_sharding_key`` or ``_ddl_sharding_func`` data — we use
``on_replace`` triggers.

Routers fetch sharding schema if cache wasn't initialized yet or each
time reload was requested. Reload could be requested with ``crud``
itself (see below) or by user (with ``require('crud.common.sharding_key').update_cache()``
or ``require('crud.common.sharding_func').update_cache()`` handles).
The handles was deprecated after introducing automatic reload.

The sharding information reload procedure always fetches all sharding keys
and all sharding functions disregarding of a reason that triggers the reload.
If there are several requests to reload info, the only one
reload fiber is started and every request waits for its completion.

### When schema is reloaded and operation is retried

Retry with reload is triggered
- if router and storage schema hashes mismatch on request that
uses sharding info.

Each request that uses sharding info passes sharding hashes from
a router with a request. If hashes mismatch with storage ones, we return
a specific error. The request retries if it receives a hash mismatch
error from the storage. If it didn't help after ``N`` retries (now ``N``
is ``1``), the error is passed to the user. For ``*_many`` methods,
reload and retry happens only if all tuples had failed with hash
mismatch; otherwise, errors are passed to the user.

Retries with reload are processed by wrappers that wraps "pure" crud
operations, same as in space schema reload. Since there is only one
use case for sharding info, there is only one wrapper per operation:
around the main "pure" crud function with network call.

### Alternative approaches

There were different implementations of working with hash: we tried
to compute it instead of storing pre-computed values (both on router
and storage), but pre-computed approach with triggers was better in
terms of performance. It was also an option to ping storage before
sending a request and verify sharding info relevance before sending
the request with separate call, but it was also declined due to
performance degradation.

0 comments on commit f707f65

Please sign in to comment.