Make automated invalidation of caches on router on schema reload or ddl sharding keys update #212
Labels: bug
ligurio added a commit that referenced this issue on Sep 16, 2021

Previously there were two ways to obtain a bucket id in CRUD:

- calculate the bucket id automatically from the primary key (default)
- pass it explicitly in the options of a CRUD operation call

Users of the DDL module [1] may specify a sharding key (a list of tuple field names), but it was not possible to use the DDL sharding key for bucket id calculation. Now CRUD can use that custom sharding key to calculate the bucket id. This happens automatically when a DDL schema with a non-empty sharding_key [1] is used, or when the space _ddl_sharding_key contains a tuple with the space name and its sharding key.

The table below shows which operations support a custom sharding key:

| CRUD method                  | Added sharding key support |
| ---------------------------- | -------------------------- |
| get()                        | Yes                        |
| insert() / insert_object()   | Yes                        |
| delete()                     | Yes                        |
| replace() / replace_object() | Yes                        |
| upsert() / upsert_object()   | Yes                        |
| select() / pairs()           | Yes                        |
| update()                     | Yes                        |
| min() / max()                | No (not required)          |
| cut_rows() / cut_objects()   | No (not required)          |
| truncate()                   | No (not required)          |
| len()                        | No (not required)          |

Limitations:

- Sharding keys cannot be updated automatically when the schema is updated on storages, see [2]. However, they can be updated manually with sharding_key.update_sharding_keys_cache().
- CRUD select may lead to map reduce in some cases, see [3].

1. https://github.com/tarantool/ddl
2. #212
3. #213

Closes #166
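The three ways of obtaining a bucket id described in this commit message can be sketched as follows. This is a minimal illustration, not the CRUD implementation: the function names, the record layout, and the use of `zlib.crc32` as the hash are all assumptions made for the example (the real module is written in Lua and delegates hashing to vshard).

```python
import zlib


def extract_key(record, key_fields):
    """Pick the key values out of a record, in the declared field order."""
    return [record[field] for field in key_fields]


def bucket_id(key_values, bucket_count):
    """Map key values to a bucket id in 1..bucket_count.

    zlib.crc32 is only a stand-in hash for this sketch.
    """
    payload = "".join(str(value) for value in key_values).encode("utf-8")
    return zlib.crc32(payload) % bucket_count + 1


def resolve_bucket_id(record, primary_key_fields, sharding_key_fields,
                      bucket_count, explicit_bucket_id=None):
    """Choose a bucket id: explicit option, then sharding key, then primary key."""
    if explicit_bucket_id is not None:
        return explicit_bucket_id
    fields = sharding_key_fields or primary_key_fields
    return bucket_id(extract_key(record, fields), bucket_count)
```

With a custom sharding key such as `["customer_id"]`, two records that share a customer but have different primary keys resolve to the same bucket, which is the point of sharding by a non-primary field.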
DifferentialOrange added a commit that referenced this issue on Apr 19, 2022

Fetch sharding info hashes to the router on DDL schema load. Hashes are stored in the router metadata cache together with the sharding info. Part of #212
DifferentialOrange added a commit that referenced this issue on Apr 19, 2022

Return an error if the router's sharding info differs from the storage's sharding info. The comparison is based on sharding hash values. Hashes are sent with each relevant request. On the router, hashes are extracted together with the sharding key and sharding func definitions during request execution. After this patch, the performance of insert requests decreased by 5% and the performance of select requests decreased by 1.5%. Part of #212
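The storage-side check described in this commit message can be sketched as below. This is an assumption-laden illustration: `sharding_hash`, `Storage`, and `ShardingHashMismatch` are hypothetical names, SHA-256 over JSON is a stand-in for whatever hash the module actually uses, and the sketch raises an exception where the commit message says an error is returned.

```python
import hashlib
import json


class ShardingHashMismatch(Exception):
    """The hash sent by the router does not match the storage's own hash."""


def sharding_hash(definition):
    """Hash a sharding definition deterministically (stand-in hash)."""
    blob = json.dumps(definition, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


class Storage:
    """Toy storage that rejects requests built from stale sharding info."""

    def __init__(self, sharding_key):
        self.sharding_key = list(sharding_key)
        self.rows = []

    def insert(self, row, router_hash):
        # Compare the hash that came with the request to the local one.
        if router_hash != sharding_hash(self.sharding_key):
            raise ShardingHashMismatch("router sharding info is stale")
        self.rows.append(row)
        return row
```

Comparing short hashes instead of full definitions keeps the per-request overhead small, which fits the modest slowdown reported in the commit message.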
DifferentialOrange added a commit that referenced this issue on Apr 19, 2022

If a sharding info mismatch occurs, the sharding info is reloaded on the router. After that, the request is retried with the new sharding info (except for pairs requests: due to their nature, they must be retried manually). There are no detectable performance drops introduced in this patch. Closes #212
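The reload-and-retry behavior from this commit message can be sketched as a router that retries once after refreshing its cache. All names here (`Router`, `Storage`, `StaleShardingInfo`, `hash_of`) are hypothetical and the hash is a stand-in; the sketch only shows the control flow under those assumptions.

```python
import hashlib


class StaleShardingInfo(Exception):
    """Raised by a storage when the router's sharding hash is outdated."""


def hash_of(sharding_key):
    """Stand-in hash of a sharding key definition."""
    return hashlib.sha256(",".join(sharding_key).encode("utf-8")).hexdigest()


class Storage:
    def __init__(self, sharding_key):
        self.sharding_key = list(sharding_key)

    def fetch_sharding_info(self):
        return list(self.sharding_key), hash_of(self.sharding_key)

    def call(self, request, router_hash):
        if router_hash != hash_of(self.sharding_key):
            raise StaleShardingInfo()
        return ("ok", request)


class Router:
    def __init__(self, storage):
        self.storage = storage
        self.reloads = 0
        self.cache = storage.fetch_sharding_info()

    def reload(self):
        self.cache = self.storage.fetch_sharding_info()
        self.reloads += 1

    def call(self, request):
        try:
            return self.storage.call(request, self.cache[1])
        except StaleShardingInfo:
            # Refresh the cached sharding info, then retry exactly once.
            self.reload()
            return self.storage.call(request, self.cache[1])
```

A streaming request such as pairs cannot be replayed this way once the caller has started consuming results, which is presumably why the commit message leaves that retry to the user.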
DifferentialOrange added a commit that referenced this issue on Apr 19, 2022

Compute and store sharding key and sharding func hashes on storages. Hashes are updated with on_replace triggers. Part of #212
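Keeping storage-side hashes current through replace triggers can be sketched like this. `DdlSpace` is a toy stand-in for a space such as _ddl_sharding_key, and `StorageHashCache` and its trigger signature are hypothetical; Tarantool's real on_replace trigger API differs and is not reproduced here.

```python
import hashlib
import json


class DdlSpace:
    """Toy space that calls registered triggers on every replace."""

    def __init__(self):
        self.tuples = {}
        self.triggers = []

    def on_replace(self, trigger):
        self.triggers.append(trigger)

    def replace(self, space_name, definition):
        old = self.tuples.get(space_name)
        self.tuples[space_name] = definition
        for trigger in self.triggers:
            trigger(space_name, old, definition)


class StorageHashCache:
    """Recomputes the hash of a sharding definition whenever it is replaced."""

    def __init__(self, space):
        self.hashes = {}
        space.on_replace(self._update)

    def _update(self, space_name, old, new):
        blob = json.dumps(new, sort_keys=True).encode("utf-8")
        self.hashes[space_name] = hashlib.sha256(blob).hexdigest()
```

Because the hash is recomputed at write time, the request path only has to read a cached value rather than hash the definition on every call.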
DifferentialOrange added a commit that referenced this issue on Apr 19, 2022

Rename sharding_metadata_cache to router_metadata_cache to distinguish it from storage_metadata_hash. Part of #212
DifferentialOrange added a commit that referenced this issue on Apr 20, 2022

Since sharding schema reloads are processed automatically after this patchset, users should not normally need to reload sharding info manually. The methods for manual sharding schema reload are therefore deprecated and will be removed in future releases. Follows up #212
DifferentialOrange added a commit that referenced this issue on Apr 20, 2022

If a crud request takes a tuple as an input argument (insert, upsert and replace operations) and its bucket_id is empty, the module fills this field in and thereby modifies the caller's input tuple. This patch fixes this behavior. After this patch, the performance of insert, upsert and replace decreased by 5%. Part of #212
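The fix described here amounts to filling bucket_id on a copy rather than on the caller's tuple. A minimal sketch, assuming a list-shaped tuple with `None` marking an empty bucket_id; the function name and the position argument are hypothetical:

```python
def with_bucket_id(input_tuple, bucket_id_pos, bucket_id):
    """Return a copy of input_tuple with an empty bucket_id filled in.

    The caller's tuple is never modified.
    """
    filled = list(input_tuple)  # shallow copy is enough for flat tuples
    if filled[bucket_id_pos] is None:
        filled[bucket_id_pos] = bucket_id
    return filled
```

The extra copy on every insert, upsert and replace is a plausible source of the slowdown the commit message reports.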
Imagine a user has a cluster running the CRUD module and wants to update the schema or update sharding keys in the `_ddl_sharding_key` space. In that case we should update the sharding key cache on the router.

We already have a mechanism to track schema changes on storages (see the `add_space_schema_hash` option in CRUD operations and the function `schema.get_space_schema_hash()`).

A patch for this could look like this:

Things missing from the patch:

- `schema.get_space_schema_hash` tracks only changes in spaces with data, not in `_ddl_sharding_key` and `_ddl_sharding_func`.
- `crud.insert()` does not set `opts.add_space_schema_hash = true` (as opposed to `crud.insert_object()`).

Part of #166