Make automated invalidation of caches on router on schema reload or ddl sharding keys update #212

ligurio · 2021-09-13T15:16:11Z

Imagine a user has a cluster with the CRUD module running and wants to update the schema or the sharding keys in the _ddl_sharding_key space. In that case we should update the sharding key cache on the router.
We already have a mechanism to track schema changes on storages (see the add_space_schema_hash option in CRUD operations and the schema.get_space_schema_hash() function).
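
To illustrate the hash-based mechanism mentioned above, here is a minimal sketch in plain Lua. It is not the module's code: `format_hash` and `wrap_result` are hypothetical helpers, the hash is a toy string concatenation, and `result_needs_reload` only imitates what a check like schema.result_needs_reload() might do.

```lua
-- Toy stand-in for a space format hash; the real module may hash differently.
local function format_hash(format)
    local parts = {}
    for _, field in ipairs(format) do
        parts[#parts + 1] = field.name .. ':' .. field.type
    end
    return table.concat(parts, ',')
end

-- Storage side (hypothetical): attach the hash of the storage's current
-- format to the result, as the add_space_schema_hash option suggests.
local function wrap_result(storage_format, res, err)
    return { res = res, err = err, space_schema_hash = format_hash(storage_format) }
end

-- Router side: a reload is needed when the storage reports a different hash.
local function result_needs_reload(router_format, storage_result)
    if storage_result.space_schema_hash == nil then
        return false
    end
    return storage_result.space_schema_hash ~= format_hash(router_format)
end

-- The router still knows the old format, the storage already has a new field.
local router_format = { { name = 'id', type = 'unsigned' } }
local storage_format = {
    { name = 'id', type = 'unsigned' },
    { name = 'name', type = 'string' },
}
local result = wrap_result(storage_format, nil, 'tuple field count mismatch')
print(result_needs_reload(router_format, result))
```

The same comparison could drive invalidation of the sharding key caches, provided the hash also covers the sharding key definition.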

A patch for this could look like this:

diff --git a/crud/borders.lua b/crud/borders.lua
index ac77cef..b702d09 100644
--- a/crud/borders.lua
+++ b/crud/borders.lua
@@ -108,9 +108,6 @@ local function call_get_border_on_router(border_name, space_name, index_name, op
         local storage_result = storage_result[1]
         if storage_result.err ~= nil then
             local need_reload = schema.result_needs_reload(space, storage_result)
-            if need_reload == true then
-                sharding.schema_reload_actions(space_name, storage_result.ddl_sharding_key)
-            end
             return nil, BorderError:new("Failed to get %s: %s", border_name, storage_result.err), need_reload
         end
 
diff --git a/crud/common/schema.lua b/crud/common/schema.lua
index 93afc01..bbfae66 100644
--- a/crud/common/schema.lua
+++ b/crud/common/schema.lua
@@ -214,7 +214,6 @@ function schema.wrap_func_result(space, func, args, opts)
     else
         result.res = filter_tuple_fields(func_res, opts.field_names)
     end
-    result.ddl_sharding_key = schema.fetch_ddl_sharding_key(box, space.name)
 
     return result
 end
diff --git a/crud/common/sharding.lua b/crud/common/sharding.lua
index 528b4fe..07d0275 100644
--- a/crud/common/sharding.lua
+++ b/crud/common/sharding.lua
@@ -109,14 +109,6 @@ function sharding.is_sharding_key_in_primary_index(space_name, primary_index, sh
     return sharding_key_in_primary_index_cache[space_name]
 end
 
-function sharding.schema_reload_actions(space_name, sharding_key)
-    dev_checks('string', 'table')
-
-    ddl_sharding_keys_cache[space_name] = sharding_key
-    sharding_key_in_primary_index_cache[space_name] = nil
-    sharding_key_fieldnos_cache[space_name] = nil
-end
-
 -- Build an array with sharding key values.
 local function build_sharding_key(key, index_parts, sharding_key_fieldno_map)
     dev_checks('table', 'table', 'table')
diff --git a/crud/insert.lua b/crud/insert.lua
index 3b9fe1f..9d4332c 100644
--- a/crud/insert.lua
+++ b/crud/insert.lua
@@ -84,9 +84,6 @@ local function call_insert_on_router(space_name, tuple, opts)
 
     if storage_result.err ~= nil then
         local need_reload = schema.result_needs_reload(space, storage_result)
-        if need_reload == true then
-            sharding.schema_reload_actions(space_name, storage_result.ddl_sharding_key)
-        end
         return nil, InsertError:new("Failed to insert: %s", storage_result.err), need_reload
     end
 
diff --git a/crud/replace.lua b/crud/replace.lua
index f52f331..26d5721 100644
--- a/crud/replace.lua
+++ b/crud/replace.lua
@@ -88,9 +88,6 @@ local function call_replace_on_router(space_name, tuple, opts)
 
     if storage_result.err ~= nil then
         local need_reload = schema.result_needs_reload(space, storage_result)
-        if need_reload == true then
-            sharding.schema_reload_actions(space_name, storage_result.ddl_sharding_key)
-        end
         return nil, ReplaceError:new("Failed to replace: %s", storage_result.err), need_reload
     end
 
diff --git a/crud/upsert.lua b/crud/upsert.lua
index 87f16c0..d91d8ba 100644
--- a/crud/upsert.lua
+++ b/crud/upsert.lua
@@ -90,9 +90,6 @@ local function call_upsert_on_router(space_name, tuple, user_operations, opts)
 
     if storage_result.err ~= nil then
         local need_reload = schema.result_needs_reload(space, storage_result)
-        if need_reload == true then
-            sharding.schema_reload_actions(space_name, storage_result.ddl_sharding_key)
-        end
         return nil, UpsertError:new("Failed to upsert: %s", storage_result.err), need_reload
     end

Things missing from the patch:

- schema.get_space_schema_hash() tracks changes only in spaces with data, not in _ddl_sharding_key and _ddl_sharding_func.
- crud.insert() does not set opts.add_space_schema_hash = true (as opposed to crud.insert_object()).
- anything else?

Part of #166

Totktonada · 2021-09-13T16:32:38Z

anything else?

crud.insert() does not set opts.add_space_schema_hash = true (as opposed to crud.insert_object()).

Previously there were two ways to obtain a bucket id in CRUD:

- calculate the bucket id automatically from the primary key (default);
- pass it explicitly in the options of a CRUD operation call.

Users of the DDL module [1] may specify a sharding key (a list of tuple field names), but it was not possible to use the DDL sharding key for bucket id calculation.

Now CRUD can use that custom sharding key to calculate the bucket id. This happens automatically when a DDL schema with a non-empty sharding_key [1] is used, or when the _ddl_sharding_key space contains a tuple with the space name and its sharding key.

The table below describes which operations support a custom sharding key:

| CRUD method                  | Added sharding key support |
| ---------------------------- | -------------------------- |
| get()                        | Yes                        |
| insert() / insert_object()   | Yes                        |
| delete()                     | Yes                        |
| replace() / replace_object() | Yes                        |
| upsert() / upsert_object()   | Yes                        |
| select() / pairs()           | Yes                        |
| update()                     | Yes                        |
| min() / max()                | No (not required)          |
| cut_rows() / cut_objects()   | No (not required)          |
| truncate()                   | No (not required)          |
| len()                        | No (not required)          |

Limitations:

- It's not possible to update sharding keys automatically when the schema is updated on storages, see [2]. However, it can be done manually with sharding_key.update_sharding_keys_cache().
- CRUD select may lead to map reduce in some cases, see [3].

1. https://github.com/tarantool/ddl
2. #212
3. #213

Closes #166
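
As a rough illustration of how a sharding key given as field names can be turned into a bucket id, consider the sketch below. Everything in it is an assumption made for the example: `extract_sharding_key` and `toy_bucket_id` are hypothetical names, and the hash is a toy polynomial hash, not the function the router actually uses.

```lua
-- Pick the sharding key values out of a tuple using the space format.
local function extract_sharding_key(tuple, format, sharding_key_fields)
    local fieldno = {}
    for i, field in ipairs(format) do
        fieldno[field.name] = i
    end

    local key = {}
    for _, name in ipairs(sharding_key_fields) do
        local no = fieldno[name]
        if no == nil then
            return nil, 'no such field in format: ' .. name
        end
        key[#key + 1] = tuple[no]
    end
    return key
end

-- Toy hash for illustration only; a real router uses its own hash function.
local function toy_bucket_id(key, bucket_count)
    local h = 0
    for _, value in ipairs(key) do
        local s = tostring(value)
        for i = 1, #s do
            h = (h * 31 + s:byte(i)) % 2147483647
        end
    end
    return h % bucket_count + 1
end

local format = {
    { name = 'id', type = 'unsigned' },
    { name = 'name', type = 'string' },
    { name = 'city', type = 'string' },
}
local tuple = { 1, 'Ivan', 'Moscow' }

local key = extract_sharding_key(tuple, format, { 'name', 'city' })
local bucket_id = toy_bucket_id(key, 3000)
print(bucket_id)
```

If the router caches the field numbers per space, that cache becomes stale as soon as the sharding key changes on the storages, which is the limitation referenced as [2].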

Describe functionality and current limitations (#212 and #213) with custom sharding key in CHANGELOG and README. Closes #166

Describe functionality and current limitations (#212, #213 and #219) with custom sharding key in CHANGELOG and README. Closes #166

Fetch sharding info hashes to router on ddl schema load. Hashes are stored in router metadata cache together with sharding info. Part of #212

Return error if router sharding info differs from storage sharding info. Comparison is based on sharding hash values. Hashes are provided with each relevant request. Hashes are extracted together with sharding key and sharding func definitions on router during request execution. After this patch, the performance of insert requests decreased by 5%, the performance of select requests decreased by 1.5%. Part of #212
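
The comparison described above might be sketched as follows. This is an assumption-laden illustration in plain Lua, not the patch itself: `storage_hashes`, `check_sharding_hash`, `storage_insert` and the `sharding_hash_mismatch` error kind are hypothetical names.

```lua
-- Hypothetical hashes a storage keeps for each space.
local storage_hashes = {
    customers = { sharding_key_hash = 'k1', sharding_func_hash = 'f1' },
}

-- Compare the hashes sent with a request against the storage's own values.
local function check_sharding_hash(space_name, request_hashes)
    local stored = storage_hashes[space_name] or {}
    if stored.sharding_key_hash ~= request_hashes.sharding_key_hash
    or stored.sharding_func_hash ~= request_hashes.sharding_func_hash then
        return nil, {
            kind = 'sharding_hash_mismatch',
            msg = ('sharding info for space %q is outdated on the router'):format(space_name),
        }
    end
    return true
end

-- A storage-side operation refuses to run with stale sharding info.
local function storage_insert(space_name, tuple, request_hashes)
    local ok, err = check_sharding_hash(space_name, request_hashes)
    if not ok then
        return nil, err
    end
    return tuple
end

local stale = { sharding_key_hash = 'k0', sharding_func_hash = 'f1' }
local _, err = storage_insert('customers', { 1, 'Ivan' }, stale)
print(err.msg)
```

Sending two short hashes with every request is presumably what accounts for the small overhead reported in the commit message.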

If a sharding info mismatch has happened, sharding info will be reloaded on the router. After that, the request will be retried with the new sharding info (except for pairs requests: due to their nature, they must be retried manually). There are no detectable performance drops introduced in this patch. Closes #212
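
A reload-and-retry loop of this kind could look roughly like the sketch below. It simulates the router and the storage with plain Lua tables; `router_cache`, `reload_sharding_info`, `storage_call` and `call_with_retry` are hypothetical names, not the module's API.

```lua
local router_cache = { hash = 'v1' }  -- stale sharding info on the router
local storage = { hash = 'v2' }       -- current sharding info on the storage

-- Simulated fetch of fresh sharding info from the storage.
local function reload_sharding_info()
    router_cache.hash = storage.hash
end

-- Simulated storage call: rejects requests built from stale sharding info.
local function storage_call(request)
    if request.hash ~= storage.hash then
        return nil, { kind = 'sharding_hash_mismatch' }
    end
    return 'ok'
end

-- Retry the request after reloading, but only for hash mismatch errors.
local function call_with_retry(make_request, max_attempts)
    local err
    for attempt = 1, max_attempts do
        local res
        res, err = storage_call(make_request(router_cache))
        if err == nil then
            return res, nil, attempt
        end
        if err.kind ~= 'sharding_hash_mismatch' then
            break
        end
        reload_sharding_info()
    end
    return nil, err
end

local res, err, attempts = call_with_retry(function(cache)
    return { hash = cache.hash }
end, 2)
print(res, attempts)
```

An iterator such as pairs cannot be restarted transparently once it has yielded results, which is consistent with the note that pairs requests must be retried by the caller.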

Compute and store sharding key and sharding func hashes on storages. Hashes are updated with on_replace triggers. Part of #212
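
The idea of keeping hashes up to date with on_replace triggers can be sketched without a running instance. The fake space below only imitates the trigger interface for the sake of the example; `new_fake_space`, `toy_hash` and `sharding_key_hashes` are hypothetical, and the tuple layout (space name, then list of field names) is an assumption.

```lua
-- Minimal imitation of a space that fires on_replace triggers.
local function new_fake_space()
    local space = { tuples = {}, triggers = {} }
    function space:on_replace(trigger)
        table.insert(self.triggers, trigger)
    end
    function space:replace(tuple)
        local old = self.tuples[tuple[1]]
        self.tuples[tuple[1]] = tuple
        for _, trigger in ipairs(self.triggers) do
            trigger(old, tuple)
        end
    end
    function space:delete(key)
        local old = self.tuples[key]
        self.tuples[key] = nil
        if old ~= nil then
            for _, trigger in ipairs(self.triggers) do
                trigger(old, nil)
            end
        end
    end
    return space
end

-- Toy hash: the real implementation would use a proper hash function.
local function toy_hash(fields)
    return table.concat(fields, ',')
end

local sharding_key_hashes = {}
local ddl_sharding_key = new_fake_space()

-- Recompute the stored hash on every change; drop it when the tuple is deleted.
ddl_sharding_key:on_replace(function(old_tuple, new_tuple)
    if new_tuple == nil then
        sharding_key_hashes[old_tuple[1]] = nil
        return
    end
    sharding_key_hashes[new_tuple[1]] = toy_hash(new_tuple[2])
end)

ddl_sharding_key:replace({ 'customers', { 'name' } })
ddl_sharding_key:replace({ 'customers', { 'name', 'city' } })
print(sharding_key_hashes.customers)
```

Computing the hash in a trigger means requests only read a cached value instead of hashing the definition each time.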

Rename sharding_metadata_cache to router_metadata_cache to distinguish it from storage_metadata_hash. Part of #212