Add v0 of versioning to AAE hashtree locks, upgrades, and object hashing #1473
…atch pre 2.1 vclocks which require a rewrite to be safe. Added riak_kv boolean environment variable `hash_only_vclock` which enables/disables the vclock hashing logic. Additionally, when enabled, we no longer do the expensive compare/logging on read-repair to catch old pre-2.1 vclocks that require a rewrite. Additional logic must be added to `yz_kv.erl` so the hashes in the entropy_data element of the Solr documents match what is stored in KV AAE. Upgrading the hash logic requires clearing all YZ documents and re-indexing everything, because we store the obj hash in the Solr document. This will hopefully change in the future to allow for easier AAE changes.
Add a static version “v0” to the data_root of the riak_kv AAE directory. This allows us to isolate old/newer trees as well as support downgrades which will not allow old and new trees to exchange either locally or across MDC.
New error `bad_version` added to the return from `riak_kv_index_hashtree:do_get_lock` if the requested version and the local version of the hash tree do not match. Requires changes in yokozuna and riak_repl to fully function.
Change boolean configuration variable to integer version starting with 0. Change all specs from atom() to non_neg_integer() for version number. Add extra logging for invalid non-integer hash version configurations.
Fix handling of the case where the Result list contained no valid object responses or only a single valid response.
Add version 0 of object hashing (config `object_hash_version`), which hashes only the vclock to represent a version of an object. Also add versioning to the index_hashtree get_lock API, which now takes a Version parameter. The absence of a version is assumed to be `undefined` and will only be allowed to get a lock if the hash on disk is in legacy format. Upgraded hash trees are located within the anti_entropy data_dir under the directory v0. Currently the only supported version is 0, but the code allows for support of increasing integer versions. Add verbose riak_object:equals check to get_core for all read operations that do not trigger read repair. This logic detects potentially pre-2.1 data which may be impacted by old vector clock bugs and should be rewritten before upgrading the hash version for safety.
WIP commit in case of computer failure or something like that.
…l recovering some history
Also, add some testing logs and debug functions.
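For readers skimming the commits above, a minimal sketch of what the integer setting might look like in configuration. This assumes `object_hash_version` is read from the riak_kv application environment, as the commit messages describe; the `advanced.config` file name and placement are assumptions and are not confirmed anywhere in this PR.

```erlang
%% advanced.config (assumed location; not confirmed by this PR)
[
 {riak_kv, [
   %% Integer hash version. Per the commits above, version 0 hashes
   %% only the vclock and is currently the only supported version.
   {object_hash_version, 0}
 ]}
].
```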
@@ -116,16 +124,27 @@ release_lock(Pid) ->
 %% will try to acquire a concurrency lock. If successsful, the request is
 %% then forwarded to the relevant index_hashtree to acquire a tree lock.
 %% If both locks are acquired, the pid of the remote index_hashtree is
-%% returned.
+%% returned. This function assumes an undefined version of the hashtree
Instead of stating what the function does, I feel like this comment would be more helpful if it explained why it does it. Something like "this is left over for compatibility in mixed clusters with pre-2.2 nodes, from before we added versioning to the hashtree."
Will change.
case [RObj || {_Idx, {ok, RObj}} <- Results] of
[] ->
ok;
[_|[]] ->
Very small nitpick, but the pattern I normally see to match a singleton list is just [_]. I'm sure this works just as well, but feels a little bit unidiomatic.
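To illustrate the nitpick, a small self-contained sketch showing that both patterns accept exactly the one-element lists. The module and function names here are made up for illustration and are not part of this PR.

```erlang
-module(singleton_demo).
-export([is_singleton/1, is_singleton_verbose/1]).

%% Idiomatic form: [_] matches a list with exactly one element.
is_singleton([_]) -> true;
is_singleton(_) -> false.

%% Form used in the diff: [_|[]] is one element followed by the empty
%% tail, so it matches the same values as [_].
is_singleton_verbose([_|[]]) -> true;
is_singleton_verbose(_) -> false.
```

Both functions return `true` for `[a]` and `false` for `[]` or `[a, b]`, so the change is purely stylistic.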
Additionally, cleaned up the vnode code which killed the hash tree on upgrade. Removed one binary layer from the riak_object:hash and added the build locks (different from the concurrency lock (doh)) to the upgrade procedure so we won’t kill trees and then have them sit empty until we can get a build lock.
@bsparrow435 So it occurred to me this morning that we should probably have an option to disable automatically upgrading to the new tree format. It doesn't necessarily have to be documented or supported, but I think it would be good to provide an escape hatch. If nothing else, it may make testing much easier if we want to, say, write a riak_test module that starts with 2.2 and then performs a downgrade (e.g. something like
+1 0e6bb3e
Add v0 of versioning to AAE hashtree locks, upgrades, and object hashing Reviewed-by: nickelization
@borshop merge
This PR adds version 0 of AAE hashing to riak_kv_entropy_manager, riak_kv_index_hashtree, riak_object, and the riak_kv_vnode. Specifically, we've added logic which requires getting a hashtree lock to provide an appropriate version and then our hashtree version determines what version of riak_object:hash will be used for the incoming data.
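A rough sketch of the version gating described above, under stated assumptions: `check_version/2` and `hash_input/2` are hypothetical helpers written for illustration and are not the functions in `riak_kv_index_hashtree` or `riak_object`. The sketch also assumes that the legacy (`undefined`) format hashes the whole object, which this PR does not spell out.

```erlang
-module(hashtree_version_sketch).
-export([check_version/2, hash_input/2]).

-type version() :: undefined | non_neg_integer().

%% Hypothetical helper: grant the lock only when the requested version
%% equals the tree's local version. `undefined` stands for the legacy
%% on-disk format, so an unversioned request only matches a legacy tree.
-spec check_version(version(), version()) -> ok | bad_version.
check_version(Version, Version) -> ok;
check_version(_Requested, _Local) -> bad_version.

%% Hypothetical helper: choose what gets hashed for a given tree version.
%% Version 0 uses only the vclock; the legacy branch (assumption) uses
%% the whole object.
-spec hash_input(version(), {VClock :: term(), Object :: term()}) -> term().
hash_input(0, {VClock, _Object}) -> VClock;
hash_input(undefined, {_VClock, Object}) -> Object.
```

With these definitions, `check_version(0, 0)` and `check_version(undefined, undefined)` return `ok`, while a mixed request such as `check_version(undefined, 0)` returns `bad_version`, mirroring the new error described in the commits.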
Upgrading is completed automatically for the user. The upgrade uses a combination of riak_core_capability with a new riak_kv variable object_hash_version and logic within the riak_kv_entropy_manager to intelligently upgrade hashtrees once all conditions have been met. The upgrade flow is as follows:
upgrade atom included in the start opts.
Open areas for discussion: