kvs: [cleanup] improve isolation in internal cache #1274
Codecov Report
@@ Coverage Diff @@
## master #1274 +/- ##
==========================================
+ Coverage 77.88% 77.94% +0.06%
==========================================
Files 154 154
Lines 29124 29097 -27
==========================================
- Hits 22684 22681 -3
+ Misses 6440 6416 -24
src/modules/kvs/cache.c
Outdated
if (!(cpy = malloc (len)))
    return -1;
memcpy (cpy, data, len);

nit, lingering empty line
src/modules/kvs/cache.h
Outdated
* that alrdady has data stored.
* to set new data in a cache entry will silently succeed.
* A treeobj object passed to cache_entry_set_treeobj() will be
* json_decref()'d for a cache entry that alrdady has data stored.
typo "already"
src/modules/kvs/cache.h
Outdated
* cache_entry_set_treeobj() will be json_decref()'d for a cache entry
* that alrdady has data stored.
* to set new data in a cache entry will silently succeed.
* A treeobj object passed to cache_entry_set_treeobj() will be
Is this correct "A treeobj object passed to cache_entry_set_treeobj() will be json_decref()'d for a cache entry that alrdady has data stored."? Doesn't seem to be the case anymore.
src/modules/kvs/test/cache.c
Outdated
data = NULL;

ok (cache_entry_set_raw (e, NULL, 0) < 0
ok (cache_entry_set_raw (e, data, 4) < 0
Actually, I think we want to keep the cache_entry_set_raw (e, NULL, 0) test, and also add cache_entry_set_raw (e, data, 4) as a new test to detect EBADE. Both are good corner case catches.
@garlick overall looks good, lots of good cleanup. Just some nits I found along the way.
Thanks - here come a few commits to address your issues. I'll squash once we're done here.
I went ahead and squashed those fixes. Hope that's OK...
LGTM, I'm not entirely sure why the merge button isn't green though. Everything seems to have passed. Perhaps something buggy in travis. I'm going to restart the builder.
hmmm, travis and github don't seem synced up, it's really slow. Will merge whenever that's resolved.
I restarted one builder that got stuck in the wreck tests (or maybe it was running slow and only got that far - hard to tell).
Uh, it looks like I at least missed a comment in
ahhh, just like you did in
We can eliminate
Still need to cull some cache unit tests but wanted your opinion @chu11 on whether dropping
src/modules/kvs/kvs.c
Outdated
    }
}

if (!json_is_null (rootdir))
I think it's important to leave a comment here indicating that if an error occurs in prime_cache_with_rootdir(), it's no big deal. We still set the setroot and things are generally fine.
I did make the function return void and put a note about failure being OK over the function. More blatant warning needed you think?
yeah, just a bit more blatant is what I'm thinking.
Good job on cleaning up Overall, I think trying to get rid of |
src/modules/kvs/kvs.c
Outdated
void *data = NULL;
int len;

if (treeobj_validate (rootdir) < 0 || treeobj_is_dir (rootdir)) {
!treeobj_is_dir()?
Just ensuring that the "rootdir" really is a dir, not some other object. Unlikely since the master would have already checked that. Drop?
I was thinking it should be the opposite logic. If it is not a dir, then error out. Right now it errors if it is a dir.
Oh, reverse logic, got it!
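For readers following the thread, the reversed condition can be illustrated with a small stand-alone sketch. The `toy_obj` type and the `toy_validate()`, `toy_is_dir()`, and `check_rootdir()` helpers are hypothetical stand-ins invented here, not the real treeobj API; the point is only that the check should fail when the object is not a directory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a treeobj; the real code inspects a json_t. */
struct toy_obj {
    bool well_formed;
    bool is_dir;
};

/* Stand-in for a validation call: 0 if the object is usable, -1 if not. */
static int toy_validate (const struct toy_obj *o)
{
    return (o && o->well_formed) ? 0 : -1;
}

/* Stand-in for a directory test; callers must validate first. */
static bool toy_is_dir (const struct toy_obj *o)
{
    assert (o != NULL);
    return o->is_dir;
}

/* Returns 0 if 'o' is acceptable as a root directory, -1 otherwise.
 * Note the negation: a non-directory is the error case. */
int check_rootdir (const struct toy_obj *o)
{
    if (toy_validate (o) < 0 || !toy_is_dir (o))
        return -1;
    return 0;
}
```

Without the `!`, a valid directory would be rejected and a valid non-directory accepted, which is the inversion pointed out above.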
Problem: cache_entry_set_raw() and cache_entry_set_treeobj() take ownership of a data/json_t pointer allocated in the caller. The caller must be trusted not to free or modify the data afterward, or the cache will be corrupted. Copy the data, making it easier to guarantee that the result is not modified or freed by the original caller. In the case of json_t, cache the encoded data but skip caching the original object. The object will be decoded from the data (and then cached) upon first access. Update users in kvs.c and commit.c. Update unit tests.
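A minimal sketch of the copy-on-set idea, assuming a simplified entry layout. `struct entry` and `entry_set_raw_copy()` are hypothetical names for illustration and do not reproduce the real cache.c code; in particular, replacing an earlier copy is a simplification, and the real function may treat an entry that already holds data differently.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a cache entry; field names are assumptions. */
struct entry {
    void *data;     /* private copy owned by the entry */
    int len;
    int valid;
};

/* Store a private copy of 'data' in the entry.  The caller keeps
 * ownership of its own buffer and may free or modify it afterward.
 * Returns 0 on success, -1 with errno set on failure. */
int entry_set_raw_copy (struct entry *e, const void *data, int len)
{
    void *cpy = NULL;

    if (!e || len < 0 || (!data && len > 0)) {
        errno = EINVAL;
        return -1;
    }
    if (len > 0) {
        if (!(cpy = malloc (len))) {
            errno = ENOMEM;
            return -1;
        }
        memcpy (cpy, data, len);
    }
    free (e->data);     /* simplification: replace any earlier copy */
    e->data = cpy;
    e->len = len;
    e->valid = 1;
    assert (e->len == 0 || e->data != NULL);
    return 0;
}
```

Because the entry only ever points at memory it allocated itself, a later write or free in the caller cannot corrupt what the cache holds.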
Problem: treeobj that is to be stored in the cache is encoded once to calculate its hash, and again to be stored in raw form. Just encode the treeobj in raw form, then calculate its hash and store it raw.
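The encode-once ordering might look roughly like the sketch below. Everything here is a stand-in chosen for illustration: `toy_encode()` plays the role of the encoder, `toy_hash()` (a 32-bit FNV-1a) replaces the real blobref hash, and `struct blob` and `make_blob()` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Counts encoder invocations so the "once" property can be checked. */
static int encode_calls = 0;

/* Toy encoder: returns a malloc'd string, or NULL on failure. */
static char *toy_encode (int value)
{
    char *s = malloc (32);
    encode_calls++;
    if (s)
        snprintf (s, 32, "{\"v\":%d}", value);
    return s;
}

/* 32-bit FNV-1a, used only as a placeholder for the real hash. */
static uint32_t toy_hash (const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

struct blob {
    uint32_t key;   /* hash of the encoded bytes */
    char *data;     /* encoded bytes, owned by the blob */
    size_t len;
};

/* Encode once, derive the key from those same bytes, and keep them. */
int make_blob (int value, struct blob *out)
{
    char *enc = toy_encode (value);
    if (!enc)
        return -1;
    out->len = strlen (enc);
    out->key = toy_hash (enc, out->len);
    out->data = enc;
    return 0;
}
```

Hashing the exact buffer that gets stored also guarantees that the key and the stored bytes agree, which two separate encodings only guarantee if the encoder is deterministic.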
Problem: rootdir that is to be stored in the cache is encoded once to calculate its hash, and again to be stored in raw form. Change store_initial_rootdir() to use treeobj_encode(), then cache_entry_set_raw().
Problem: the name "rootdir" is used to refer to the root blobref, which is easily confused with the root directory object. Rename "rootdir" to "rootref" where appropriate. Create a 'struct kvsroot' to hold the root blobref and sequence number, since these are tightly coupled and there will need to be more than one pair for when planned support for multiple namespaces is implemented.
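As a rough picture of keeping the blobref and sequence number together, here is a sketch. The struct name, field names, buffer size, and the `kvsroot_sketch_set()` helper are all assumptions made for illustration; the actual 'struct kvsroot' may differ.

```c
#include <assert.h>
#include <string.h>

#define ROOTREF_MAX 72   /* assumed buffer size for this sketch */

/* Root blobref and sequence number kept together, so one instance can
 * exist per namespace. */
struct kvsroot_sketch {
    char ref[ROOTREF_MAX];
    int seq;
};

/* Update both fields as a unit.  Returns 0 on success, or -1 (leaving
 * the record untouched) if an argument is NULL or 'ref' does not fit. */
int kvsroot_sketch_set (struct kvsroot_sketch *root, const char *ref, int seq)
{
    size_t n;

    if (!root || !ref)
        return -1;
    n = strlen (ref);
    if (n >= sizeof (root->ref))
        return -1;
    memcpy (root->ref, ref, n + 1);   /* include the terminating NUL */
    root->seq = seq;
    return 0;
}
```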
Problem: the kvs.sync, kvs.getroot, kvs.setroot (rpc), and kvs.setroot (event) protocols use naming inconsistent with current norms in the code. Change "rootdir" to "rootref", and "rootdirval" to "rootdir" where appropriate.
@chu11: this look OK (last 2 commits)? If so I'll squash.
yup, squash away
OK, those are squashed now.
Problem: the important kvs.setroot event handler is a bit complicated by the fact that the directory object is optionally provided in the event message as an optimization. Split the optimization off to a separate function, don't try too hard to handle the case of an existing but dirty/invalid cache entry, and recompute the hash after encoding the directory rather than trusting the hash received in the message to match. This eliminates the last user of cache_entry_set_treeobj().
This function no longer has any users except in tests. Replicate it as a convenience function in the tests.
Drop tests that poke at cache_entry_set_treeobj() which is now gone, or expect cache entries containing raw versus treeobj data to behave differently in ways that are no longer possible.
As discussed in #1264, cache_entry_set_raw() and cache_entry_set_treeobj() take ownership of a data/json_t pointer allocated in the caller. The caller must be trusted not to free or modify the data afterward, or the cache will be corrupted. This seems a bit unsafe.

Copy the data, increasing isolation between the cache and its callers.

In the treeobj case, avoid caching the json_t object directly, since it is hard to ensure that the caller won't change it later. We could json_deep_copy() it, but it is perhaps cheaper to skip caching it here, and instead allow the object to be regenerated from the encoded data on first use (if it gets used at all).

Plus one small optimization: avoid an extra call to treeobj_encode in the commit path, where we had to first encode the object to compute its hash key, then again to store it. Just encode it once, then compute the hash key, then store the raw data.
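The "regenerate on first use" approach for the treeobj case can be sketched as follows. This is an illustration under stated assumptions, not the cache.c implementation: `struct lazy_entry`, `lazy_entry_set()`, and `lazy_entry_get()` are hypothetical, and a `long` parsed with `strtol()` stands in for a decoded json_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry: a private copy of the encoded bytes plus a decoded
 * value that is filled in lazily. */
struct lazy_entry {
    char *raw;          /* NUL-terminated encoded data, owned by entry */
    bool decoded;       /* true once 'value' has been computed */
    long value;
    int decode_calls;   /* instrumentation for this sketch only */
};

/* Store a copy of the encoded form; nothing is decoded yet. */
int lazy_entry_set (struct lazy_entry *e, const char *encoded)
{
    size_t n = strlen (encoded) + 1;
    char *cpy = malloc (n);

    if (!cpy)
        return -1;
    memcpy (cpy, encoded, n);
    free (e->raw);
    e->raw = cpy;
    e->decoded = false;
    return 0;
}

/* Decode on first access and remember the result for later calls. */
int lazy_entry_get (struct lazy_entry *e, long *value)
{
    if (!e->raw)
        return -1;
    if (!e->decoded) {
        char *end;
        long v = strtol (e->raw, &end, 10);

        e->decode_calls++;
        if (end == e->raw || *end != '\0')
            return -1;      /* encoded data did not parse */
        e->value = v;
        e->decoded = true;
    }
    *value = e->value;
    return 0;
}
```

Entries that are stored but never read pay no decode cost at all, and the decoded object is built from the cache's own copy, so later changes to the caller's object cannot affect it.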