KVS oom() and xzmalloc() cleanup #1124
Codecov Report
@@ Coverage Diff @@
## master #1124 +/- ##
=========================================
- Coverage 78.07% 77.57% -0.5%
=========================================
Files 184 184
Lines 31430 31668 +238
=========================================
+ Hits 24538 24567 +29
- Misses 6892 7101 +209
hmmm. The 40.27% diff target isn't too surprising. But the project coverage going down 0.53% seems rather high. Then again, that's only 156 lines of code, which doesn't seem too alarming given all of the ENOMEM paths. I'll see if I can tweak anything and get the coverage better.
Yeah, the unchecked paths mostly seem like error paths, and it is very difficult to get coverage there. However, it would seem like some of these error paths would be really good to check, and for example |
I just peeked at this real quick to check on the coverage report, but I really appreciate some of the nice code comments you added in tricky sections. Really helpful for someone browsing the code. Thanks! |
Nice work @chu11 as usual! I haven't looked in detail at how it might apply here but we do have the debug hooks for modules that are set with |
@grondo Yeah, the |
pushed a refactor creating a common function |
I added a few minor comments but on the whole this is a really positive set of changes.
src/modules/kvs/kvs.c
Outdated
goto done;
}
if (!blobref || blobref[blobref_size - 1] != '\0') {
errno = EPROTO;
flux_log_error (ctx->h, "%s", __FUNCTION__);
flux_log_error (ctx->h, "%s: EPROTO", __FUNCTION__);
This is going to print "content_store_get: EPROTO: Protocol Error". Maybe change "EPROTO" to "invalid blobref"?
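To illustrate the point in the comment above: the logging call appends the text for the current errno after the caller's message, so repeating the errno name in the format string is redundant. The following is only a rough sketch; `format_error()` is a hypothetical stand-in that writes into a buffer and is not the flux-core logging API.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an errno-aware log call: formats the caller's
 * message, then appends ": " and strerror (errnum).  It writes into buf
 * rather than a logger so the result is easy to inspect.
 * Returns 0 on success, -1 if the message does not fit. */
int format_error (char *buf, size_t len, int errnum, const char *fmt, ...)
{
    va_list ap;
    int n, m;

    va_start (ap, fmt);
    n = vsnprintf (buf, len, fmt, ap);
    va_end (ap);
    if (n < 0 || (size_t)n >= len)
        return -1;
    m = snprintf (buf + n, len - (size_t)n, ": %s", strerror (errnum));
    if (m < 0 || (size_t)m >= len - (size_t)n)
        return -1;
    return 0;
}
```

With a format of "%s: invalid blobref", the resulting message carries the function name, the specific problem, and then the system's description of EPROTO, without naming the errno twice.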
src/modules/kvs/kvs_util.c
Outdated
oom ();
if (!(s = json_dumps (tmp, flags)))
oom ();
if (!(tmp = json_null ())) {
Or maybe just replace json_null() and json_dumps() calls with strdup ("null")?
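A minimal sketch of that suggestion, assuming the goal is only to produce the string "null" when there is no object to encode. `dumps_or_null()` and its `encoded` parameter are hypothetical and stand in for the real encoder, which is not reproduced here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: 'encoded' stands in for the serialized form of an
 * object, or NULL when there is no object.  In the NULL case, copy the
 * literal "null" directly instead of building and dumping a JSON null.
 * Returns a malloc'd string the caller must free, or NULL with errno set. */
char *dumps_or_null (const char *encoded)
{
    char *s = strdup (encoded ? encoded : "null");

    if (!s)
        errno = ENOMEM;
    return s;
}
```

This leaves a single allocation to check, so there is only one ENOMEM path in the NULL-object case.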
src/modules/kvs/commit.c
Outdated
bool rc = strlen (s) > BLOBREF_MAX_STRING_SIZE;
free (s);
return rc;
return rc ? 1 : 0;
Maybe just make rc an int and set it before the free, then return it directly?
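One way that could look, as a sketch only: `value_is_big()` and `MAX_STRING_SIZE` are hypothetical names, and strdup() stands in for the real encoding step.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for BLOBREF_MAX_STRING_SIZE. */
#define MAX_STRING_SIZE 72

/* Returns 1 if the encoded value is too big, 0 if it is not,
 * and -1 on error with errno set. */
int value_is_big (const char *value)
{
    char *s;
    int rc;

    if (!(s = strdup (value))) {   /* strdup stands in for the encoder */
        errno = ENOMEM;
        return -1;
    }
    rc = strlen (s) > MAX_STRING_SIZE;   /* comparison yields 0 or 1 */
    free (s);
    return rc;
}
```

Because a C comparison already evaluates to 0 or 1, the int can be returned directly without the `rc ? 1 : 0` conversion.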
src/modules/kvs/commit.c
Outdated
bool rc = strlen (s) > BLOBREF_MAX_STRING_SIZE;
free (s);
return rc;
return rc ? 1 : 0;
Suggestion: make new utility function that calls kvs_util_json_dumps() internally:
// returns 0 on success, -1 on failure
int kvs_util_json_encoded_size (json_t *o, size_t *size);
then below, use it to set a new variable size_t value_encoded_size, and test it for > BLOBREF_MAX_STRING_SIZE right there in commit_unroll().
Not a big deal, but might be a little clearer.
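A rough sketch of the shape being suggested, without depending on jansson: `encode()` is a hypothetical stand-in for kvs_util_json_dumps(), and `encoded_size()` mirrors the proposed prototype with a string in place of a json_t. Whether the real function would count the terminating NUL is an assumption left open here; this sketch does not count it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the real encoder: returns a malloc'd
 * encoding of 'value', or NULL on failure. */
char *encode (const char *value)
{
    return strdup (value);
}

/* Returns 0 on success with *size set to the encoded length,
 * -1 on failure with errno set. */
int encoded_size (const char *value, size_t *size)
{
    char *s;

    if (!value || !size) {
        errno = EINVAL;
        return -1;
    }
    if (!(s = encode (value))) {
        errno = ENOMEM;
        return -1;
    }
    *size = strlen (s);
    free (s);
    return 0;
}
```

The caller can then compare the returned size against its own limit, which keeps the policy (what counts as "too big") at the call site and the error path separate from the true/false answer.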
Ahhh, I see what you mean. Yeah, that'll probably clarify things a bit. But more importantly it'll use kvs_util_json_dumps(), which I had missed earlier.
src/modules/kvs/kvs.c
Outdated
/* Return true if load successful, false if stalling */
static bool load (kvs_ctx_t *ctx, const href_t ref, wait_t *wait, json_t **op)
/* Return 1 if load successful, 0 if stalling, -1 on error */
static int load (kvs_ctx_t *ctx, const href_t ref, wait_t *wait, json_t **op)
Would it be clearer to add a stalled parameter to the function and just return 0 on success, -1 on failure?
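The alternative signature might be sketched as follows. This is an illustration under assumptions: `struct entry` is a toy cache entry, and this `load()` is not the kvs.c function, only the same return convention applied to a simplified case.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy cache entry: 'valid' means the value has already been loaded. */
struct entry {
    bool valid;
    const char *value;
};

/* Returns 0 on success, -1 on error with errno set.
 * On success, *stalled is true if the value is not yet available
 * (the caller should wait), or false if *valuep has been filled in. */
int load (const struct entry *e, const char **valuep, bool *stalled)
{
    if (!e || !valuep || !stalled) {
        errno = EINVAL;
        return -1;
    }
    if (!e->valid) {
        *stalled = true;
        return 0;
    }
    *valuep = e->value;
    *stalled = false;
    return 0;
}
```

Compared with a 1/0/-1 return, this keeps the return value purely about success or failure, at the cost of one more parameter at every call site.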
@@ -253,10 +254,13 @@ static int commit_unroll (commit_t *c, int current_epoch, json_t *dir)
goto done;
comment block describing return value for commit_unroll() needs an update (still refers to boolean return).
oops, this is all the way back from my commit API refactoring. Will add a cleanup.
src/modules/kvs/commit.c
Outdated
@@ -325,9 +332,11 @@ static int commit_link_dirent (commit_t *c, int current_epoch,
errno = ENOMEM;
goto done;
}
if (json_object_set_new (dir,
Not sure if this is workable, but would it be easier to just dispense with j_dirent_create() and call json_pack() in all the places where it's used? It doesn't really do very much, and it might be easier to keep track of refcounts and error behavior if we were more direct. Ignore if that's out of scope here.
Just pushed fixes for all the tiny things @garlick found. If it looks good, I can squash and then rebase on master.
Those changes look good! |
3df439d to 2df537e
just pushed a rebased + squashed tree. Didn't squash every cleanup patch, as some of them seemed more like fixes/cleanup from the prior PRs, so they should stay separate. |
hmmm, getting valgrind errors. But I only squashed and rebased compared to the last push. Wonder what I did. |
Hm, that is a bit scary. I hope it is not a strange false positive. |
I think I may have just read the travis results wrong. Normally "coverage/coveralls" is the first one listed and "continuous-integration/travis-ci" is second. I saw the red x and green checkmark and assumed travis had passed and coverage was bad. Think I found the mem-leak I introduced, and perhaps another minor cleanup I can fix. |
2df537e to cb64774
re-pushed, i forgot to free memory in the new |
Add flux log error messages in places there should be log messages. Fix up flux log error message formats for consistency.
Move log function above asserts, so log message will definitely occur.
Begin refactoring removal of oom () calls. Begin by removing oom () calls where it is easy to simply replace with error responses. i.e. functions that already return errors and callers check for errors, so slipping in a new return of ENOMEM is relatively simple. Remove include of oom.h where appropriate.
Begin refactoring away functions that will oom() and exit on ENOMEM errors. Replace calls of xstrdup(), xzmalloc(), and xasprintf() with strdup(), calloc(), and asprintf() respectively and return ENOMEM to callers appropriately. Begin by replacing where it is easy to simply replace with error responses, i.e. functions that already return errors and callers check for errors, so slipping in a new return of ENOMEM is relatively simple. Remove header include of xzmalloc.h where appropriate.
kvs_util_json_dumps() and kvs_util_json_hash() can now return ENOMEM as an error. Adjust callers of these functions appropriately to handle error.
Have cache_get_stats() and cache_expire_entries() return -1 on error. Adjust callers appropriately to handle potential error.
When putting a dirty cache entry in the item callback list, do not oom(), instead cleanup the dirty cache entry and return ENOMEM appropriately.
Have fileval_big() return -1 on error, 0 on false, 1 on true. Adjust callers appropriately to handle potential error.
Refactor load() function so it can return an error instead of only returning true/false on load/stall.
Have wait_queue_create() return NULL on ENOMEM. The potential for error in wait_queue_create() cascades to cache_entry_wait_notdirty() and cache_entry_wait_valid(), which now can return errors. Update all callers of cache_entry_wait_notdirty() and cache_entry_wait_valid() to handle potential errors. Refactor cleanup_dirty_cache_entry() in kvs.c and commit.c into commit_cleanup_dirty_cache_entry() common function. Update unit tests for cache_entry_wait_notdirty() and cache_entry_wait_valid(). Add unit test for commit_cleanup_dirty_cache_entry().
Do not oom() on errors in j_dirent_create(), return ENOMEM appropriately. Fix all callers of j_dirent_create() to check for error and handle error appropriately.
With refactor / change of j_dirent_create(), now remove oom() call from json_object_set_new() calls and return error appropriately.
Call calloc() over xzmalloc() in cache_entry_create() and return ENOMEM appropriately. Update all callers to handle error appropriately.
Replace kvs_util_json_copydir() with json_copy() and check for ENOMEM errors appropriately. Remove unnecessary kvs_util.h includes. Update unit tests appropriately.
Return -1 on error from wait_addqueue() and 0 on success. Update all callers to check for error appropriately. Update unit tests appropriately.
Handle errors appropriately and update callers to handle errors.
Add missing unit test for removing object from cache entry
Instead of using two jansson calls, just strdup the string "null".
Add new kvs_util_json_encoded_size() function to determine the size of an object, consistent with the format used by kvs_util_json_dumps(). Use this function instead of fileval_big() in commit API. Add unit tests appropriately.
cb64774 to e464830
rebased on master and re-pushed |
Finally through travis! Ready @chu11? |
yup! |
Like other refactorings that have been going on in flux-core, this one removes a large number of oom() calls and functions that call oom() (notably xzmalloc() and xstrdup()). Instead, ENOMEM is detected and properly returned as an error, and callers check for the error, log it, etc.
A few oom()s remain (wait_runqueue() and the merging ops functions). Those will require a bit more re-architecture and logic updates, so I will put those in another PR. The changes in this PR are relatively more grunted-out fixes without much thought except for cleanup path handling.
I suspect the coverage on this diff will be bad. I'd be surprised if it was even greater than 50% coverage, as almost the entirety of it is handling ENOMEM errors and the cascading checks for functions that can now return an ENOMEM error.
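As a generic illustration of the pattern described above (detect ENOMEM, return it, and have the caller check and log), here is a minimal sketch. `struct item`, `item_create()`, `item_destroy()`, and `item_name_length()` are hypothetical names for the example and do not appear in the KVS module.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct item {
    char *name;
    int *counts;
};

/* Free an item; safe on NULL or partially built items, preserves errno. */
void item_destroy (struct item *it)
{
    if (it) {
        int saved_errno = errno;
        free (it->name);
        free (it->counts);
        free (it);
        errno = saved_errno;
    }
}

/* Returns a new item, or NULL with errno = ENOMEM.  Each allocation is
 * checked rather than exiting the process on failure.
 * Assumes ncounts > 0. */
struct item *item_create (const char *name, size_t ncounts)
{
    struct item *it;

    if (!(it = calloc (1, sizeof (*it))))
        goto nomem;
    if (!(it->name = strdup (name)))
        goto nomem;
    if (!(it->counts = calloc (ncounts, sizeof (int))))
        goto nomem;
    return it;
nomem:
    item_destroy (it);
    errno = ENOMEM;
    return NULL;
}

/* Caller side: check for the error, log it, and pass it up as -1. */
int item_name_length (const char *name, size_t *len)
{
    struct item *it;

    if (!(it = item_create (name, 4))) {
        fprintf (stderr, "%s: %s\n", __func__, strerror (errno));
        return -1;
    }
    *len = strlen (it->name);
    item_destroy (it);
    return 0;
}
```

The cascading cost mentioned in the description shows up in `item_name_length()`: once `item_create()` can fail, every caller gains an error branch, and those branches are hard to exercise in tests, which is one reason diff coverage tends to be low for this kind of change.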