KVS: Support valref pointing to zero length blob objects #1237

chu11 · 2017-10-13T00:41:57Z

This series of patches is to support valref treeobjects pointing to zero-length blobref data.

This seems like a lot of code for a pretty dumb corner case, but it came down to updating the KVS cache API to not assume non-NULL data pointer means valid and NULL data pointer is invalid. A flag was placed into the KVS cache entry to handle this, but it beget a number of API style changes.

For starters, cache_entry_get_raw () had to be modified so that a NULL data buffer returned wasn't automatically assumed to be error. This is an obvious refactoring that was needed. The rest are more subtle.

Under the old API:

cache_entry_create_json (NULL) - creates an empty/invalid cache entry
cache_entry_create_raw (NULL, 0) - creates a valid cache entry???

cache_entry_set_json (hp, NULL) - clears json from the cache entry
cache_entry_set_raw (hp, NULL, 0) - should do ...???

So for consistency, cache_entry_create_json() and cache_entry_set_json() now only take valid input (i.e. non-NULL json objects). The raw equivalents can take NULL & 0, b/c that's just
a zero length blobref.

New functions cache_entry_create_json_empty() and cache_entry_create_raw_empty() were created to create "empty" entries.

In order to clear data from an entry, cache_entry_clear_data() was added. The cache_entry_set_json() and cache_entry_set_raw() functions can no longer clear data from an entry by passing in NULL.

As a consequence, it's worth noting that cache_entry_set_raw() will also call wait_runqueue() when a zero-byte length buffer is set into the entry. This was not done before.

Debate on implementation:

Tiny part of me dislikes the function name "empty". Can discuss alternates. I considered using a flag type variable to say "I have data", but that seemed even uglier and more confusing.

I also considered removing the "empty" functions and supporting a cache_entry_set_type() function, to say these cache entries only support json or raw (i.e. user would call cache_entry_create(), then cache_entry_set_type()). But that's two function calls instead of one. I nixed that idea.

Also, for one error case I choose errno EBADE. Right one??

codecov-io · 2017-10-13T01:12:12Z

Codecov Report

Merging #1237 into master will increase coverage by 0.02%.
The diff coverage is 89.23%.

@@            Coverage Diff             @@
##           master    #1237      +/-   ##
==========================================
+ Coverage   78.13%   78.15%   +0.02%     
==========================================
  Files         154      154              
  Lines       28680    28702      +22     
==========================================
+ Hits        22408    22431      +23     
+ Misses       6272     6271       -1

Impacted Files	Coverage Δ
src/modules/kvs/commit.c	`76.68% <100%> (ø)`	⬆️
src/modules/kvs/lookup.c	`83.88% <100%> (ø)`	⬆️
src/modules/kvs/kvs.c	`64.4% <64.28%> (+0.66%)`	⬆️
src/modules/kvs/cache.c	`92.36% <95.91%> (+0.52%)`	⬆️
src/modules/connector-local/local.c	`77.38% <0%> (-1.31%)`	⬇️
src/common/libflux/mrpc.c	`85.49% <0%> (-1.18%)`	⬇️
src/common/libflux/response.c	`84.55% <0%> (-0.82%)`	⬇️
src/common/libutil/base64.c	`95.07% <0%> (-0.71%)`	⬇️
src/common/libflux/message.c	`81.6% <0%> (+0.11%)`	⬆️
... and 4 more

coveralls · 2017-10-13T01:12:33Z

Coverage decreased (-0.03%) to 78.689% when pulling 954ff9d on chu11:issue1232 into 1893b92 on flux-framework:master.

garlick · 2017-10-13T01:21:55Z

I'm going to be travelling tomorrow so I may not get to review this for a bit. One quick question though, do the cache_create() functions really need to be typed? I.e. does an invalid cache entry need a type?

chu11 · 2017-10-13T15:44:01Z

Are you referring to the fact I have a "NONE" data type? I suppose this is mostly for internal reasons, to signify the user has not yet assigned "JSON" or "RAW" data, so the cache entry doesn't know. I suppose that could be masked/hidden internally.

Thought, is "type" perhaps the wrong word to use?

chu11 · 2017-10-13T15:46:25Z

It occurred to me last night that instead of creating cache_entry_create_json_empty() and cache_entry_create_raw_empty(), I could have modified cache_entry_create() to always create an "empty" entry, but the user can now pass in a data type to it. So

cache_entry_create(NONE) - means I don't know what this entry will hold
cache_entry_create(JSON) - says I want this cache entry for json, but I don't have data yet
cache_entry_create(RAW) - says I want this cache entry for raw, but I don't have data yet

probably better than adding two new functions.

Well, I'll let the review happen on what's there for now. I can change this later.

Also, "NONE" maybe should be changed to "UNKNOWN" ... TBD

garlick · 2017-10-13T15:54:42Z

I was just thinking that one creation function could set the type to none/unknown, and it could get its type assigned when it becomes valid. Maybe that doesn't fit - I didn't look too closely. Sorry, getting packed up today so I'll be offline most of today, but I'll try to check in later.

chu11 · 2017-10-13T15:58:10Z

I like this idea enough I"ll go ahead and change it.

chu11 · 2017-10-13T17:40:47Z

As for your question, "it could get its type assigned when it becomes valid". Basically assigning a type (or none) is an option for the caller. In some cases, assigning the type ahead of time is effectively a flag saying, "Hey, this is what I'm expecting". I use this "flag" so that content-loading is aware what to expect.

chu11 · 2017-10-13T18:41:01Z

Red-did the cache_entry_create() function to take a data type parameter instead and re-pushed entire branch. Definitely feels better this way.

coveralls · 2017-10-13T18:57:41Z

Coverage decreased (-0.003%) to 78.709% when pulling 0797b8b on chu11:issue1232 into 2bddc2e on flux-framework:master.

chu11 · 2017-10-13T21:18:44Z

Looks like I hit a #1169 and a write error. re-run builds.

garlick · 2017-10-16T20:50:41Z

This looks fine, just needs a rebase.

It does seem like the cache interface could be simplified, but maybe not a topic for this PR. I'll open an issue to discuss it.

cache_entry_create() now takes a cache_data_type_t parameter, allowing to user to create a cache entry without any data attached to it, but at the same time indicate the type of data expected for the cache entry to store. This is an alternate to passing NULL to the function cache_entry_create_json ().

Cache entry creation API has slightly different logic now. Instead of creating an "empty" entry by setting inputs to NULL (i.e. cache_entry_create_json(NULL) or cache_entry_create_raw(NULL, 0)) these create functions now expect correct input. Callers wishing to create empty entries without data shall call cache_entry_create(). As a side effect, this now means that cache_entry_create_raw() now is not ambiguous about setting zero length data. Setting cache_entry_create_raw(NULL, 0) creates a cache entry with a zero length of data. Where as before, it was ambiguous if it was an empty entry or an entry with zero length of data. Update callers of these functions. Update/add unit tests.

Refactor cache_entry_get_raw to return data buffer and length as in/out parameters, returning integer to indicate success/failure. This is so the data buffer, if NULL, can be returned to the user.

Add unit tests appropriately.

Alter usage of cache_entry_set_json(). If user wishes to clear data, the user should use cache_entry_clear_data(). A NULL input to cache_entry_set_json() will now result in an error. Update code accordingly for change.

In the internal cache API, add a data_valid flag which indicates if the user has set data or not. This flag is used instead of checking data == NULL for validity, as checking the data pointer for NULL would not allow zero length content-blobs to be stored in the cache. A major fallout of from this addition is the change to the cache_entry_set_raw() function. It was previously unclear if passing NULL & length 0 to cache_entry_set_raw() indicated if you were setting a zero length data or trying to clear prior data. Now it is made clear that this is setting zero length data and clearing data is through cache_entry_clear_data(). In addition, calling cache_entry_set_raw() with zero length data can result in wait_runqueue() being run on hp->waitlist_valid. Added new unit tests for coverage.

Fixes flux-framework#1232

chu11 · 2017-10-16T20:56:50Z

just re-pushed w/ a rebase

chu11 · 2017-10-16T21:59:32Z

restarted some builds, hit a write error, #1169, and one that timed out for some reason. travis seems really slow right now

coveralls · 2017-10-16T22:25:34Z

Coverage increased (+0.004%) to 78.716% when pulling 598fd53 on chu11:issue1232 into 2bddc2e on flux-framework:master.

chu11 force-pushed the issue1232 branch from 954ff9d to 0797b8b Compare October 13, 2017 18:37

chu11 added 8 commits October 16, 2017 13:53

t/kvs: Add valref to zero-size content test

e7a01fc

modules/kvs: Refactor cache_entry_get_raw()

c78ae62

Refactor cache_entry_get_raw to return data buffer and length as in/out parameters, returning integer to indicate success/failure. This is so the data buffer, if NULL, can be returned to the user.

modules/kvs: Add cache_entry_clear_data()

1d927b5

Add unit tests appropriately.

modules/kvs: Alter cache_entry_set_json() usage

b4b6114

Alter usage of cache_entry_set_json(). If user wishes to clear data, the user should use cache_entry_clear_data(). A NULL input to cache_entry_set_json() will now result in an error. Update code accordingly for change.

modules/kvs: Support zero length raw data

598fd53

Fixes flux-framework#1232

chu11 force-pushed the issue1232 branch from 0797b8b to 598fd53 Compare October 16, 2017 20:56

garlick merged commit 9f03dda into flux-framework:master Oct 16, 2017

grondo mentioned this pull request May 10, 2018

0.9.0 Release #1479

Closed

chu11 deleted the issue1232 branch June 5, 2021 17:01

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

KVS: Support valref pointing to zero length blob objects #1237

KVS: Support valref pointing to zero length blob objects #1237

chu11 commented Oct 13, 2017

codecov-io commented Oct 13, 2017 •

edited

Loading

coveralls commented Oct 13, 2017

garlick commented Oct 13, 2017

chu11 commented Oct 13, 2017

chu11 commented Oct 13, 2017 •

edited

Loading

garlick commented Oct 13, 2017 •

edited

Loading

chu11 commented Oct 13, 2017

chu11 commented Oct 13, 2017

chu11 commented Oct 13, 2017

coveralls commented Oct 13, 2017

chu11 commented Oct 13, 2017

garlick commented Oct 16, 2017

chu11 commented Oct 16, 2017

chu11 commented Oct 16, 2017

coveralls commented Oct 16, 2017

KVS: Support valref pointing to zero length blob objects #1237

KVS: Support valref pointing to zero length blob objects #1237

Conversation

chu11 commented Oct 13, 2017

codecov-io commented Oct 13, 2017 • edited Loading

Codecov Report

coveralls commented Oct 13, 2017

garlick commented Oct 13, 2017

chu11 commented Oct 13, 2017

chu11 commented Oct 13, 2017 • edited Loading

garlick commented Oct 13, 2017 • edited Loading

chu11 commented Oct 13, 2017

chu11 commented Oct 13, 2017

chu11 commented Oct 13, 2017

coveralls commented Oct 13, 2017

chu11 commented Oct 13, 2017

garlick commented Oct 16, 2017

chu11 commented Oct 16, 2017

chu11 commented Oct 16, 2017

coveralls commented Oct 16, 2017

codecov-io commented Oct 13, 2017 •

edited

Loading

chu11 commented Oct 13, 2017 •

edited

Loading

garlick commented Oct 13, 2017 •

edited

Loading