
kvs: implement lightweight key watch and API #1643

Merged
merged 21 commits into flux-framework:master from garlick:kvs_watch2
Sep 12, 2018

Conversation

@garlick garlick (Member) commented Sep 3, 2018

This PR finishes the alternate KVS watch API, based on passing a FLUX_KVS_WATCH flag to flux_kvs_lookup(); and the lightweight kvs-watch service, which only checks watched keys when they are mentioned in a commit. This strategy was discussed in #1556.
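
To illustrate the new API, here is a minimal sketch using the flux_kvs_lookup(3) calls of this era of flux-core (error handling elided; exact signatures may vary by version):

    #include <stdio.h>
    #include <flux/core.h>

    /* Print the current value of 'key' plus subsequent changes,
     * up to 'count' responses total, then cancel the watch.
     */
    void watch_key (flux_t *h, const char *key, int count)
    {
        flux_future_t *f = flux_kvs_lookup (h, FLUX_KVS_WATCH, key);
        const char *value;
        int i;

        for (i = 0; i < count; i++) {
            if (flux_kvs_lookup_get (f, &value) < 0)
                break;              /* e.g. ENOENT, or ENODATA after cancel */
            printf ("%s = %s\n", key, value);
            flux_future_reset (f);  /* re-arm the future for the next response */
        }
        flux_kvs_lookup_cancel (f); /* tell kvs-watch to stop sending responses */
        flux_future_destroy (f);
    }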

N.B. this PR falls short of implementing the "bulk" watch described in #1631 - this is for another PR.

In addition to the API and kvs-watch modifications, the flux kvs get subcommand now has --watch and related options.
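
For example (option semantics per the commit messages below), something like

    flux kvs get --watch --count=5 --label foo

would print the current value of foo plus its next four changes, each prefixed with "foo=", then exit.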

Still TODO: doc updates and sharness test(s).

coveralls commented Sep 3, 2018

Coverage increased (+0.01%) to 79.395% when pulling a6388c1 on garlick:kvs_watch2 into bdf296f on flux-framework:master.

codecov-io commented Sep 3, 2018

Codecov Report

Merging #1643 into master will increase coverage by 0.03%.
The diff coverage is 83.76%.

@@            Coverage Diff             @@
##           master    #1643      +/-   ##
==========================================
+ Coverage   79.36%   79.39%   +0.03%     
==========================================
  Files         179      179              
  Lines       32535    32682     +147     
==========================================
+ Hits        25821    25948     +127     
- Misses       6714     6734      +20
Impacted Files Coverage Δ
src/common/libflux/rpc.c 93.98% <100%> (+0.48%) ⬆️
src/modules/kvs-watch/kvs-watch.c 77.49% <75.47%> (-0.54%) ⬇️
src/cmd/flux-kvs.c 81.57% <87.09%> (+0.22%) ⬆️
src/common/libkvs/kvs_lookup.c 81.59% <92.3%> (+3.12%) ⬆️
src/modules/connector-local/local.c 72.76% <0%> (-1.43%) ⬇️
src/common/libflux/future.c 85.26% <0%> (-0.63%) ⬇️
src/common/libkvs/kvs_txn.c 74.71% <0%> (-0.57%) ⬇️
src/common/libkvs/kvs_watch.c 89.69% <0%> (-0.43%) ⬇️
src/bindings/lua/flux-lua.c 82.15% <0%> (-0.09%) ⬇️
... and 8 more

@chu11 chu11 (Member) commented Sep 4, 2018

looked through with a quick skim and nothing looked too out of place. Will wait for tests and take a deeper look at that time.

@garlick garlick (Member Author) commented Sep 4, 2018

Thanks! I added a couple, which greatly helps coverage, but there are still a bunch of corner cases to check.

@garlick garlick force-pushed the kvs_watch2 branch 6 times, most recently from a41a070 to 93cff71 on September 10, 2018 21:06
@garlick garlick (Member Author) commented Sep 11, 2018

This is probably getting close and could use some review by another pair of eyes.

@chu11 chu11 (Member) left a comment

One high level question. Was the intent to eventually remove the original "KVS watch" in a later PR?

I only mention it b/c there is a behavior difference when watching commits where the value does not change (e.g. flux kvs put a=1 ; flux kvs put a=1). The old kvs watch would not return a change, but the kvs-watch module will.

@@ -96,6 +96,18 @@ static struct optparse_option get_opts[] = {
{ .name = "at", .key = 'a', .has_arg = 1,
.usage = "Lookup relative to RFC 11 snapshot reference",
},
{ .name = "order", .key = 'o', .has_arg = 1,
Member:

I don't think has_arg needs to be 1?

Contributor:

Also, just a minor suggestion to use -s, --sort since many other tools do so

Member:

I'm not sure sort is the right word. What we're looking for is a word for "output one key at a time". Perhaps order is not the best, but not sure what would be better.

Member Author:

Oops - thanks for pointing out the has_arg problem.

The --order option ensures that values are printed in the same order as specified on the command line, as opposed to random order if they are allowed to be fetched in parallel. Maybe --serial-lookup would be a better option name?

@grondo grondo (Contributor) Sep 11, 2018:

Ah, sorry! I was confused by the usage message

      .usage = "For multiple keys, ensure values are output in order",

I still don't understand what the option does exactly, but you are definitely right --sort isn't correct. Sorry for the noise. (On another note, did the flux-kvs(1) manpage need an update here?) (Edit: nevermind, saw your comment on that point below)

Member Author:

The code was changed to be asynchronous, so without the option, keys could be output in any order. Maybe I should just make this the default if --watch is not specified and we can avoid this option entirely?

Member Author:

I think that's what I should do. Shaving off a few milliseconds for a multiple-key get that is rarely used is not worth much, while making the command confusing does harm.

Contributor:

Sounds good to me, thanks!

flux kvs namespace-remove testns4
'

test_done
Member:

not sure if I'm missing it here on the github review, but is -l or -o tested?

Member Author:

True, I think I should add some basic tests for those options.

@@ -22,6 +22,9 @@ flux_future_t *flux_rpc_raw (flux_t *h, const char *topic,
const void *data, int len,
uint32_t nodeid, int flags);

flux_future_t *flux_rpc_message (flux_t *h, const flux_msg_t *msg,
                                 uint32_t nodeid, int flags);
Member:

do we need a new manpage entry for this function?

Member Author:

yes thanks!

/* Respond to watcher request, if appropriate.
* De-list and destroy watcher from namespace on error.
* De-hash and destroy namespace if watchers list becomes empty.
* De-hash and destroy namespace if watchers list become empty.
Member:

i actually think this should be becomes? :-)

Member Author:

yep.

* futures are added to the w->lookups zlist in commit order, and an
* out of order result is queued until all the lookups ahead of it have
* been retired.
*
Member:

This comment sort of confused me. I think it's b/c the latter half of what this comment says is implemented in the lookup_continuation function. Perhaps it should be split up a bit?

Member Author:

I'll reword that.

@@ -228,7 +240,7 @@ static struct optparse_subcommand subcommands[] = {
NULL
},
{ "get",
"[-j|-r|-t] [-a treeobj] key [key...]",
"[-j|-r|-t] [-a treeobj] [-o] [-l] [-w] [-c COUNT] key [key...]",
Member:

just realized, need to add to flux-kvs.adoc too.

Member Author:

Oh yeah! Thanks.

@garlick garlick (Member Author) commented Sep 11, 2018

One high level question. Was the intent to eventually remove the original "KVS watch" in a later PR?

Yes but this PR is insufficient, due to the semantic differences pointed out in #1556 (including the one you mentioned). I think we'll need something like

FLUX_KVS_WATCH_CLASSIC = (FLUX_KVS_WATCH | FLUX_KVS_WAITCREATE | FLUX_KVS_PEDANTIC)

where FLUX_KVS_WAITCREATE is described in #1651

FLUX_KVS_PEDANTIC (or a better name) would check the key on every commit to its namespace (like the current watch does). I intend to open an issue on this flag as well.

The problem of returning multiple values could fall out of PEDANTIC, or it could be made part of the functionality on this PR. I could go either way - thoughts?

@grondo grondo (Contributor) commented Sep 11, 2018

I'm just curious -- what happens if a user destroys a future with FLUX_KVS_WATCH without first calling flux_kvs_lookup_cancel? Will responses continue to be sent, but dropped? I see that disconnects also trigger a cancellation in the kvs-watch module, but will this cancel generate spurious ENODATA messages to a handle we know has disconnected? Sorry for the trivial questions.

@garlick garlick (Member Author) commented Sep 11, 2018

what happens if a user destroys a future with FLUX_KVS_WATCH without first calling flux_kvs_lookup_cancel? Will responses continue to be sent, but dropped?

Yes. They will be dropped if the future was reset and then destroyed without canceling the watch, because the matchtag will be "leaked" by the RPC code. If the code is really wrong and the future is destroyed in a fulfilled state, the matchtag could be recycled, and erstwhile watch responses could be mixed up with other responses.

I see that disconnects also trigger a cancellation in the kvs-watch module, but will this cancel generate spurious ENODATA messages to a handle we know has disconnected?

The w->mute flag should prevent a response if canceled due to disconnect.

@grondo grondo (Contributor) commented Sep 11, 2018

Great, thanks for the answers!

@garlick garlick (Member Author) commented Sep 11, 2018

I think I've addressed the comments so far. LMK if I should squash and rebase.

@chu11 chu11 (Member) commented Sep 11, 2018

Looks good to me to squash and merge. One English nit, which I'm not even sure should be changed, but I'll mention it:

If '-w', after the initial value, display the new value each time the key changes until interrupted,

Technically this isn't correct b/c writing the same value again will still lead to watch output. But 99% of the time this is what happens, and it is how people should think of the option working. So I think it's ok to leave the text like this?

@garlick garlick (Member Author) commented Sep 12, 2018

I don't think we need to get to that level of detail in the flux kvs man page.

Possibly when we implement #1653 it will turn out to be easiest to just do the value check unconditionally anyway. Will have to see what it adds to the "weight" of the operation. Maybe not much.

If the user specified the FLUX_KVS_WATCH flag on
a flux_kvs_lookup(), redirect the request to the
kvs-watch module.

Note that FLUX_KVS_WATCH is invalid for a lookupat,
since the target is an "anchored" snapshot that
can never change.
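
(A sketch of that validation, assuming EINVAL is the errno used; the actual check in the PR may differ:)

    /* In flux_kvs_lookupat(): a watch on an anchored snapshot can
     * never be satisfied, so reject the flag combination outright.
     */
    if ((flags & FLUX_KVS_WATCH)) {
        errno = EINVAL;
        return NULL;
    }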
Problem: when handling multiple responses to a lookup
with the FLUX_KVS_WATCH flag, cached results from a
previous response may mask the current one.

Always look at the 'val' treeobj in the response, and
if it has changed relative to the cached one, invalidate
the cache.

Rename internal flux_rpc_msg() to flux_rpc_message_nocopy(),
in preparation for introducing external flux_rpc_message()
function that takes a const flux_msg_t *.

Change the order of parameters to match the expected order
of the new function's parameters.

Problem: there is no mechanism to set the userid and rolemask
in a message sent with the flux_rpc_*() functions.

The ability to set the rolemask and userid of a message is
useful when sending a request on behalf of another user.
It only has an effect when running as the instance owner,
since otherwise, the connector rewrites the credentials
with those of the authenticated user.

To make this possible, expose low level RPC constructor
that accepts a request message as a parameter.  It is
sent out verbatim except for setting the matchtag (per
rpc flags) and nodeid.
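
(A hypothetical sketch of the pattern this enables; flux_request_encode() and the flux_msg_set_* credential setters are libflux calls of this era and may have since changed, and error handling is elided:)

    #include <flux/core.h>

    /* Send a request on behalf of another user by stamping their
     * credentials on a pre-built message.  Only effective when the
     * sender is the instance owner; otherwise the connector rewrites
     * the credentials with those of the authenticated user.
     */
    flux_future_t *rpc_as_user (flux_t *h, const char *topic,
                                const char *payload,
                                uint32_t userid, uint32_t rolemask)
    {
        flux_msg_t *msg;
        flux_future_t *f;

        if (!(msg = flux_request_encode (topic, payload)))
            return NULL;
        (void)flux_msg_set_userid (msg, userid);
        (void)flux_msg_set_rolemask (msg, rolemask);

        /* message is sent verbatim; matchtag and nodeid are set */
        f = flux_rpc_message (h, msg, FLUX_NODEID_ANY, 0);
        flux_msg_destroy (msg);
        return f;
    }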
Instead of handling get response synchronously, handle
it in a continuation and run the reactor until the
future is destroyed.

Add FLUX_KVS_WATCH flag to lookup request if --watch
is specified, then handle multiple responses.

If --count COUNT is specified, terminate the watch
once COUNT responses are received (cumulative over
all watched keys).

When --watch is specified, handle responses
asynchronously, which means for multiple keys,
values may be displayed in any order.
Add the --label option to prefix values with
"key=" so one can tell one value from another.

Otherwise, fetch keys synchronously, so that
the order of values displayed matches the order
of keys on the command line.  Some tests depend
on this.

Implement kvs-watch.lookup.

The 'struct watcher' is extended with fields to support
the key and state associated with its lookup.

At each root update, the watched key is compared
against the key array transmitted in the kvs.setroot
event.  If a match is found, initiate a kvs.lookup
on the key, and direct the result to the user in a
response message.

The futures associated with the kvs.lookup requests are
stored in a zlist so that commit order can be preserved
even if lookup responses arrive out of order.

There is room for future optimization in this code:
1) Each root commit triggers the traversal of the
list of watchers.  This could be sped up with a hash
of keys, per namespace.
2) Multiple watchers of the same key are independent,
and will trigger a separate kvs.lookup request per
watcher, per commit.

N.B. because a kvs.lookup can cross into a new namespace,
kvs.lookup RPCs are made with the watching user's creds.
This must be considered when optimizing 2) above.
Perhaps in the future, namespace + owner could be
returned in lookup responses.
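
(An illustrative sketch of that ordering discipline; 'struct watcher' here is reduced to the one relevant field, and respond_to_watcher() is a hypothetical helper, not the module's actual code:)

    #include <czmq.h>
    #include <flux/core.h>

    struct watcher {
        zlist_t *lookups;   /* flux_future_t's, in commit order */
        /* ... other fields elided ... */
    };

    void respond_to_watcher (struct watcher *w, flux_future_t *f);

    /* Continuation for a fulfilled kvs.lookup future: respond only from
     * the head of the list, so a result that arrives out of commit order
     * waits until all the lookups ahead of it have been retired.
     */
    static void lookup_continuation (flux_future_t *f, void *arg)
    {
        struct watcher *w = arg;
        flux_future_t *head;

        (void)f;    /* 'f' is queued somewhere on w->lookups */
        while ((head = zlist_first (w->lookups))
                && flux_future_is_ready (head)) {
            respond_to_watcher (w, head);
            zlist_pop (w->lookups);
            flux_future_destroy (head);
        }
    }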
Problem: kvs-watch returns ENODATA (effectively EOF per RFC 6)
when a watched namespace or key is removed.  Probably this
should be a different error code, so that users can distinguish
this condition from cancellation.

Change return code to ENOENT.

Update getroot unit test.

Problem: kvs-watch getroot continuation logs ENOTSUP
at LOG_ERR level, even though that can be triggered
in normal operation by an attempt to getroot a namespace
without authorization.

Don't log ENOTSUP errors.  Also suppress EPERM, which
might in the future be considered a more appropriate
error for the kvs module to return in this case.

Add a program that commits values to a KVS
key in sequence while watching the key, ensuring
that the watch responses occur in order.

Add a description of the FLUX_KVS_WATCH flag, and
the new flux_kvs_lookup_cancel() function.
@chu11 chu11 (Member) commented Sep 12, 2018

lgtm, will merge once travis passes

@garlick garlick (Member Author) commented Sep 12, 2018

Had to restart two stalled builders and one build with an (apparently) spurious valgrind error, but it finally passed.

@chu11 chu11 merged commit 6934d50 into flux-framework:master Sep 12, 2018
@garlick garlick (Member Author) commented Sep 12, 2018

Thanks!
