kvs_watch() can miss values due to merged commits #813
Considering that the root of the namespace can be the target of a watch, commit merging is fundamentally at odds with the goal of getting a watch response for every change of every possible watch target. Commit merging should be disabled in the KVS and we should look for other ways to recoup the performance lost. |
As a first step let's make this configurable so we can see what we're dealing with performance wise. |
@garlick Now that I understand the KVS code at least moderately well, I was trying to figure out "passing args to modules" when I saw this. It seems that an option to make commit merging configurable already exists? Unsure if you were thinking of something different.
|
Oops, well... does it work? (I must have forgotten about that when I opened the issue) |
Code review wise, it appears it does. Although I don't see any tests for this yet. |
A test addition for the option would be welcome. How we want to deal with the bigger problem is an open question. To put a finer point on the situation: if you're watching a state machine where the current state is represented as the value of a key, some state transitions may be lost, as @dongahn learned the hard way a while ago. There are ways to work around this, but the behavior is unexpected and feels inconsistent with the transactional design. A partial "solution" to this problem might be to add an optional "no merge" flag param to |
Actually, a better proposal might be to add that flag to |
Yeah, my first goal was to try and add some tests. Not sure if I can kludge |
@garlick: Yeah, JSC is now essentially a thin layer of flux events, not using the watch capability. So this is not a problem. Introducing a parameter to |
@garlick Been working on a KVS test to see if the commit-merge option is working. When commit merging is disabled ( I often get less than the 8 notifications. It appears on occasion that a change has simply been missed and a watch notification isn't sent (or isn't received?). Is it possible if lots of commits occur quickly that a watcher could "miss" a commit? Lots of debug statements in the KVS commit side of the code, and I think that part is working properly. Beginning to look at the watcher side b/c no idea how that works. Hopefully I didn't do something obviously dumb. Test is t/kvs/commitwatch.c if you're curious. https://github.com/chu11/flux-core/tree/issue813 |
@garlick Ugh, I'm stupid ...
multiple commits can occur on the kvs side during the parse & print part of the code. |
Oh right, you'll want a regular watch there that gets a stream of responses, one per change. I didn't quite understand why you sometimes do a fence in this test? Commits to the same key under a fence are guaranteed to produce only one watch response. Or was that the point? |
The fence is only when running a test when |
What's the purpose of using the fence there? The fence builds a single commit until nprocs have called |
@garlick Ahhh. I now see what you're saying. My original intent was to make a test to verify merging works with Hmmm. Any thoughts on how one might be able to guarantee a merge happening? |
Hard to do since it requires multiple commits to arrive in the same reactor loop. Crazy idea: add a flag to the commit message that, purely for testing, instructs the master to break up every operation into its own commit. Then you'll be guaranteed to have put them all queued up in the same loop that the commit message was received... |
Is it possible to introduce a testing hook into the prepare watcher to sync with the external test committers? The prepare handler would only return when the external testers finish sending their commits... I also wonder if one can test this with some statistical guarantee as opposed to a deterministic guarantee. Just random thoughts. |
@garlick Are you thinking of a flag variable one might pass to @dongahn The statistical thing was something I was thinking about. For example, launching 100 commits and getting < 100 watch responses. But it of course isn't guaranteed. At some point this one unit test may be more work than is worthwhile. At some point we just grep the broker log to make sure a commit merge occurred :-) |
I agree probably too much work for too little gain. (To answer your question, I was thinking just change the protocol and create the message manually. Yep, too much work) |
Yeah, I uncommented some of the debugging in the broker and just manually checked the broker log to make sure the option was working. So that would probably be the simplest thing to do if an actual unit test were made. |
Maybe this should be the way. A tester sends |
Thinking about a way to test these difficult to reproduce timing conditions or corner-cases in general is a good idea. Lots of systems have separate interfaces for testing (e.g. something like the test-only commit flag described here, or better yet a separate Sorry if that is out of left field, but perhaps a larger issue could be created if it is of interest. |
@dongahn The other consideration I had was that some "timeout" would have to be added to the test, so that we can assume no additional watch events appear. That also adds a pause to the unit tests, which I disliked. I suppose one here and there is ok, but we wouldn't want to start adding a ton of them. |
Thinking about this a bit last weekend, I may need a special feature for testing otherwise testing the final solution (such as Jim's suggested flag to I'm now thinking, I could add a special msg handler in the KVS, say |
Ugh, only after playing around with the above idea did I realize that such a pause in committing will make a |
Tried to experiment with Could be solved several ways, but I think the most reasonable is to add the "don't merge" flag into |
Or maybe if any puts in a commit have the unmergeable flag, then the whole commit should be considered unmergeable? |
Had considered that, but how different is that from passing the flag via Perhaps there is an API usage subtlety that would make a flag in |
My apologies - I kind of lost track of what we're doing here when I responded last. What's the purpose of the flag you're now working on? We talked about a couple different things: one was an API extension that would prevent certain keys from being merged (useful for state machine use case). The other was a test flag that would break up puts within a commit into separate commits that are likely/guaranteed to be merged. |
Ahhh, I'm talking about the API extension flag. So here's figuratively what I was thinking above:
Internally, the So my initial feeling is we should put the merge/unmerge flag in the commit instead.
saying all the puts within a commit/fence are mergeable or unmergeable. They can't be mixed. Your suggestion above (assuming you were talking about the API extension) was "if any puts in a commit have the unmergeable flag, then the whole commit should be considered unmergeable". To me, this seems not much different than simply moving the flag to the |
Ah, thanks for clarifying. I would assume you'd only want one flag to prevent merging, with the default being "try to merge if merging is enabled and there happen to be multiple commits pending". My thought above was that there might be an opportunity to do more merging if the flag were associated with individual puts. A commit containing a=1 with the nomerge flag and another containing b=1 with the nomerge flag could be merged, since no value of a or b could be lost if the two commits were merged. However, I think there are some corner cases to handle if done that way, such as if one commit contained "a.b.c=1" (nomerge) and another contained "unlink a". There's not too much to be gained by doing it this way anyway, so my vote would be for the commit flag. |
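The per-put corner case above can be sketched as a small overlap check. This is a toy model in Python, not KVS code: the `(key, nomerge)` tuple representation and the names `overlaps` and `can_merge` are assumptions made for illustration. It treats two commits as safe to merge only if no nomerge key in one is equal to, an ancestor of, or a descendant of a key touched by the other.

```python
def overlaps(k1, k2):
    """True if one dotted key path is equal to, or a prefix of, the other."""
    p1, p2 = k1.split("."), k2.split(".")
    n = min(len(p1), len(p2))
    return p1[:n] == p2[:n]


def can_merge(c1, c2):
    """c1 and c2 are lists of (key, nomerge) operations (puts or unlinks).

    Hypothetical rule: refuse to merge when any operation flagged nomerge
    touches a path that overlaps a path touched by the other commit.
    """
    for k1, nm1 in c1:
        for k2, nm2 in c2:
            if (nm1 or nm2) and overlaps(k1, k2):
                return False
    return True


# a=1 (nomerge) and b=1 (nomerge) touch unrelated keys, so merging loses nothing.
independent = can_merge([("a", True)], [("b", True)])

# a.b.c=1 (nomerge) and "unlink a": the unlink removes an ancestor of a.b.c.
ancestor_conflict = can_merge([("a.b.c", True)], [("a", False)])
```

Even in this simplified form the check needs path-aware comparison rather than plain key equality, which is part of why a single per-commit flag is the simpler choice.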
Your example made me think of a potential issue. Example of three commits that arrive in this order and are queued up at the same time: commit 1 is mergeable: A=1 In this case, the three get merged, and the only watch notice that is sent out is A=3. But let's say we have this. commit #1 is mergeable: A=1 Hypothetically, we could merge commit #1 & #3. So there would be two watch notices, A=3 and A=2. The final setting would be A=2. This doesn't seem right to me. Should unmergeable commits act as a "barrier" of sorts? Where you cannot merge any commits that cross an unmergeable one? |
Hmm, yes I think you're right, only "adjacent" commits should be mergeable so the order is preserved. For now anyway. We may have to introduce some new semantics for ordered requests once we implement overlay self-healing. |
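The "barrier" rule can be illustrated with a toy queue model. This is a Python sketch under stated assumptions, not the KVS implementation: the `Commit` class, its `no_merge` field, and the function names are hypothetical, and the example queue assumes the second commit is the unmergeable one setting A=2.

```python
from dataclasses import dataclass


@dataclass
class Commit:
    ops: dict               # key -> value written by this commit
    no_merge: bool = False  # hypothetical per-commit "do not merge" flag


def merge_adjacent(queue):
    """Merge runs of adjacent mergeable commits, preserving arrival order.

    A no_merge commit is never merged and also acts as a barrier: commits
    before it are never combined with commits after it.
    """
    batches = []
    prev_mergeable = False
    for c in queue:
        if prev_mergeable and not c.no_merge:
            batches[-1].update(c.ops)  # later values overwrite earlier ones
        else:
            batches.append(dict(c.ops))
        prev_mergeable = not c.no_merge
    return batches


def watch_values(queue, key):
    """Values a watcher of `key` would see, one per applied batch."""
    return [b[key] for b in merge_adjacent(queue) if key in b]


all_mergeable = [Commit({"A": 1}), Commit({"A": 2}), Commit({"A": 3})]
with_barrier = [Commit({"A": 1}), Commit({"A": 2}, no_merge=True), Commit({"A": 3})]

merged_view = watch_values(all_mergeable, "A")   # one notice, last value only
barrier_view = watch_values(with_barrier, "A")   # order kept, final value is 3
```

With the barrier, commits 1 and 3 are not adjacent in the mergeable sense, so the final value stays A=3 instead of ending at A=2.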
Support flag KVS_NO_MERGE in kvs commit/fence functions to inform KVS to not merge commits with other ones when commits are finalized. Fixes flux-framework#813
The kvs commit code attempts to merge contemporaneous commit requests to minimize intermediate directory versions landing in the content store and kvs.setroot event handling. If a key changes in two commits and those commits are merged, only the last value will be committed, and thus the kvs.setroot triggered watch response will be generated only for the last value.
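A minimal simulation of the effect described here, for the state machine use case mentioned in the thread. It is a Python toy model, not the flux-core API: the `run` function, the `state` key, and the state names are all made up for illustration, and each applied commit stands in for one root update that a watcher could observe.

```python
def run(commits, merge):
    """Return the values of 'state' a watcher sees, one per root update."""
    pending = [dict(c) for c in commits]
    if merge and pending:
        # Model merging: fold all queued commits into one; last write wins.
        merged = {}
        for c in pending:
            merged.update(c)
        pending = [merged]

    root = {}
    seen = []
    for c in pending:
        old = root.get("state")
        root.update(c)                  # one root update per applied commit
        if root.get("state") != old:    # watcher is notified only on change
            seen.append(root["state"])
    return seen


transitions = [{"state": "submitted"}, {"state": "running"}, {"state": "complete"}]

unmerged = run(transitions, merge=False)  # every transition is observed
merged = run(transitions, merge=True)     # intermediate states are lost
```

The final value is identical in both runs; only the intermediate transitions differ, which is why the loss is easy to overlook until a watcher depends on seeing every state.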