kvs: convert missing_refs_list to hash or deal with duplicates #1751
Comments
I actually began to make this change, but wasn't sure if it was a net win. Basically, if a user were to do a KVS transaction with many writes to the same directory (say, 26 puts under "dir"), and the reference for directory "dir" happens to not yet be loaded, this would generate 26 missing references to be looked up, and 26 lookups. So: A) Is this common? I don't think so, because in practice I imagine most transactions don't repeat that many writes under one not-yet-loaded directory, so the missing reference for "dir" isn't an issue. B) If it is something to consider, is detecting duplicates in the references list a net win, regardless of how it's done?
I was talking to @garlick about this, and he reminded me that while the above is true, higher-level code in the cache / wait data structures probably handles this. @garlick's memory is far better than mine and he's right, although the code is a tad non-optimal. Basically, the core missing-references loop checks the cache first: if an entry for the reference already exists and its data is still being loaded, the loop adds a waiter to that entry instead of issuing another RPC.

So the worst case of sending multiple RPCs to load the same data from the content store is avoided. However, a small non-optimal thing is that every identical missing reference is still added to the waitlist on the cache entry. So in my worst-case example above, the waitlist queue on the cache entry would be 26 entries long. At minimum, some comments should be added to the code to explain this, so I'll leave this open until I add them.
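The cache/wait behavior described above can be sketched as follows (hypothetical names, not flux-core code): the first lookup of a missing reference creates a cache entry and issues one load RPC; every subsequent identical reference merely appends another waiter to that entry's waitlist.

```python
# Illustrative sketch (hypothetical names, not flux-core code) of the
# cache/wait pattern: one RPC per distinct reference, but one waiter
# per *occurrence* of the reference.

rpcs_sent = []

class CacheEntry:
    def __init__(self):
        self.valid = False     # data not yet returned from content store
        self.waitlist = []     # waiters to wake when the load completes

def process_missing_ref(cache, ref, waiter):
    entry = cache.get(ref)
    if entry is None:
        entry = cache[ref] = CacheEntry()
        rpcs_sent.append(ref)          # only the first miss triggers an RPC
    if not entry.valid:
        entry.waitlist.append(waiter)  # every duplicate ref adds a waiter

cache = {}
for i in range(26):                    # 26 identical missing refs for "dir"
    process_missing_ref(cache, "dir", waiter=i)
```

After the loop, only one RPC has been sent, but the waitlist on the single cache entry is 26 long, matching the worst-case example in the comment.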
While investigating #1747, it occurred to me that the `missing_refs_list` that is returned in `kvstxn` could have duplicate references. Perhaps it'd be better to internally represent it as a hash, so that duplicate missing references aren't looked up twice.
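The proposed hash representation can be sketched like this (illustrative only; the function and variable names are hypothetical, and flux-core would use its own C hash type rather than a Python dict): keying by reference makes insertion idempotent, so duplicates collapse and each reference is looked up at most once.

```python
# Illustrative sketch (hypothetical names): recording missing refs in a
# hash keyed by reference instead of appending to a list, so duplicate
# references collapse to a single entry.

def add_missing_ref(missing_refs, ref):
    # Insertion into a dict is idempotent: re-adding "dir" is a no-op.
    missing_refs[ref] = True

missing_refs = {}
for key in ["dir.a", "dir.b", "dir.c"]:
    add_missing_ref(missing_refs, "dir")   # same parent dir each time
```

With this representation, the later lookup loop iterates over distinct references only, so neither duplicate RPCs nor duplicate waitlist entries can arise in the first place.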