kvs: eliminate per-job namespace event subscriptions on rank 0 #2777
Conversation
Definitely not a silver bullet, but there's an improvement here. There isn't much of a change in "single" alloc mode (still ~50 jobs/s), likely because the alloc mode is the limiter. However, when we enable "unlimited" alloc mode for sched-simple, I'm getting a steady ~70 jobs/s on this PR branch:
#!/bin/bash
echo "Default test:"
flux python bulksubmit.py jobs/*.json
echo
echo "sched-simple in unlimited alloc mode:"
flux module reload sched-simple unlimited
flux python bulksubmit.py jobs/*.json

Before:
After:
(the progress bar in the 2nd case is very bursty -- I think we're hitting general kvs busyness now)
Changes look reasonable to me. Thanks!
src/modules/kvs/kvs.c (outdated diff context):

    }
    if (ctx->rank != 0) {
        if (asprintf (&setroot_topic, "kvs.setroot-%s", ns) < 0) {
            errno = ENOMEM;
asprintf() already sets errno on error, so you might want to drop the explicit errno = ENOMEM assignment in the cases you are touching in this PR.
Since the missing coverage is mainly these error paths, it might increase coverage a bit too.
re-pushed with
I had forgotten to run
I suspect improving performance past this will involve trying to eke out small gains in a bunch of different areas. Edit: OR we could think of some batching mechanism -- tell the KVS to make > 1 namespace in a single RPC request? But I suspect that would take major changes in job-manager, job-exec, etc., and that's way down the line.
I assume that the namespace_create/destroy is no longer a bottleneck for job throughput, and the bottleneck has now moved to the job-exec system (lots of kvs activity with jobs writing to eventlogs, etc.). I think there will be some opportunities for batching up the final kvs commits for jobs when we redo the exec system, which might help some. Right now the final write and kvs move of the guest namespace to the main namespace are done under a separate commit for each job.
src/modules/kvs/kvs.c (outdated diff context):

        goto cleanup;
    }

    if (flux_event_subscribe (ctx->h, "kvs.namespace-remove") < 0) {
Since the error handling for these 3 flux_event_subscribe calls is exactly the same, you could get another small bump in coverage by combining these conditionals into one. Not important though, so only if you feel like it.
Good idea, actually I can combine the flux_event_subscribe calls above it as well.
Consolidate multiple calls to flux_event_subscribe() that go through the same error path into a single if statement.
When a namespace is removed, unsubscribe from the kvs.namespace-removed-<NS> event. Fixes flux-framework#2762
On rank 0, we have to subscribe to all events on all KVS namespaces, as rank 0 must have knowledge of all KVS namespaces. On the other hand, workers only need to know about the KVS namespaces that callers are using on that rank. To improve KVS namespace create performance, subscribe to all KVS namespaces on rank 0 once on initialization, instead of subscribing to each KVS namespace as it is created. On workers, subscribe to only the KVS namespaces that are necessary. Fixes flux-framework#2727
asprintf already sets errno on error, so there is no need to explicitly set errno = ENOMEM on asprintf errors.
re-pushed w/ the cleanup @grondo suggested. I was able to clean up in multiple places, so there's 1 additional cleanup patch added.
Quick additional thought: could we rename the events so that only one subscription is necessary per namespace in the rank > 0 case?
Oh, good idea. I'd be interested to see if there is any effect in the job throughput benchmark.
one builder failed with
assuming timeout b/c of slow travis?
Hm, I've seen that same failure before. We should probably bump that timeout up I suppose...
Was pondering this and no obvious solution came to mind. But perhaps there's event sub/pub trickery I don't know about? If I renamed the events like this:
But then I think I have to send the setroot event out two times? Once to
But maybe there's a naming trick I haven't thought about.
Argh, it failed twice in a row. I'll give it one more try, otherwise I'm bumping the timeout up.
3 times in a row!
I'm clearly not that clever, as @garlick figured out we can do
Increase test timeouts throughout tests to workaround slowness in travis.
re-pushed with increased timeouts in
Will work on renaming the events in another PR, as that will involve modifying code in other modules (e.g. kvs-watch).
Codecov Report
@@ Coverage Diff @@
## master #2777 +/- ##
==========================================
- Coverage 81.06% 81.05% -0.02%
==========================================
Files 250 250
Lines 39399 39407 +8
==========================================
+ Hits 31939 31940 +1
- Misses 7460 7467 +7
woo hoo, it finally passed
LGTM. Nice improvement!
I'm just going to edit the title for more clarity when release notes are developed.
Setting merge-when-passing.
Per discussion in #2727. I'll just throw up my commit message here:
Also threw in a fix for #2762