
kvs: rename kvs namespace events to limit calls to flux_event_subscribe() #2779

Closed
chu11 opened this issue Feb 26, 2020 · 10 comments

chu11 commented Feb 26, 2020

As @garlick mentions in #2777, we could further reduce the number of calls to flux_event_subscribe() and flux_event_unsubscribe() through clever naming of the events and ensuring there are common substrings.

@chu11 chu11 self-assigned this Feb 27, 2020

chu11 commented Feb 27, 2020

I'm clearly not that clever, as @garlick figured out we can use kvs.namespace-%s-setroot as the pattern, so rank 0 can subscribe to kvs.namespace and the workers can subscribe to kvs.namespace-%s.

Doh! There is also a kvs.namespace-created-%s event, which the primary KVS module does not listen for, but the kvs-watch module does. Gotta get a little more creative on names, or just modify the created event name.
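
For reference, here is a minimal sketch in C of the idea being aimed at (not the actual kvs module code: subscribe_namespace_events() is a made-up helper, and flux_event_subscribe() is the only real API used). Event subscriptions match on topic-string prefix, which is why a shared prefix lets one call cover several events.

#include <flux/core.h>
#include <stdio.h>

/* Sketch only -- hypothetical helper, not from flux-core.  Event
 * subscriptions match on topic prefix, so one subscribe can cover
 * every event whose name shares a common substring. */
static int subscribe_namespace_events (flux_t *h, uint32_t rank, const char *ns)
{
    char topic[256];

    if (rank == 0)  /* rank 0: all namespaces, all namespace events */
        snprintf (topic, sizeof (topic), "kvs.namespace");
    else            /* other ranks: just this namespace,
                     * e.g. "kvs.namespace-primary-setroot" */
        snprintf (topic, sizeof (topic), "kvs.namespace-%s", ns);

    return flux_event_subscribe (h, topic);
}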


chu11 commented Feb 27, 2020

Ugh ... I am wasting more time thinking about a good naming scheme than I'd like to admit:

kvs.setroot - subscribed to by the kvs & kvs-watch modules
kvs.namespace-removed - subscribed to by the kvs & kvs-watch modules
kvs.error - subscribed to only by the kvs module
kvs.namespace-create - subscribed to only by the kvs-watch module

I thought of kvs.event-internal and kvs.event-external (but abbreviated), but there are multiple events that are used by both modules.

I didn't want to do something stupid like "kvs.create-namespace", which works, but only by contriving a different "prefix" substring.

I'm almost inclined to just leave it as is for now. If I can't come up with a good crossover naming scheme, then perhaps it should stay as is; it simply means too many events have specialized purposes or "wide" usage.


grondo commented Feb 28, 2020

We should try to avoid the back-to-back subscribes everywhere though, if possible. My guess is that this might have a measurable impact on the high-throughput jobs case, since the kvs modules can't do anything else while they are synchronously waiting for responses to subscribe requests. It might be a better trade-off to recv and drop the events you aren't interested in than to have many, many subscriptions.

If you just threw out a branch with this kind of change we could see if there is even any impact on the job workload.


chu11 commented Feb 28, 2020

If you just threw out a branch with this kind of change we could see if there is even any impact on the job workload.

Seems like a good idea to try. We want to remove the 3 successive flux_event_subscribe() calls in kvs-watch as well.
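
For reference, a rough sketch of what collapsing those calls looks like. The "before" topics are the three events kvs-watch subscribes to from the list above; the "after" topic assumes the events get renamed to share the kvs.namespace prefix. flux_event_subscribe() is the only real API here; the helper functions are illustrative.

#include <flux/core.h>

/* Before: three back-to-back subscribes, each a synchronous round trip
 * to the broker. */
static int subscribe_separately (flux_t *h)
{
    if (flux_event_subscribe (h, "kvs.setroot") < 0)
        return -1;
    if (flux_event_subscribe (h, "kvs.namespace-removed") < 0)
        return -1;
    if (flux_event_subscribe (h, "kvs.namespace-create") < 0)
        return -1;
    return 0;
}

/* After: with the events renamed to share a common substring, a single
 * prefix subscription covers all of them. */
static int subscribe_by_prefix (flux_t *h)
{
    return flux_event_subscribe (h, "kvs.namespace");
}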


chu11 commented Feb 28, 2020

Another thought: at what point do we just have to accept that we should wait for (or raise the priority of) #1557?


grondo commented Feb 28, 2020

Even if we have a fix for #1157, fewer calls to subscribe will be beneficial.


chu11 commented Feb 28, 2020

@grondo is there a job workload you've been using for this and #2727?


grondo commented Feb 28, 2020

Yeah, just using a bulksubmit script I have lying around:

Here's a current version. Run with flux python bulksubmit.py jobs/*.json.
I've been testing with 1024 sleep 0 jobs I think, e.g.

flux mini run --dry-run sleep 0 > jobs/0.json
for i in `seq 1 1023`; do
  cp jobs/0.json jobs/$i.json
done

It is also useful to reload sched-simple in unlimited alloc mode, e.g. I run on fluke with something like

$ srun --pty --mpi=none -N16 flux start sh -c 'flux module reload sched-simple unlimited && flux python bulksubmit jobs/*json'

bulksubmit.py:
import time
import sys
import flux
from flux import job
from flux import constants

t0 = time.time()

h = flux.Flux()
jobs = []

class QueueAdmin:
    data = {'query_only': False, 'enable': False, 'reason': "Testing"}

    def __init__(self, h):
        self.h = h

    def start(self):
        msg = dict(self.data, enable = True)
        return self.h.rpc("job-manager.alloc-admin", msg).get()

    def stop(self):
        msg = dict(self.data)
        return self.h.rpc("job-manager.alloc-admin", msg).get()

    def status(self):
        msg = dict(self.data, query_only = True)
        p = self.h.rpc("job-manager.alloc-admin", msg).get()
        if p["enable"]:
            return "scheduling enabled"
        else:
            return "scheduling disabled, reason={}".format(p["reason"])


label="bulksubmit"

def log(s):
    print(label + ": " + s)

def progress(fraction, length=72, suffix=""):
    fill = int(round(length * fraction))
    bar = '\u2588' * fill + '-' * (length - fill)
    s = '\r|{0}| {1:.1f}% {2}'.format(bar, 100*fraction, suffix)
    sys.stdout.write(s)
    if fraction == 1.:
        sys.stdout.write('\n')

def submit_cb(f, arg):
    jobs.append (job.submit_get_id (f))


log("Starting...")
q = QueueAdmin(h)
q.stop()
log(q.status())

for file in sys.argv[1:]:
    with open(file) as jobspec:
        job.submit_async(h, jobspec.read(), waitable=True).then(submit_cb)

if h.reactor_run(h.get_reactor(), 0) < 0:
    h.fatal_error("reactor start failed")

# print(jobs)

total = len(jobs)
dt = time.time() - t0
jps = len(jobs)/dt
log("submitted {0} jobs in {1:.2f}s. {2:.2f}job/s".format(total, dt, jps))

q.start()
t0 = time.time()
log(q.status())

count = 0
while count < total:
    jobid, result, s = job.wait(h)
    if not result:
        print("{}: {}".format(jobid, s))
    count = count + 1
    if count == 1:
        log("First job finished in about {0:.3f}s".format(time.time() - t0))
    suffix = "({0:.1f} job/s)".format(count/(time.time() - t0))
    progress(count/total, length=58, suffix=suffix)

dt = time.time() - t0
log("Ran {0} jobs in {1:.1f}s. {2:.1f} job/s".format(total, dt, total/dt))

# vi: ts=4 sw=4 expandtab


chu11 commented Feb 29, 2020

So running the benchmark above, there's a net win from renaming the events: it removes calls to flux_event_subscribe(), at the cost of a few extra events that get received and dropped.

Before (code of PR #2777)

bulksubmit: Starting...                                                                                                                                                                                            
bulksubmit: scheduling disabled, reason=Testing                                                                                                                                                                    
bulksubmit: submitted 1024 jobs in 1.73s. 590.63job/s                                                                                                                                                              
bulksubmit: scheduling enabled                                                                                                                                                                                     
bulksubmit: First job finished in about 1.043s                                                                                                                                                                     
bulksubmit: Ran 1024 jobs in 15.1s. 67.7 job/s 

After

bulksubmit: Starting...                                                                                                                                                                                            
bulksubmit: scheduling disabled, reason=Testing                                                                                                                                                                    
bulksubmit: submitted 1024 jobs in 2.15s. 475.93job/s                                                                                                                                                              
bulksubmit: scheduling enabled                                                                                                                                                                                     
bulksubmit: First job finished in about 0.631s                                                                                                                                                                     
bulksubmit: Ran 1024 jobs in 15.0s. 68.1 job/s    

this was the best run for "Before" and the worst run for "After"; on average the improvement was a tad better than this (68.4 vs 67.1 job/s).

Thinking through the 4 events, only two will ever get dropped:

kvs.error will be dropped by kvs-watch, but under the assumption that errors are rare, this shouldn't be a big deal.

kvs.namespace-create will be dropped by the kvs module every time a namespace is created.

So really we're trading dropped kvs.namespace-create events for fewer event subscriptions, which appears to be a slight win overall for this benchmark.
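
As a sketch of how that trade-off looks in a module (not the real kvs/kvs-watch code: the topic globs and callbacks are illustrative, while flux_event_subscribe(), flux_msg_handler_addvec(), and flux_reactor_run() are the real APIs): the module subscribes once to the common prefix, registers handlers only for the events it actually wants, and events that match the subscription but no handler (e.g. the create event arriving at the kvs module) are simply dropped.

#include <flux/core.h>

static void setroot_cb (flux_t *h, flux_msg_handler_t *mh,
                        const flux_msg_t *msg, void *arg)
{
    /* handle setroot ... */
}

static void removed_cb (flux_t *h, flux_msg_handler_t *mh,
                        const flux_msg_t *msg, void *arg)
{
    /* handle namespace removal ... */
}

/* Handlers only for the events this module cares about (others omitted);
 * anything else arriving via the broad subscription is dropped unhandled. */
static const struct flux_msg_handler_spec htab[] = {
    { FLUX_MSGTYPE_EVENT, "kvs.namespace-*-setroot", setroot_cb, 0 },
    { FLUX_MSGTYPE_EVENT, "kvs.namespace-removed-*", removed_cb, 0 },
    FLUX_MSGHANDLER_TABLE_END,
};

int mod_main (flux_t *h, int argc, char **argv)
{
    flux_msg_handler_t **handlers = NULL;

    /* One subscribe covers every topic in htab[] via prefix match. */
    if (flux_event_subscribe (h, "kvs.namespace") < 0)
        return -1;
    if (flux_msg_handler_addvec (h, htab, NULL, &handlers) < 0)
        return -1;
    if (flux_reactor_run (flux_get_reactor (h), 0) < 0)
        return -1;
    flux_msg_handler_delvec (handlers);
    return 0;
}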


chu11 commented Mar 3, 2020

After discussion at the meeting, we decided this should go in.

My primary concern is that, unlike other performance fixes, there is a clear "speed one thing up, slow another thing down" tradeoff here, and just because performance is a win in this case doesn't mean it'll be a win later. And we don't have performance tests in Travis.

But given the low probability of changes in the future, this should go in. I'll of course add many comments :P

chu11 added a commit to chu11/flux-core that referenced this issue Mar 3, 2020
In the kvs and kvs-watch modules, collapse multiple calls to
flux_event_subscribe() or flux_event_unsubscribe() into a single
call using the same event substrings.

Fixes flux-framework#2779
chu11 added a commit to chu11/flux-core that referenced this issue Mar 4, 2020
In the kvs and kvs-watch modules, collapse multiple calls to
flux_event_subscribe() or flux_event_unsubscribe() into a single
call using the same event substrings.

Fixes flux-framework#2779
@mergify mergify bot closed this as completed in 225518f Mar 5, 2020
chu11 added a commit to chu11/flux-core that referenced this issue Mar 7, 2020
In the kvs and kvs-watch modules, collapse multiple calls to
flux_event_subscribe() or flux_event_unsubscribe() into a single
call using the same event substrings.

Fixes flux-framework#2779