libkz: defer kvs_watch until necessary #1424
Conversation
Oops, here's a test failure in […] |
I made […]. It's not clear to me, looking at the iowatcher implementation, whether […]. Then the test will have to start the reactor and check for error in the reactor callback. |
Codecov Report
@@ Coverage Diff @@
## master #1424 +/- ##
=========================================
+ Coverage 78.59% 78.6% +<.01%
=========================================
Files 163 163
Lines 30082 30105 +23
=========================================
+ Hits 23643 23663 +20
- Misses 6439 6442 +3
|
Sorry you had to deal with that @garlick. Your approach seems good to me |
NP! |
This looks really good! How hard would it be to have an option that says "even if it isn't done, don't establish the watchers, just stop" with this change in? |
Probably not hard. I'll look at that. |
@trws if you have the output of one of these completed jobs in a KVS somewhere, it would be interesting to know before and after numbers on `flux wreck attach`. |
I have an instance up that’s run 18 jobs, up to 4000 tasks each.
Several have non-trivial output, would it be useful to pull something
out of there?
…On 4 Apr 2018, at 16:14, Jim Garlick wrote:
@trws if you have the output of one of these completed jobs in a KVS
somewhere, it would be interesting to know before and after numbers on
`flux wreck attach`.
|
Sure - if it's not too much trouble, just time flux-wreck attach on master and this branch on the same (completed) job. |
Just gave the changes to the kz iowatcher a quick look and it seems good to me. Just to be clear, the new iowatcher callback denotes EOF when both […]. Just another warning that the fd iowatcher has completely different semantics (e.g. […]). Sorry, that is all I have time for tonight. |
Thanks @grondo! Well that sounds right, or (what I was thinking was) you'd first check […]. |
Well, the timing results are pretty conclusive. With this, over ssh, attach on a 400-line output job with 400 ranks takes 0.7 seconds. The current master version has so far been running for five minutes, and I just got another line out of it, so it might eventually finish? The last two lines were a minute each in coming. For one that actually finished reasonably, the current master took about 1.5 seconds on average, while this took about 0.5 (again with this version at the disadvantage of connecting over ssh). On a possibly related note, do either of you know if it's possible we leave watchers around for cancelled or killed […]? |
In that case where the reactor has gone away and the broker connection dropped, the KVS should get a disconnect message and take care of cleaning out any state for that connection. To get a count of service side watcher state for the local rank, run:
FWIW there is a regression test for this: |
Problem: kz_gopen() is not used but its implementation contributes to complexity. Drop this interface and associated logic.
Problem: the code to kvs_lookup the next key with a continuation is duplicated in the continuation and the watch callback, and soon in another function. Factor out the common code into lookup_next(). Also, drop some LOG_DEBUG messages and add a sequence number to some LOG_ERR messages to make them more informative.
Problem: KVS lookups continue without incrementing kz->seq if the callback is unregistered with kz->seq < kz->last_dir_size, resulting in an infinite loop. Terminate lookups in the lookup continuation if the callback is NULL.
Prepare for use of the kz->eof flag by ensuring it is set when the end of stream is reached, even if data is being consumed in "raw" mode with kz_get_json().
If internal kz->eof flag is set, terminate KVS watcher.
Since KVS watches are costly, avoid setting one up until any existing data in the stream has been read and consumed. If EOF is reached, then the KVS watch is avoided entirely.
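A minimal sketch of the chained-lookup pattern these commits describe, written against the current flux_kvs_lookup()/flux_future_then() signatures. The struct fields (seq, eof) mirror those named above, but the key format, callback handling, and the helpers marked as placeholders are illustrative assumptions, not the actual libkz code.

```c
/* Sketch only: consume existing stream blocks with chained KVS lookups,
 * and fall back to a watch only if the stream has not yet reached EOF. */
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <flux/core.h>

struct kz {
    flux_t *h;
    const char *name;   /* stream key prefix (illustrative) */
    int seq;            /* next block to look up */
    bool eof;           /* set once the EOF block has been consumed */
    void (*ready_cb) (struct kz *kz, void *arg);
    void *ready_arg;
};

/* Placeholder: real code decodes the zio json block to detect EOF. */
static bool block_is_eof (const char *json_str)
{
    (void)json_str;
    return false;
}

/* Placeholder: real code would install the kvs_watch() here, as before. */
static void install_watch (struct kz *kz)
{
    (void)kz;
}

static void lookup_continuation (flux_future_t *f, void *arg);

/* Look up block kz->seq and arrange for the continuation to run when
 * the result arrives; nothing blocks the reactor here. */
static int lookup_next (struct kz *kz)
{
    char key[256];
    flux_future_t *f;

    snprintf (key, sizeof (key), "%s.%06d", kz->name, kz->seq);
    if (!(f = flux_kvs_lookup (kz->h, NULL, 0, key)))
        return -1;
    if (flux_future_then (f, -1., lookup_continuation, kz) < 0) {
        flux_future_destroy (f);
        return -1;
    }
    return 0;
}

static void lookup_continuation (flux_future_t *f, void *arg)
{
    struct kz *kz = arg;
    const char *json_str;

    if (flux_kvs_lookup_get (f, &json_str) < 0) {
        if (errno == ENOENT && !kz->eof)
            install_watch (kz);   /* more data may still arrive: watch now */
        /* other errors: real code saves them (see the error-handling
         * commit below) rather than dropping them */
        flux_future_destroy (f);
        return;
    }
    kz->seq++;
    if (block_is_eof (json_str))
        kz->eof = true;           /* EOF consumed: no watch ever needed */
    if (kz->ready_cb)
        kz->ready_cb (kz, kz->ready_arg);
    flux_future_destroy (f);
    if (!kz->eof)
        lookup_next (kz);         /* chain to the next block */
}
```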
Problem: zio decode errors were ignored in getnext_blocking(). Handle that error.
Problem: errors that occur in the context of a KVS continuation in the read path were not saved so they could be returned to the user on the next kz_get(), kz_get_json(), or kz_close() call. Check for a saved error in kz_get(), kz_get_json(), and kz_close(). When a continuation encounters an error, save it and call the user's callback. A couple of inappropriate log messages were also removed, and some error paths were further cleaned up.
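The "save it, surface it on the next call" behavior described here is commonly implemented with a per-handle errnum field. A generic sketch of that pattern follows; field and function names are illustrative, not the actual libkz code.

```c
#include <errno.h>

/* Sketch: remember an error raised inside an asynchronous KVS
 * continuation and report it on the next synchronous call. */
struct kz {
    int saved_errnum;   /* 0 = no deferred error */
    /* ... other state ... */
};

/* Called from a continuation when something goes wrong. */
static void kz_save_error (struct kz *kz, int errnum)
{
    if (kz->saved_errnum == 0)
        kz->saved_errnum = errnum;   /* keep the first error seen */
}

/* Called at the top of kz_get(), kz_get_json(), and kz_close(). */
static int kz_check_saved_error (struct kz *kz)
{
    if (kz->saved_errnum != 0) {
        errno = kz->saved_errnum;
        return -1;
    }
    return 0;
}
```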
Problem: many of the public kz functions will crash if passed NULL arguments. Add checks for absurd argument values.
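The checks added here are presumably of the usual EINVAL-guard form below. This is a generic illustration assuming a kz_get(kz_t *kz, char **datap) prototype, not the exact libkz code.

```c
#include <errno.h>

typedef struct kz kz_t;

/* Reject absurd arguments up front with EINVAL instead of crashing. */
int kz_get (kz_t *kz, char **datap)
{
    if (!kz || !datap) {
        errno = EINVAL;
        return -1;
    }
    /* ... normal read path ... */
    return 0;
}
```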
Problem: if an iowatcher internally gets a kz_get() error, it has no way to inform the user. Add an err argument to the iowatcher callback that is nil when there is no error, and set to an error string when there is one.
Problem: lua iowatcher test 5 'iowatcher returns error correctly' fails after the kvs_watch deferral change. This test attempts to create an iowatcher on a non-directory and expects iowatcher creation to fail. Now that the lookup is no longer performed in the context of iowatcher creation, creation succeeds. Change the test's iowatcher callback to stop the reactor if 'err' is non-nil, then change the test to run the reactor and expect an ENOTDIR failure.
Problem: kzutil --attach is not used by any tests. Drop the attach code, which is hard-wired for wreck's stdio naming convention. This leaves kzutil --copy as the main mode, so drop the --copy option and rename the utility to kzcopy. Update users.
Problem: sharness test driver for kzcopy explicitly sets blocksize for I/O to 4096 which is the default. Drop the -b 4096 option from kzcopy calls.
Problem: test coverage is thin for KZ_FLAGS_NONBLOCK. Add a kzcopy --non-blocking option that uses the kz ready_cb_f callbacks to read a kz stream. Add a couple more sharness tests to exercise this option.
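A rough sketch of what the --non-blocking read path might look like. The kz_open()/kz_set_ready_cb()/kz_get() prototypes, flags, and return conventions shown here are assumptions (the real ones live in kz.h); the flux reactor calls are the standard libflux API.

```c
/* Sketch: drain a kz stream from a ready callback instead of using
 * blocking reads. kz prototypes and return conventions are assumed. */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <flux/core.h>

#include "kz.h"   /* flux-core internal header (not installed) */

static void ready_cb (kz_t *kz, void *arg)
{
    flux_t *h = arg;
    char *data;
    int len;

    /* Read everything currently available. */
    while ((len = kz_get (kz, &data)) > 0) {
        fwrite (data, 1, len, stdout);
        free (data);
    }
    if (len == 0) {                        /* assumed: 0 means EOF */
        kz_close (kz);
        flux_reactor_stop (flux_get_reactor (h));
    }
    else if (errno != EAGAIN) {            /* assumed: EAGAIN = call again later */
        perror ("kz_get");
        flux_reactor_stop_error (flux_get_reactor (h));
    }
    /* on EAGAIN: simply return and wait for the next ready callback */
}

int main (int argc, char **argv)
{
    flux_t *h;
    kz_t *kz;

    if (argc != 2 || !(h = flux_open (NULL, 0)))
        exit (1);
    if (!(kz = kz_open (h, argv[1], KZ_FLAGS_READ | KZ_FLAGS_NONBLOCK)))
        exit (1);
    kz_set_ready_cb (kz, ready_cb, h);
    if (flux_reactor_run (flux_get_reactor (h), 0) < 0)
        exit (1);
    flux_close (h);
    return 0;
}
```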
Problem: kz EINVAL checks have no test coverage. Add a TAP test to cover invalid arguments. (Unfortunately not a lot can be covered here because kz requires a broker connection and working KVS to get very far).
Rebased, improved commit messages somewhat, and squashed some incremental development. |
We are now up to 9 hours and 46 minutes. The old version has printed 366 lines. |
Agreed, I can even hack something in for splash short term. The stat command is telling me there are 740 active kvs watches. The big flux instance is currently almost unusable, and I'm trying to figure out why. It's running, it's responding, but it's spending all of its time in the kvs module processing transactions. The reason this surprises me is that the long-running attach and a wreck ls are the only things running, and it's pegged at 100% of one CPU continually. |
@chu11, would you mind having a quick look at this and press the button if you find nothing amiss? The test coverage is somewhat improved from where it was, but commensurate with how much effort I think we want to put into this old stuff (in other words, not quite par). I think I'm OK with that. |
Ping @chu11? |
As discussed in #1420, this PR changes libkz so that the `kz_ready_f` registration does not immediately trigger a `kvs_watch()` on the stream directory. Instead, it uses chained KVS lookups and continuations to iterate over the stream until the available "blocks" are exhausted. At that point, if EOF is not reached, a `kvs_watch()` is installed as before. However, if EOF is reached, the `kvs_watch()` is entirely avoided.

This should allow `flux wreck attach` to process the output of a completed job without ever using `kvs_watch()`, provided that each stream has reached EOF (which might not be the case until we fix #1419). Hopefully this is more scalable.

It might also change the timing for the "redirected" case, since the `kvs_watch()` calls (with embedded synchronous RPC) will only occur during I/O processing, not during initialization, and possibly more staggered in time.

I haven't tested this yet at scale, but it seems to be working in my small tests. Will try to do some scale testing tomorrow.
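To make the before/after difference concrete, here is a sketch of what the callback registration amounts to under this change. The type, field, and function names are illustrative (not necessarily the real libkz prototypes), and lookup_next() refers to the chained-lookup sketch shown after the commit messages above.

```c
/* Sketch: under this PR, registering the ready callback no longer
 * installs a kvs_watch(); it only kicks off the first chained lookup.
 * A watch is installed later, and only if the stream is not already
 * at EOF. All names here are illustrative. */

typedef struct kz kz_t;
typedef void (*kz_ready_f) (kz_t *kz, void *arg);

struct kz {
    kz_ready_f ready_cb;
    void *ready_arg;
    /* ... */
};

extern int lookup_next (kz_t *kz);   /* chained lookup, see earlier sketch */

int kz_set_ready_cb (kz_t *kz, kz_ready_f ready_cb, void *arg)
{
    kz->ready_cb = ready_cb;
    kz->ready_arg = arg;
    /* Previously: a synchronous kvs_watch() on the stream directory was
     * set up right here, one RPC per stream, even for completed jobs.
     * Now that happens only if lookup_next() runs out of existing
     * blocks before seeing EOF. */
    return lookup_next (kz);
}
```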