Fixes for user-tool (forward port to main of #3440) #3465

ndokos · 2023-06-15T06:48:35Z

Fixes for user-tool

The tool meister was treating tools options as a concatenated string that was passed as a single argument to pbench-[install|start|stop]-tools. They are now split into separate arguments, allowing the scripts to work correctly.

pbench-postprocess-tools was being called with empty tool options. Instead of burrowing even deeper into the filesystem to grab the options, we get rid of most of that by rewriting it to use pbench-list-tools -g <group> -o and parsing the output of that script. The parsing is ugly but is arguably less ugly than the file system ad-hockery that we used to go through.

Fix failures in the python tests

test_install was using a NullObject for a logger, but the install method now logs its command. So we set up a basic logger to satisfy that need.

Two other tests (test_start and test_stop) failed on an is test. We now test for equality, rather than identity and the tests pass. I'm not sure what caused these two failures, but the cause is probably the breakup of the options into separate arguments.

Fix legacy util-scripts tests

Tests -51, -52 and -61 were failing because of the additional install logging.

Tests -53, -56 and -57 now pass: the new output looks correct to me and the old output looks wrong.

The util-scripts also pass in the CI container (run locally with jenkins/run tox -- agent util-scripts), except for -51 and -52 which fail with identical errors:

pbench-tool-meister-stop: waiting for tool-data-sink (#####) to exit

The line is missing for some reason - I hope that the real Jenkins will pass them, but it's not clear to me why they fail locally. OTOH, they do pass locally when run with tox -- agent util-scripts. Explanations/theories/guesses are welcome.

PBENCH-1172

webbnh

I'm not sure what caused [the test_start and test_stop] failures, but the cause is probably the breakup of the options into separate arguments.

I couldn't find it either (I kinda got lost in the layers of mocking), but I concur with your instinct. However, the options appear to be "opt1;opt2" which doesn't seem like it would get much exercise (i.e., I would expect shlex.split() to leave it basically unchanged)...which, looking at the test code is probably a pretty Good Thing™, because it looks like the test isn't expecting the options to be split into separate arguments....

it's not clear to me why [-51 and -52] fail locally. OTOH, they do pass locally when run with tox -- agent util-scripts. Explanations/theories/guesses are welcome.

I routinely see different behavior between my test environment and Mr. Jenkins'. I believe that those two tests actually involve multiple processes, which means that they are potentially nondeterministic not only in terms of execution order but also in terms of output capture. So, it's possible that in your environment, the TM gets to exit before the command tries to wait, and so it doesn't issue that message.

In terms of review, I found a typo in a comment and I have a couple of code suggestions...nothing that I would block the merge for.

agent/util-scripts/pbench-postprocess-tools

webbnh · 2023-06-16T21:30:20Z

agent/util-scripts/pbench-postprocess-tools

+	# Parse an entry from the output of `pbench-list-tools` above.
+	# The format is: "group: <group>; host: <host>; tools: <tool> <options, <tool> <options>, ...


The first <options is missing its closing angle bracket.

lib/pbench/agent/tool_meister.py

* Fixes for user-tool The tool meister was treating tools options as a concatenated string that was passed as a single argument to `pbench-install/start/stop-tools`. They are now split into separate arguments, allowing the scripts to work correctly. `pbench-postprocess-tools` was being called with empty tool options. Instead of burrowing even deeper into the filesystem to grab the options, we get rid of most of that by rewriting it to use `pbench-list-tools -g <group> -o` and parsing the output of that script. The parsing is ugly but is arguably less ugly than the file system ad-hockery that we used to go through. * Fix failures in the python tests `test_install` was using a NullObject for a logger, but the `install` method now logs its command. So we set up a basic logger to satisfy that need. Two other tests (`test_start` and `test_stop`) failed on an `is` test. We now test for equality, rather than identity and the tests pass. I'm not sure what caused these two failures, but the cause is probably the breakup of the options into separate arguments. * Fix legacy util-scripts tests Tests -51, -52 and -61 were failing because of the additional `install` logging. Tests -53, -56 and -57 now pass: the new output looks correct to me and the old output looks wrong. The util-scripts also pass in the CI container (run locally with `jenkins/run tox -- agent util-scripts`), except for -51 and -52 which fail with identical errors: - pbench-tool-meister-stop: waiting for tool-data-sink (#####) to exit The line is missing for some reason - I hope that the real Jenkins will pass them, but it's not clear to me why they fail locally. OTOH, they do pass locally when run with `tox -- agent util-scripts`. Explanations/theories/guesses are welcome. PBENCH-1172

ndokos · 2023-06-26T16:54:06Z

This is a rebase plus a second commit to address @webbnh's review comments.

webbnh

Looks good...except for this nit which is still there. 🙂

ndokos added bug Agent tools Of and related to the operation and behavior of various tools (iostat, sar, etc.) labels Jun 15, 2023

ndokos added this to the v0.73 milestone Jun 15, 2023

ndokos self-assigned this Jun 15, 2023

ndokos modified the milestones: v0.73, v0.72 Jun 15, 2023

ndokos force-pushed the fwd3440 branch from 909c685 to 1028de8 Compare June 15, 2023 06:50

webbnh previously approved these changes Jun 16, 2023

View reviewed changes

ndokos force-pushed the fwd3440 branch from 1028de8 to 3c5b93d Compare June 26, 2023 16:51

ndokos dismissed webbnh’s stale review via fdfc915 June 26, 2023 16:55

Address review comments

52a49b2

ndokos force-pushed the fwd3440 branch from fdfc915 to 52a49b2 Compare June 26, 2023 18:01

dbutenhof approved these changes Jun 26, 2023

View reviewed changes

webbnh approved these changes Jun 26, 2023

View reviewed changes

Merge branch 'main' into fwd3440

25a83d3

ndokos requested a review from webbnh June 26, 2023 21:51

ndokos merged commit 74ce886 into distributed-system-analysis:main Jun 27, 2023

ndokos deleted the fwd3440 branch June 27, 2023 15:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fixes for user-tool (forward port to main of #3440) #3465

Fixes for user-tool (forward port to main of #3440) #3465

ndokos commented Jun 15, 2023 •

edited by webbnh

Loading

webbnh left a comment

webbnh Jun 16, 2023

ndokos commented Jun 26, 2023

webbnh left a comment

		# Parse an entry from the output of `pbench-list-tools` above.
		# The format is: "group: <group>; host: <host>; tools: <tool> <options, <tool> <options>, ...

Fixes for user-tool (forward port to main of #3440) #3465

Fixes for user-tool (forward port to main of #3440) #3465

Conversation

ndokos commented Jun 15, 2023 • edited by webbnh Loading