Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes for user-tool (forward port to main of #3440) #3465

Merged
merged 3 commits into from
Jun 27, 2023

Conversation

ndokos
Copy link
Member

@ndokos ndokos commented Jun 15, 2023

Fixes for user-tool

The tool meister was treating tools options as a concatenated string that was passed as a single argument to pbench-[install|start|stop]-tools. They are now split into separate arguments, allowing the scripts to work correctly.

pbench-postprocess-tools was being called with empty tool options. Instead of burrowing even deeper into the filesystem to grab the options, we get rid of most of that by rewriting it to use pbench-list-tools -g <group> -o and parsing the output of that script. The parsing is ugly but is arguably less ugly than the file system ad-hockery that we used to go through.

Fix failures in the python tests

test_install was using a NullObject for a logger, but the install method now logs its command. So we set up a basic logger to satisfy that need.

Two other tests (test_start and test_stop) failed on an is test. We now test for equality, rather than identity and the tests pass. I'm not sure what caused these two failures, but the cause is probably the breakup of the options into separate arguments.

Fix legacy util-scripts tests

Tests -51, -52 and -61 were failing because of the additional install logging.

Tests -53, -56 and -57 now pass: the new output looks correct to me and the old output looks wrong.

The util-scripts also pass in the CI container (run locally with jenkins/run tox -- agent util-scripts), except for -51 and -52 which fail with identical errors:

pbench-tool-meister-stop: waiting for tool-data-sink (#####) to exit

The line is missing for some reason - I hope that the real Jenkins will pass them, but it's not clear to me why they fail locally. OTOH, they do pass locally when run with tox -- agent util-scripts. Explanations/theories/guesses are welcome.

PBENCH-1172

@ndokos ndokos added bug Agent tools Of and related to the operation and behavior of various tools (iostat, sar, etc.) labels Jun 15, 2023
@ndokos ndokos added this to the v0.73 milestone Jun 15, 2023
@ndokos ndokos self-assigned this Jun 15, 2023
@ndokos ndokos modified the milestones: v0.73, v0.72 Jun 15, 2023
webbnh
webbnh previously approved these changes Jun 16, 2023
Copy link
Member

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what caused [the test_start and test_stop] failures, but the cause is probably the breakup of the options into separate arguments.

I couldn't find it either (I kinda got lost in the layers of mocking), but I concur with your instinct. However, the options appear to be "opt1;opt2" which doesn't seem like it would get much exercise (i.e., I would expect shlex.split() to leave it basically unchanged)...which, looking at the test code is probably a pretty Good Thing™, because it looks like the test isn't expecting the options to be split into separate arguments....

it's not clear to me why [-51 and -52] fail locally. OTOH, they do pass locally when run with tox -- agent util-scripts. Explanations/theories/guesses are welcome.

I routinely see different behavior between my test environment and Mr. Jenkins'. I believe that those two tests actually involve multiple processes, which means that they are potentially nondeterministic not only in terms of execution order but also in terms of output capture. So, it's possible that in your environment, the TM gets to exit before the command tries to wait, and so it doesn't issue that message.

In terms of review, I found a typo in a comment and I have a couple of code suggestions...nothing that I would block the merge for.

agent/util-scripts/pbench-postprocess-tools Outdated Show resolved Hide resolved
Comment on lines +93 to +94
# Parse an entry from the output of `pbench-list-tools` above.
# The format is: "group: <group>; host: <host>; tools: <tool> <options, <tool> <options>, ...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first <options is missing its closing angle bracket.

lib/pbench/agent/tool_meister.py Outdated Show resolved Hide resolved
* Fixes for user-tool

The tool meister was treating tools options as a concatenated string that
was passed as a single argument to `pbench-install/start/stop-tools`.
They are now split into separate arguments, allowing the scripts to work
correctly.

`pbench-postprocess-tools` was being called with empty tool options.
Instead of burrowing even deeper into the filesystem to grab the options,
we get rid of most of that by rewriting it to use `pbench-list-tools -g
<group> -o` and parsing the output of that script. The parsing is ugly
but is arguably less ugly than the file system ad-hockery that we used to
go through.

* Fix failures in the python tests

`test_install` was using a NullObject for a logger, but the `install`
method now logs its command. So we set up a basic logger to satisfy
that need.

Two other tests (`test_start` and `test_stop`) failed on an `is`
test. We now test for equality, rather than identity and the tests
pass. I'm not sure what caused these two failures, but the cause is
probably the breakup of the options into separate arguments.

* Fix  legacy util-scripts tests

Tests -51, -52 and -61 were failing because of the additional `install`
logging.

Tests -53, -56 and -57 now pass: the new output looks correct to me and
the old output looks wrong.

The util-scripts also pass in the CI container (run locally with
`jenkins/run tox -- agent util-scripts`), except for -51 and -52 which
fail with identical errors:

- pbench-tool-meister-stop: waiting for tool-data-sink (#####) to exit

The line is missing for some reason - I hope that the real Jenkins will
pass them, but it's not clear to me why they fail locally. OTOH, they do
pass locally when run with `tox -- agent util-scripts`.
Explanations/theories/guesses are welcome.

PBENCH-1172
@ndokos
Copy link
Member Author

ndokos commented Jun 26, 2023

This is a rebase plus a second commit to address @webbnh's review comments.

Copy link
Member

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good...except for this nit which is still there. 🙂

@ndokos ndokos requested a review from webbnh June 26, 2023 21:51
@ndokos ndokos merged commit 74ce886 into distributed-system-analysis:main Jun 27, 2023
@ndokos ndokos deleted the fwd3440 branch June 27, 2023 15:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Agent bug tools Of and related to the operation and behavior of various tools (iostat, sar, etc.)
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants