Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Streamline parameter passing and front-load parameter object creation #2766

Closed

Conversation

webbnh
Copy link
Member

@webbnh webbnh commented Apr 20, 2022

This PR is a re-do of portante/pbench #19, expanded to include the TM as well as the TDS, and excluding the UUID changes. The intention is that #2760 should be layered on top of this PR.

This PR front-loads the creations of the ToolDataSinkParams and ToolMeisterParams objects. This allows errors in their constructions to be detected up-front without having to undergo the constructions twice; the constructed object is then passed down to the subroutines instead of passing the parameter-block et al. which would otherwise have been used to construct and validate them.

@webbnh webbnh added Agent Tool Meister Of and relating to the Tool Meister sub-system labels Apr 20, 2022
@webbnh webbnh added this to the v0.71 milestone Apr 20, 2022
@webbnh webbnh requested a review from portante April 20, 2022 17:31
@webbnh webbnh self-assigned this Apr 20, 2022
@webbnh webbnh force-pushed the parameter-playing branch 6 times, most recently from ff710de to cb590cb Compare April 20, 2022 22:23
lib/pbench/agent/tool_meister.py Outdated Show resolved Hide resolved
lib/pbench/agent/tool_data_sink.py Outdated Show resolved Hide resolved
@webbnh webbnh force-pushed the parameter-playing branch 7 times, most recently from d2bfd03 to 81f02f5 Compare April 21, 2022 01:21
Copy link
Member Author

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notes for reviewers.

@@ -2060,7 +2062,9 @@ def start(prog: Path, parsed: Arguments):
# E.g. params = '{ "channel_prefix": "some-prefix",
# "benchmark_run_dir": "/loo/goo" }'
params = json.loads(params_str)
ToolDataSink.fetch_params(params, pbench_run)

benchmark_run_dir = BenchmarkRunDir(params["benchmark_run_dir"], pbench_run)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to gloss over one "rough edge" here relative to the way Peter had structured them: if "benchmark_run_dir" is not present in the JSON object, then we'll get a KeyError raised here rather than a ToolDataSinkError.

The error will be handled with reasonable grace either way, and I think the slight inconsistency is worth the increased simplicity in the code. @portante, if you disagree, I can add either a try block or an explicit key-check before accessing params. However, I think we can just ensure that the key will be there via testing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just have the ToolDataSinkParams field be a string, and then let the ToolDataSink constructor convert it to a BenchmarkRunDir object?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@portante, I'm not sure what you're asking.

I think that "ToolDataSinkParams as a string" is the value in params here; passing that around creates the problem that I was trying to solve in this PR: we want to validate it here, and the best way to validate it is to construct the ToolDataSinkParams object; if we defer that construction, then we have to come up with a separate way to validate it here (and constructing it twice seems silly for a couple of reasons).

Also, the instantiation of the BenchmarkRunDir object requires the value of pbench_run; if we do that here, then we don't have to pass that value around, either (and we find out up front if there's a problem with it).

So, I think this is the best location for instantiating both the BenchmarkRunDir and the ToolDataSinkParams objects. The lingering question is, if the instantiation of the BenchmarkRunDir fails, how do we want to see that reported?...does it need to be a ToolDataSinkError or is some other error (such as KeyError) acceptable?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps see webbnh#2?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me get this PR split, first. Then I can look at that. 🥴

lib/pbench/agent/tool_meister.py Outdated Show resolved Hide resolved
@webbnh webbnh force-pushed the parameter-playing branch from 81f02f5 to 8450c8c Compare April 21, 2022 01:42
@webbnh webbnh marked this pull request as ready for review April 21, 2022 01:42
Copy link
Member

@portante portante left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to discuss a bit more what is happening here.

If one person's IDE has a linter that is not available to everybody else, should we be making changes based on that?

If our flake8 and black are not sufficient, then we should be able to add a linter in the same way so that all code passes the same test.

Can we consider getting that change in first before we start making so many changes based on an obscure linter?

There is a lot of change here before the end of the release, and none of it solves some bug. All of these changes are good and desirable, but given how close we are to the release are they really worth taking in now? It seems we really need to raise the bar higher for changes so that we reduce the risk of destabilizing the release.

We have been working on reducing the size of commits as well. It might be worth to break this up into:

  1. Add a linter run as part of the pre-commit check and tox along side flake8 and black, fixing everything the new linter finds
  2. Streamline parameter passing and front-load parameter object creation

lib/pbench/agent/tool_meister.py Outdated Show resolved Hide resolved
for dir, _, files in os.walk(tool_dir):
dirpath = Path(dir).relative_to(tool_dir)
for dir_, _, files in os.walk(tool_dir):
dirpath = Path(dir_).relative_to(tool_dir)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this trailing underscore needed? Could we use dirent instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The trailing underscore is not, strictly speaking needed, but it is advised as dir is a Python reserved word.

I added it under the advice of PEP-8, but I suppose we could use dirent instead. (That's less Pythonic, but it's consistent with other Python code in the project, which looks like Bash inheritance....)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dir is a built-in, not a keyword, so the spirit is good, but trailing single underscores are a terrible idea in PEP-8, especially when other names could be used.

Now leading single-underscores is a different story ... ;)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is addressed in #2769.

dbutenhof
dbutenhof previously approved these changes Apr 21, 2022
Copy link
Member

@dbutenhof dbutenhof left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not thrilled about the canonicalize name, but I've never seen another name for this sort of thing that has struck me as particularly useful; ReflectionToStringBuilder and GsonBuilder don't exactly roll off the tongue either, though I like the standard Java's explicit reference to "reflection".

named_tuple_string_builder is pretty ugly, too. Ah, well.

I get Peter's comments about the hyper linter you're using; it wouldn't be a bad idea to at least identify that somewhere. (My vscode has started exposing dust bunnies that LGTM and flake8 don't see, but frankly I'm not even sure what it's using.) I like the idea of fixing them where they're obviously real issues regardless of whether anyone else would have seen them; at worst, it does no harm.

He does have a point that if there are better lint filters it'd be nice to have the build using them... if we can figure out how to make that happen.

lib/pbench/agent/tool_meister.py Outdated Show resolved Hide resolved
@webbnh
Copy link
Member Author

webbnh commented Apr 21, 2022

If one person's IDE has a linter that is not available to everybody else, should we be making changes based on that?

Let's pose the inverse of that question: if one person's IDE has a linter that is not available to everybody else, should we be rejecting changes based on that? That seems silly if, on the face of them, there is nothing wrong with the proposed changes. So, pretend that I didn't mention the motivations for the changes, and just look and see if they make the code better.

If our flake8 and black are not sufficient, then we should be able to add a linter in the same way so that all code passes the same test.

I don't think we really want to do that. The reason why we call it "lint", instead of, say, "bugs", is that the items fall at various points along a continuum between "good" and "almost bad", and many of them aren't worth addressing. (My PR fixes only a small number of the complaints, some of them (like the use of PROG as a variable or parameter (because it's solid-caps) I don't expect us to change...).

It seems we really need to raise the bar higher for changes so that we reduce the risk of destabilizing the release.

This is a good point. I suggest that we apply that bar uniformly to all PRs (including the ones from yesterday...). But, also, I think it's reasonable to look at the content of the changes to assess the risk. (I think all of the changes I made to appease the linter are "low risk".)

We have been working on reducing the size of commits as well. It might be worth to break this up

As PRs go, this is not a large one by any means (and, much of the changes are in the gold files 😝). Nevertheless, I'll split the PR.

Add a linter run as part of the pre-commit check and tox along side flake8 and black, fixing everything the new linter finds

That would be a titanic PR. 🙃 I'm not about to even try that.

@portante
Copy link
Member

Certainly the view points on the use of linters is debatable. And the process of how to structure PRs to facilitate change being made smoothly is helped by reduced technical debt, increased test coverage, and a reliable functional test suite (which we don't have). And this PR is another step towards paying down our technical debt.

No one is rejecting this PR.

The thing is, we are proposing making this PR gate PR #2760. That no longer makes sense to me given the nature and direction of this PR.

The end goal of PR #2760 is to get an implementation of the pbench-tools-kill command posted (delivered in PR #2743). But these changes, while desirable and worth-while, are beyond the scope of end-game for the release.

I want to emphasize again, that I believe we should be making these changes.

I just don't believe we should be making this changes ahead of the PRs #2760, #2763, and #2743 and gate the v0.71 release.

Let's consider this PR independently and merge it after we get v0.71 out the door.

@webbnh
Copy link
Member Author

webbnh commented Apr 21, 2022

I'm not thrilled about the canonicalize name

@dbutenhof, I'm not in love with it either, but it seemed reasonable. And, I'm sorta hoping that we expand it beyond just NamedTuple's into something fully generalized...but there's no call for that in this PR.

I get Peter's comments about the hyper linter you're using; it wouldn't be a bad idea to at least identify that somewhere. (My vscode has started exposing dust bunnies that LGTM and flake8 don't see, but frankly I'm not even sure what it's using.) I like the idea of fixing them where they're obviously real issues regardless of whether anyone else would have seen them; at worst, it does no harm.

I have to laugh at the pejorative qualifiers that you and Peter apply to the linter! It's just a linter...haven't you guys used one before? (The man page for the original lint, under the "Bugs" section used to say, "There are some things that you just can't get lint to shut up about.") It appears to be built into or encapsulated inside PyCharm. I can't seem to find any obvious, separable identification, and it's not clear that there is only one linter involved.

Like you, I think we should address the items that it reports when they are "real issues" (whatever that actually means); and, if the resulting changes pass code review, I don't see the harm (that is, if the changes are obviously good enough to justify the code churn and the review cost, then I think we should accept them).

He does have a point that if there are better lint filters it'd be nice to have the build using them... if we can figure out how to make that happen.

I certainly agree. However, in the case of Black (and Flake8, in general), the assessments are non-negotiable. If we had some sort of static code analyzer, any bugs it reported would be on a similar level. But, with lint, a lot of it comes down to what level of orthodoxy we want to follow. (E.g., PEP-8 is supposed to be advisory...but if we require our code to be lint-free it will become mandatory, and I don't think we'll like that.) So, at least for now, I'm inclined to propose changes based on the more reasonable of my IDE's comments, and do my best to shield us from the rest of them. (E.g., do we really want to consider replacing all instances of except Exception??...'cause my IDE objects to every one of them as being "too general".... 😬)

@webbnh
Copy link
Member Author

webbnh commented Apr 21, 2022

Let's consider this PR independently and merge it after we get v0.71 out the door.

No...let's split off the objectionable portions to a separate PR which we can evaluate on its own merits and leave the rest of them to support the present v0.71 effort. And, I'm just about to do that.

@portante
Copy link
Member

He does have a point that if there are better lint filters it'd be nice to have the build using them... if we can figure out how to make that happen.

I certainly agree. However, in the case of Black (and Flake8, in general), the assessments are non-negotiable. If we had some sort of static code analyzer, any bugs it reported would be on a similar level. But, with lint, a lot of it comes down to what level of orthodoxy we want to follow. (E.g., PEP-8 is supposed to be advisory...but if we require our code to be lint-free it will become mandatory, and I don't think we'll like that.) So, at least for now, I'm inclined to propose changes based on the more reasonable of my IDE's comments, and do my best to shield us from the rest of them. (E.g., do we really want to consider replacing all instances of except Exception??...'cause my IDE objects to every one of them as being "too general".... 😬)

Great, this is a great argument for addressing these kinds of changes after we release v0.71. =)

@webbnh webbnh force-pushed the parameter-playing branch from 9556b33 to 5a5fd5e Compare April 21, 2022 17:10
@webbnh
Copy link
Member Author

webbnh commented Apr 21, 2022

Great, this is a great argument for addressing these kinds of changes after we release v0.71. =)

Um, only if "these kinds of changes" refers to adding an additional linter to our CI tool-chain. If "these kinds of changes" refers to improving the quality of our code, then not so much -- those should be looked at on an individual basis.

In any case, I've split the "linting" changes off into a separate branch, so they are no longer part of this PR.

@webbnh webbnh requested review from portante and dbutenhof April 21, 2022 17:21
@webbnh webbnh changed the title Pick lint, streamline parameter passing, and front-load parameter object creation Streamline parameter passing and front-load parameter object creation Apr 21, 2022
Copy link
Member

@portante portante left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even with the PR posted to fix the promotion of BenchmarkRunDir object outside of the ToolDataSink class, it still feels like this work is best suited to land after v0.71 where we can work out the changes over time and refine this.

I don't feel this should block PR #2760.

We have already delayed work by two days now towards landing PR #2743, and we have other work for v0.71 that needs to land.

Let's hold our noses and git-r-done.

@webbnh
Copy link
Member Author

webbnh commented Apr 22, 2022

Rather than do this work here, let's do it in #2760, which is where it belongs.

@webbnh webbnh closed this Apr 22, 2022
@webbnh webbnh deleted the parameter-playing branch July 13, 2022 18:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Agent Code Infrastructure enhancement Tool Meister Of and relating to the Tool Meister sub-system
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants