-
-
Notifications
You must be signed in to change notification settings - Fork 644
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Audit/rewrite PNT30
flake8 lint failures to achieve more concurrency
#18365
Labels
Comments
huonw
added a commit
that referenced
this issue
Feb 25, 2023
This fixes #7491 by implementing a flake8 rule that disallows using `await Get(...)` or `await MultiGet(...) inside the body of a loop. Generally, these should be `MultiGet` to allow them to run concurrently, AIUI. ```python # BAD: each Get runs sequentially [await Get(X, Y(i)) for i in range(10)] # GOOD: each Get runs in parallel await MultiGet(Get(X, Y(i)) for i in range(10)) ``` This rule sometimes flags correct code as an error: for instance, running formatting tools is inherently sequential as one needs to use the output of one as the input to the next. This can be allowed on a case-by-case basis via the `# noqa: PNT30` annotation. In order to get this rule landed faster (to reduce the chance of new issues appearing), I haven't tried to fix the existing issues flagged by the rule: this PR just applies that annotation to all current examples. For some that are obviously sequential, I've added a note explaining, while the rest have just `requires triage`. #18365 covers training these. I believe that sequential code will be the exception rather than the rule, and so occasional `noqa: PNT30` annotations is fine, vs. making the flake8 rule significantly more complicated (e.g. doing dataflow analysis to determine if the result of one `await` call might influence the next). For instance, one estimate: - `rg -t py MultiGet | grep -v test | wc -l` reveals 647 `MultiGet` calls - this PR only has to add 28 `# noqa: PNT30` annotations, and only 8 of these are definitely sequential (i.e. 20 of these might benefit from being `MultiGet`s) This doesn't flag loops containing `await some_helper()` where `async def some_helper(): ...` might contain various `await Get(...)`s, because there's currently no useful way to rewrite these to run concurrently (#18364).
This one was never an issue, as there is a return at the end of the for-body it effectively works as an pants/src/python/pants/engine/internals/dep_rules.py Lines 162 to 168 in eb175f5
Opted to fix the noqa text in #18112 to save on CI cycles. |
huonw
added a commit
that referenced
this issue
Apr 25, 2023
This triages/resolves a few `# noqa: PNT30` exclusions about the concurrency hazard of awaits in loops: - two instances that almost-trivially can switch from (something like) `(await Get(args) for ...)` to `await MultiGet(args for ...)` - one instance that's actually sequential This relates to #18365.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Is your feature request related to a problem? Please describe.
In #18303, a new flake8 rule is added that flags constructs like
[await Get(...) for ... in ...]
. If these are embarrassingly parallel (i.e. no dependencies between iterations), they should be rewritten to useMultiGet
: for instance,await MultiGet(Get(...) for ... in ...)
.That PR added
noqa: PNT30: requires triage
annotations to many instances of this, rather than fix them. This issue is about going through and either fixing them or making the annotation more specific.Embarrassingly parallel:
src/python/paths/backend/go/goals/debug_goals.py
src/python/paths/backend/go/goals/generate.py
src/python/paths/backend/go/util_rules/cgo.py
src/python/paths/backend/go/util_rules/go_bootstrap.py
src/python/paths/backend/go/util_rules/goroot.py
src/python/paths/backend/helm/util_rules/post_renderer.py
src/python/paths/backend/python/framework/django/detect_apps.py
src/python/paths/bsp/util_rules/targets.py
src/python/paths/core/goals/deploy.py
src/python/paths/core/goals/update_build_files.py
Triage some noqa: PNT30 await-in-loops #18831src/python/paths/engine/internals/build_files.py
Triage some noqa: PNT30 await-in-loops #18831src/python/paths/engine/internals/dep_rules.py
src/python/paths/jvm/goals/lockfile.py
src/python/paths/source/source_root.py
Triage some noqa: PNT30 await-in-loops #18831Embarrassingly parallel, except there's an early return, so just moving all invocations to a
MultiGet
might result in more work total.src/python/paths/backend/cc/util_rules/toolchain.py
src/python/paths/backend/python/goals/setup_py.py
src/python/paths/core/subsystems/python_bootstrap.py
Describe the solution you'd like
For each of the files above, rewrite the loops that call
await Get(...)
internally to useMultiGet
.Describe alternatives you've considered
N/A
Additional context
N/A
The text was updated successfully, but these errors were encountered: