flux-resource: improve performance of flux resource list
#5823
Merged
Problem: The ResourceSetExtra class is supposed to be just a wrapper around a ResourceSet class that adds the convenience `propertiesx` and `queue` properties. However, using this class is slow because it reinvokes the underlying ResourceSet initializer, which ends up creating an unnecessary copy of the resource set from scratch. Make the class a wrapper by just stashing the original ResourceSet object and forwarding unknown getattrs to the wrapped object. This avoids the wasted time recreating the copied resource set.
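A minimal sketch of the stash-and-forward pattern described above. This is not the flux-core code: `FakeResourceSet`, `ResourceSetWrapper`, `init_count`, and the `queue` argument are hypothetical names used only for illustration.

```python
class FakeResourceSet:
    """Hypothetical stand-in for an object that is expensive to build."""

    init_count = 0  # counts how many times the initializer runs

    def __init__(self, ranks):
        FakeResourceSet.init_count += 1
        self.ranks = sorted(ranks)

    def nnodes(self):
        return len(self.ranks)


class ResourceSetWrapper:
    """Wraps an existing object instead of rebuilding it (illustrative only)."""

    def __init__(self, rset, queue=None):
        self._rset = rset  # stash the original object; no copy is made
        self._queue = queue

    def __getattr__(self, name):
        # __getattr__ is only called when normal attribute lookup fails,
        # so attributes defined on the wrapper itself take precedence.
        if name == "_rset":
            # Avoid infinite recursion if _rset has not been set yet.
            raise AttributeError(name)
        return getattr(self._rset, name)

    @property
    def queue(self):
        # Convenience property added by the wrapper (assumed semantics).
        return self._queue or ""


rset = FakeResourceSet([3, 1, 2])
wrapped = ResourceSetWrapper(rset, queue="batch")
print(wrapped.nnodes(), wrapped.queue, FakeResourceSet.init_count)
```

Because the wrapper holds a reference rather than calling the wrapped class's initializer again, the initializer runs only once, however many wrappers are created.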
Problem: The `flux resource list` command spends the majority of its time in resources_uniq_lines() on large clusters because it iterates over every rank in each state resource set in order to create the set of unique lines. Instead, split the resource set into smaller subsets based on all combinations of properties contained within that set. For the purposes of the current incarnation of `flux resource list`, this should be the minimum number of distinct resource sets required to generate possibly unique lines of output.
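The idea of splitting by property combinations can be sketched with plain Python sets. This is an illustration under assumptions, not the actual implementation: `split_by_property_combinations` is a hypothetical helper, and resources are modeled as a set of ranks plus a mapping from property name to the ranks that carry it.

```python
from itertools import combinations


def split_by_property_combinations(all_ranks, properties):
    """Group ranks by the exact set of properties they carry.

    properties: dict mapping property name -> set of ranks (assumed model).
    Returns a dict mapping a tuple of property names -> set of ranks.
    """
    remaining = set(all_ranks)
    names = sorted(properties)
    subsets = {}
    # Visit larger combinations first so a rank with properties {a, b}
    # lands in the (a, b) subset rather than in (a,) or (b,).
    for size in range(len(names), 0, -1):
        for combo in combinations(names, size):
            ranks = set.intersection(*(properties[n] for n in combo)) & remaining
            if ranks:
                subsets[combo] = ranks
                remaining -= ranks
    if remaining:
        subsets[()] = remaining  # ranks with no properties at all
    return subsets


subsets = split_by_property_combinations(
    range(8), {"gpu": {0, 1, 2, 3}, "bigmem": {2, 3, 4, 5}}
)
for combo, ranks in subsets.items():
    print(combo, sorted(ranks))
```

Here eight ranks collapse into four subsets, so later per-line work runs four times instead of eight. The number of combinations grows quickly with the number of properties, so this sketch assumes that number is small.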
Problem: lines.values() does not guarantee a sorting order, but this is passed to formatter.print_items() in `flux resource list`, which could lead to arbitrary output order. Sort output lines on (resource state, first rank) to create reproducible output.
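A sketch of sorting on a (resource state, first rank) key. The `STATE_ORDER` mapping, the dictionary layout of each line, and the state names are assumptions made for illustration; the real objects in `flux resource list` may look different.

```python
# Assumed display order for states; hypothetical, for illustration only.
STATE_ORDER = {"free": 0, "allocated": 1, "down": 2}

# Hypothetical output lines keyed by an arbitrary identifier.
lines = {
    "k1": {"state": "down", "ranks": [7, 8]},
    "k2": {"state": "free", "ranks": [4, 5, 6]},
    "k3": {"state": "free", "ranks": [0, 1]},
    "k4": {"state": "allocated", "ranks": [2, 3]},
}


def sort_key(line):
    # Order by state first, then by the lowest rank in the line.
    return (STATE_ORDER[line["state"]], min(line["ranks"]))


ordered = sorted(lines.values(), key=sort_key)
for line in ordered:
    print(line["state"], line["ranks"])
```

Sorting with an explicit key makes the output independent of the order in which lines were inserted into the dictionary.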
garlick
approved these changes
Mar 24, 2024
LGTM! Nice results!
Thanks! I'll set mwp
Codecov Report
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5823   +/-   ##
=======================================
  Coverage   83.35%   83.35%
=======================================
  Files         509      509
  Lines       82494    82515   +21
=======================================
+ Hits        68759    68782   +23
+ Misses      13735    13733    -2
This was referenced Mar 25, 2024
This PR improves the performance of `flux resource list` by modifying some really inefficient code:

- `ResourceSetExtra` wrapper class unnecessarily reinitializes the `ResourceSet` argument. Just stash the `ResourceSet` instead and forward method calls to it to avoid this wasted work.
- `resources_uniq_lines()` iterates over each individual rank in each resource set to collect common lines of output together. Instead, split a resource set into all combinations of properties and iterate each of these sets. This should be the minimum number of sets required to create possibly unique lines (at least at this point). This reduces the iteration count significantly.

Timing before with a very large resource set (~10K ranks):
Timing after these changes:
Further improvements will probably have to be made in librlist itself.