Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

flux-resource: improve performance of flux resource list #5823

Merged
merged 3 commits into from
Mar 24, 2024

Conversation

grondo
Copy link
Contributor

@grondo grondo commented Mar 23, 2024

This PR improves the performance of flux resource list by modifying some really inefficient code:

  • The ResourceSetExtra wrapper class unnecessarily reinitializes the ResourceSet argument. Just stash the ResourceSet instead and forward method calls to it to avoid this wasted work.
  • resources_uniq_lines() iterates over each individual rank in each resource set to collect common lines of output together. Instead, split a resource set into all combinations of properties and iterate each of these sets. This should be the minimum number of sets required to create possibly unique lines (at least at this point). This reduces the iteration count significantly.

Timing before with a very large resource set (~10K ranks):

 time flux resource list --from-stdin < ../../large-list.json >/dev/null

real	0m6.878s
user	0m6.762s
sys	0m0.089s

Timing after these changes:

$ time src/cmd/flux resource list --from-stdin < ../../large-list.json >/dev/null

real	0m1.790s
user	0m1.662s
sys	0m0.102s

Further improvements will probably have to be made in librlist itself.

grondo added 3 commits March 24, 2024 02:02
Problem: The ResourceSetExtra class is supposed to be just a wrapper
around a ResourceSet class that adds the convenience `propertiesx`
and `queue` properties. However, using this class is slow because
it reinvokes the underlying ResourceSet initializer, which ends up
creating a unnecessary copy of the resource set from scratch.

Make the class a wrapper by just stashing the original ResourceSet
object and forwarding unknown getattrs to the wrapped object. This
avoids the wasted time recreating the coped resource set.
Problem: The `flux resource list` command spends the majority of its
time in resource_uniq_lines() on large clusters because it iterates
over every rank in each state resource set in order to create the
set of unique lines.

Instead split the resource into smaller subsets based on all
combinations of properties contained within that set. For the purposes
of the current incarnation of `flux resource list` this should be
the minimum number of distinct resource sets required to generate
possibly unique lines of output.
Problem: lines.values() does not guarantee a sorting order, but this
is passed to formatter.print_items() in `flux resource list`, which
could lead to arbitrary output order.

Sort output lines on (resource state, first rank) to create
reproducible output.
Copy link
Member

@garlick garlick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Nice results!

@grondo
Copy link
Contributor Author

grondo commented Mar 24, 2024

Thanks! I'll set mwp

Copy link

codecov bot commented Mar 24, 2024

Codecov Report

Merging #5823 (5693383) into master (a1070a3) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5823   +/-   ##
=======================================
  Coverage   83.35%   83.35%           
=======================================
  Files         509      509           
  Lines       82494    82515   +21     
=======================================
+ Hits        68759    68782   +23     
+ Misses      13735    13733    -2     
Files Coverage Δ
src/cmd/flux-resource.py 95.44% <100.00%> (+0.27%) ⬆️

... and 15 files with indirect coverage changes

@mergify mergify bot merged commit 794be81 into flux-framework:master Mar 24, 2024
34 of 35 checks passed
@grondo grondo deleted the issue#5819 branch March 24, 2024 14:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants