Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(rush,node-core-library): allow weighted async concurrency #4672

Merged
merged 28 commits into from
May 5, 2024

Conversation

aramissennyeydd
Copy link
Contributor

Summary

Followup to #4092. The goal of this PR is to allow rush users to define the weight of an operation, this is especially useful for operations that do not take a fair share of the available concurrency (either CPU or memory). Examples include jest's max-workers or other CPU intensive tasks, as well as memory intensive tasks where you may want to only run one heavy task without parallelism but you don't want to drop the overall rush parallelism.

This code is very similar to David's original PR, with a few modifications

  1. the implementation is now very similar to Async.forEachAsync()
  2. Operation weight can only be whole numbers.
  3. A new WeightedOperationPlugin that adds the weight to the operation and validates that the weight is in fact a whole number and >= 0.

There was some original discussion around the name of weight, I think it's a decent name and the one I would reach for when naming this as well. I reference concurrencyUnits in my PR as the weighted sum of the current active operations.

Details

The decision to use whole numbers for concurrency is both for maintainer sanity and because I was running into a deadlock with concurrency 1 and various weights < 1. Whole numbers seems to fix this as there's no longer any risk of floating point math causing weird issues. Definitely looking for additional use cases to test that theory on though.

I didn't do it in this PR, but theoretically, Async.forEachAsync could be reimplemented in terms of Async.forEachAsync with each item having a weight of 1.

How it was tested

Added unit tests and tested that this runs on the cobuild sandbox repo.

Impacted documentation

Any references to rush parallelism may benefit from this.

@aramissennyeydd
Copy link
Contributor Author

@dmichon-msft One more for ya 😅 , let me know if you want to revive your original PR. Happy to close this if so!

Copy link
Member

@iclanton iclanton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

common/reviews/api/node-core-library.api.md Outdated Show resolved Hide resolved
libraries/node-core-library/src/Async.ts Outdated Show resolved Hide resolved
@aramissennyeydd
Copy link
Contributor Author

Thought of one more case we'd want to cover, but not sure what the right path forward is.

Say for example concurrency = 4 and we have an array of items with weights of 1, 2, 4 respectively. With this implementation, all 3 operations will be queued at once.

That feels like the opposite behavior we'd want, where the operation of size 4 should be queued first, finished and then smaller tasks complete after it. Weights of 0 should be allowed to execute at the same time as a full concurrency operation.

There's a block, https://github.com/aramissennyeydd/rushstack/blob/6184642eae6682d919edada2a989426fb3c298d5/libraries/rush-lib/src/logic/operations/OperationExecutionManager.ts#L43-L48 that I think we could flip from a.criticalPathLength - b.criticalPathLength to b.criticalPathLength - a.criticalPathLength to reflect this behavior?

libraries/node-core-library/src/Async.ts Show resolved Hide resolved
libraries/node-core-library/src/Async.ts Outdated Show resolved Hide resolved
libraries/node-core-library/src/Async.ts Outdated Show resolved Hide resolved
libraries/node-core-library/src/Async.ts Outdated Show resolved Hide resolved
libraries/node-core-library/src/Async.ts Outdated Show resolved Hide resolved
libraries/node-core-library/src/Async.ts Outdated Show resolved Hide resolved
libraries/rush-lib/src/schemas/rush-project.schema.json Outdated Show resolved Hide resolved
@aramissennyeydd aramissennyeydd requested a review from iclanton May 3, 2024 19:45
libraries/node-core-library/src/Async.ts Outdated Show resolved Hide resolved
libraries/node-core-library/src/Async.ts Outdated Show resolved Hide resolved
libraries/node-core-library/src/Async.ts Show resolved Hide resolved
common/reviews/api/node-core-library.api.md Outdated Show resolved Hide resolved
Co-authored-by: Ian Clanton-Thuon <[email protected]>
@iclanton iclanton merged commit 21afaae into microsoft:main May 5, 2024
5 checks passed
@aramissennyeydd
Copy link
Contributor Author

@iclanton Thanks for the reviews on this! Excited to use this internally! 🎉

@aramissennyeydd
Copy link
Contributor Author

@iclanton Does this need a rush release? I don't see a new @microsoft/rush-lib release with the PhasedScriptAction changes.

@benkeen
Copy link

benkeen commented Sep 17, 2024

Hi @aramissennyeydd - thanks for contributing the feature! However, this option isn't yet documented and I'm not clear on exactly how to use it from the typings info:

The number of concurrency units that this operation should take up. The maximum concurrency units is determined by the -p flag."

We're porting over our build to use Rush cobuilds, which will run all work in a single phased command. However, certain operations can't be run in parallel with others: we need to disable rush parallelization (-p=1) on those operations, which led me to this option.

For cobuilds, a trick is to set the overall task to use something like -p=25% on the rush command, encouraging distribution of the work across multiple agents. I've been experimenting with different values for weight for our tasks that need -p=1, but it's extremely difficult to understand precisely what's happening - we're still getting failures consistent with what would happen if they weren't being ran with no parallelism. Your remark above made me think perhaps this isn't usable for us yet?

Say for example concurrency = 4 and we have an array of items with weights of 1, 2, 4 respectively. With this implementation, all 3 operations will be queued at once.

Any tips?

@aramissennyeydd
Copy link
Contributor Author

@benkeen I was in a similar situation which is what led me to implement this feature. Basically, we had 2 very expensive packages that were so resource intensive that if they ran on the same agent they would crash the build. The only way to move forward there was to either set parallelism to 1 (a non-starter) or go down this path.

What we've done with this is set parallelism to a known value, say 8, and then manually defined the weights for the expensive operations. The algorithm currently works such that if you set those 2 operations to weight 8, they cannot be scheduled on the same machine. That's worked very well for us.

Can you share more about your use case? Why can't run these operations run in parallel with others? Is this something that could be adjusted using the rush operation graph instead?

@benkeen
Copy link

benkeen commented Sep 17, 2024

That's encouraging! Thanks for the response :)

Yup, sounds very similar indeed. A while back we started using a new MSW-Storybook style of integration tests for our packages. There's a lot of positive things to say about them: you get more bang for your buck with the tests; they do a full test of the component with API calls; they're quite readable; you need fewer unit tests that often aren't awfully valuable in of themselves & add maintenance. But the big problem was that they take time to run (20-30 seconds isn't unusual) and don't parallelize well at all. We ended up creating a custom stage on our pipeline with -p=1 to ensure we only run 1 package's tests at a time.

What we've done with this is set parallelism to a known value, say 8, and then manually defined the weights for the expensive operations

Innnnnteresting. Our build agents have a total concurrency of 16, so it sounds like I'd do this:

  • set the rush command to -p=4 (i.e. 25% for us)
  • in the operation in rush-project.json, set weight to 4 for any operation where we don't want parallelism.

Sound about right?

@aramissennyeydd
Copy link
Contributor Author

aramissennyeydd commented Sep 17, 2024

@benkeen You've probably already run into this, but we had issues with MSW v1 memory leaks in unit tests and moved to v2 to fix some of that.

For the parallelism, that sounds right. You'll still get some other operations with <4 weight running in parallel, but we've found that that doesn't generally affect CI times and if it does you can pin those to a higher weight as well. Setting the weight = rush global parallelism will guarantee that that operation will not run in parallel with any other operation with that same weight set. Other operations can creep in if they start before the operation with weight 4 is picked up.

@benkeen
Copy link

benkeen commented Sep 17, 2024

Thanks @aramissennyeydd, appreciate the help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants