
feat: take the concurrencyLimit from feature flags and keep in dependencies #4564

Merged
merged 1 commit into master from feat/modifyable-concurrency-limit on Mar 18, 2022

Conversation

Contributor

@adrian-thurston adrian-thurston commented Mar 16, 2022

Pull the concurrencyLimit from feature flags at the start of the execution
process, before planning, and stash it in the ExecutionOptions, which live in
the ExecutionDependencies. From there it can then be modified by planner rules.
This allows parallelization rules to raise the limit if they parallelize a
query.

At the same time we move the defaultMemoryLimit from the planner into the
execution options, and move the computation of the memory limit and concurrency
quota from the planner into the executor.

Included are test cases covering both the existing and the new method of
determining query concurrency.
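Roughly, the flow looks like the sketch below. The type and function names are illustrative only, not the exact Flux API, but they show the intent: the limit is read once from the feature flag before planning, stored in the options that travel with the execution dependencies, and a parallelization rule can later raise it.

package main

import "fmt"

// Illustrative sketch only; these names approximate, but are not,
// the actual Flux types touched by this change.
type ExecutionOptions struct {
	ConcurrencyLimit   int   // seeded from the feature flag before planning
	DefaultMemoryLimit int64 // moved here from the planner
}

type ExecutionDependencies struct {
	ExecutionOptions *ExecutionOptions
}

// Before planning: read the limit from the feature flag and stash it
// in the options that live in the execution dependencies.
func newDependencies(flagConcurrencyLimit int) *ExecutionDependencies {
	return &ExecutionDependencies{
		ExecutionOptions: &ExecutionOptions{ConcurrencyLimit: flagConcurrencyLimit},
	}
}

// During planning: a parallelization rule that fans a source out into
// n branches can raise the limit it finds in the dependencies.
func raiseLimitForParallelism(deps *ExecutionDependencies, parallelFactor int) {
	if parallelFactor > deps.ExecutionOptions.ConcurrencyLimit {
		deps.ExecutionOptions.ConcurrencyLimit = parallelFactor
	}
}

func main() {
	deps := newDependencies(4) // 4 stands in for the feature-flag value
	raiseLimitForParallelism(deps, 8)
	fmt.Println(deps.ExecutionOptions.ConcurrencyLimit) // prints 8
}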

Done checklist

  • docs/SPEC.md updated
  • Test cases written

@adrian-thurston adrian-thurston requested a review from a team as a code owner March 16, 2022 00:08
@adrian-thurston adrian-thurston requested review from scbrickley and removed request for a team March 16, 2022 00:08
@adrian-thurston adrian-thurston marked this pull request as draft March 16, 2022 00:09
@adrian-thurston adrian-thurston removed the request for review from scbrickley March 16, 2022 00:09
@wolffcm wolffcm left a comment

Interesting, it seems reasonable to me.

I am curious if @nathanielc has thoughts on this approach.

if concurrencyQuota > int(execOptions.ConcurrencyLimit) {
	concurrencyQuota = int(execOptions.ConcurrencyLimit)
} else if concurrencyQuota == 0 {
	concurrencyQuota = 1
}

Curious about this branch here. If we had a trivial query whose execution graph just had a ReadWindowAggregate node, I guess concurrencyQuota could be 0. Do we require it to be positive even if there are no non-sources?

Contributor Author

I might be mistaken, but my guess is there needs to be at least one goroutine to work the consecutive transport belonging to the source nodes. The source nodes themselves, I think, just deposit messages to the outgoing dataset, and that's as far as the source goroutines take it. There needs to be a dispatcher thread that reads those messages and writes them to the CSV writer.
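As a loose analogy in plain Go (this is not the actual Flux transport code, just an illustration of why the quota can never be zero): the sources only deposit results, so at least one downstream goroutine has to drain them.

package main

import (
	"fmt"
	"sync"
)

func main() {
	messages := make(chan string, 16)
	var wg sync.WaitGroup

	// "Source" goroutines only deposit messages; they don't carry the
	// results any further downstream.
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			messages <- fmt.Sprintf("result from source %d", id)
		}(i)
	}
	go func() {
		wg.Wait()
		close(messages)
	}()

	// At least one dispatcher goroutine (here, main) must read those
	// messages and hand them to the writer, hence a quota of >= 1.
	for msg := range messages {
		fmt.Println(msg) // stands in for writing to the CSV writer
	}
}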

Contributor

@onelson onelson Mar 16, 2022

Anecdotally, during recent refactors when I failed to mark any nodes as roots (my mistake), the concurrency quota was set to zero, which produced an error. It didn't seem like there was any conditional around that check, since it failed for from/range/filter, which I think would have been rewritten as a single source.

flux/execute/executor.go

Lines 80 to 85 in dc08c57

func validatePlan(p *plan.Spec) error {
	if p.Resources.ConcurrencyQuota == 0 {
		return errors.New(codes.Invalid, "plan must have a non-zero concurrency quota")
	}
	return nil
}

Right, makes sense.

Given Go's convention of having a useful zero value, I wonder if we should change the meaning of concurrency quota to be the number of additional goroutines after the required one. Or maybe 0 should just mean the default of 1.

Nothing needs to change here for this PR; it just seems a little weird.
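For instance (hypothetical only, not what this PR implements), the executor could resolve the quota like this, so the zero value simply falls back to one:

// Hypothetical reading of the zero-value idea; the PR as written still
// requires a non-zero quota and rejects 0 in validatePlan.
func resolveConcurrencyQuota(quota, limit int) int {
	if quota == 0 {
		return 1 // zero value means "the default of 1"
	}
	if quota > limit {
		return limit
	}
	return quota
}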

@adrian-thurston adrian-thurston self-assigned this Mar 16, 2022
@adrian-thurston adrian-thurston changed the title from "wip: take the concurrencyLimit from FF and keep in dependencies" to "feat: take the concurrencyLimit from feature flags and keep in dependencies" Mar 17, 2022
@adrian-thurston adrian-thurston force-pushed the feat/modifyable-concurrency-limit branch 5 times, most recently from 15c350b to 2b61cb2 on March 18, 2022 01:25
@adrian-thurston adrian-thurston force-pushed the feat/modifyable-concurrency-limit branch from 2b61cb2 to 9a4aad1 on March 18, 2022 02:12
@adrian-thurston adrian-thurston marked this pull request as ready for review March 18, 2022 02:12
@wolffcm wolffcm left a comment

Looks good to me.

Contributor

@onelson onelson left a comment

LGTM. I agree with Chris that the handling of that 0 case feels a little clumsy, but I don't really see what we can do about it. 🤷

@adrian-thurston adrian-thurston merged commit db497ca into master Mar 18, 2022
@jacobmarble jacobmarble deleted the feat/modifyable-concurrency-limit branch January 4, 2024 16:50