Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(perf): Unexpanding pipeline without StageContext #3307

Merged
merged 1 commit into from
Nov 19, 2019

Conversation

marchello2000
Copy link
Contributor

When calling the following endpoint: v2/applications/{}/pipelines with expand=false
the TaskController will try "unexpand" the pipelines by removing "context" from all the stages.
However, there is a wrinkle: during removal of the context we want to preserve the group key (if it exists).
In order to check if the key is there, we make a call to context.group. The problem is that the context is actually a
StageContext and the .group is actually a StageContext::get("group").
This particular call traverses the entire execution graph to see if group exists not only in the current stage context
but also in the outputs of any of it's parent stages. Well, if the stage graph is large this is a O(n^2) operation.

The kicker is that the group should never occur in the outputs (and if it does we don't care).
This stage replaces the implicit groovy call to .get with an explicit call to .containsKey which doesn't traverse the stage hierarchy.

The result (on a 300 stage Kayenta canary pipeline):
Before: 17.2s to fetch the last 2 executions
After: 0.2s to fetch the last 2 executions

When calling the following endpoint: `v2/applications/{}/pipelines` with `expand=false`
the `TaskController` will try "unexpand" the pipelines by removing "context" from all the stages.
However, there is a wrinkle: during removal of the context we want to preserve the `group` key (if it exists).
In order to check if the key is there, we make a call to `context.group`. The problem is that the `context` is actually a
`StageContext` and the `.group` is actually a `StageContext::get("group")`.
This particular call traverses the entire execution graph to see if `group` exists not only in the current stage context
but also in the outputs of any of it's parent stages. Well, if the stage graph is large this is a O(n^2) operation.

The kicker is that the `group` should never occur in the outputs (and if it does we don't care).
This stage replaces the implicit groovy call to `.get` with explicit call to `.containsKey` which doesn't traverse the stage hierarchy.

The result (on a 300 stage Kayenta canary pipeline):
Before: 17.2s to fetch the last 2 executions
After:   0.2s to fetch the last 2 executions
@marchello2000
Copy link
Contributor Author

marchello2000 commented Nov 19, 2019

Also see semi-related: #3306

they are related and can go together but I want to merge separately since the other change is a bit more risky

@@ -685,7 +685,7 @@ class TaskController {
pipelineExecutions.each { pipelineExecution ->
clearTriggerStages(pipelineExecution.trigger.other) // remove from the "other" field - that is what Jackson works against
pipelineExecution.getStages().each { stage ->
if (stage.context?.group) {
if (stage.context?.containsKey("group")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expected to get my mind blown and this was even better 👏

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Orca - it aims to please

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it be better to check & change other stage.context.prop type access and change if needed ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 501. also does the same stage.context.manualSkip.. ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it be better to check & change other stage.context.prop type access and change if needed ?

ugh - this is virtually impossible... because we have no typed stage contexts it's impossible to know what a stage expects... but yes, it probably is almost always wrong... but it's a dangerous change.
ironically... this was also caused by same bug (#3224)
also why i introduced https://github.com/spinnaker/orca/blob/master/orca-core/src/main/java/com/netflix/spinnaker/orca/pipeline/model/StageContext.java#L74...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For manualSkip I am not actually sure what the intention is (but 80% we shouldn't be looking upstream)
But I would rather not change it because, historically, that means we break someone. While this change is only used in the retrieval codepath

@marchello2000 marchello2000 added the ready to merge Approved and ready for merge label Nov 19, 2019
@mergify mergify bot merged commit 8c311b8 into spinnaker:master Nov 19, 2019
@mergify mergify bot added the auto merged Merged automatically by a bot label Nov 19, 2019
@marchello2000 marchello2000 deleted the mark/unexpand_with_perf branch November 19, 2019 23:13
sergiopena pushed a commit to sergiopena/orca that referenced this pull request Dec 13, 2019
When calling the following endpoint: `v2/applications/{}/pipelines` with `expand=false`
the `TaskController` will try "unexpand" the pipelines by removing "context" from all the stages.
However, there is a wrinkle: during removal of the context we want to preserve the `group` key (if it exists).
In order to check if the key is there, we make a call to `context.group`. The problem is that the `context` is actually a
`StageContext` and the `.group` is actually a `StageContext::get("group")`.
This particular call traverses the entire execution graph to see if `group` exists not only in the current stage context
but also in the outputs of any of it's parent stages. Well, if the stage graph is large this is a O(n^2) operation.

The kicker is that the `group` should never occur in the outputs (and if it does we don't care).
This stage replaces the implicit groovy call to `.get` with explicit call to `.containsKey` which doesn't traverse the stage hierarchy.

The result (on a 300 stage Kayenta canary pipeline):
Before: 17.2s to fetch the last 2 executions
After:   0.2s to fetch the last 2 executions
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto merged Merged automatically by a bot ready to merge Approved and ready for merge target-release/1.18
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants