Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix conditions to include suceeded on send to helix inner #45432

Merged
merged 2 commits into from
Dec 2, 2020

Conversation

safern
Copy link
Member

@safern safern commented Dec 1, 2020

Since this doesn't include suceeded, this step runs regardless of previous steps result if the condition is true. This leads to harder diagnosis of why send to helix failed or miss leading on what the actual build failure was. i.e: https://dev.azure.com/dnceng/public/_build/results?buildId=901142&view=logs&j=545e4d1e-a16d-5a06-75f8-2e440306c0fa&t=844a804f-fe4a-5f11-cab4-78484b8cc7ab

That build, failed to download artifacts, but at a first sight it seems like send to helix failed, and so reading the logs might be miss leading.

cc: @dotnet/runtime-infrastructure

@Dotnet-GitSync-Bot
Copy link
Collaborator

I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label.

Copy link
Member

@trylek trylek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this!

@safern
Copy link
Member Author

safern commented Dec 1, 2020

/azp run runtime

@azure-pipelines
Copy link

Azure Pipelines failed to run 1 pipeline(s).

@@ -12,26 +12,26 @@ steps:
# TODO: Remove and consolidate this when we move to arcade via init-tools.cmd.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unrelated, what is this about?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this is the original intent of the comment but send-to-helix-step.yml exists in three versions: I created this particular file by extracting the Helix triggering logic from the CoreCLR version of the script in order to unify the Windows and Unix version; two other versions come from Arcade, they reside at:

https://github.com/dotnet/runtime/blob/master/eng/common/templates/steps/perf-send-to-helix.yml
https://github.com/dotnet/runtime/blob/master/eng/common/templates/steps/send-to-helix.yml

There's not much difference among them and they could be easily unified but I believe that avoiding the code duplication between Windows and Unix requires using a subordinate script (like I did in CoreCLR) and my proposal to take the change to Arcade was not met with too much enthusiasm, see here:

dotnet/arcade#6227

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me it seems like we can unify them already? We could use the script extension that we set in xplat-setup and just use single hyphen arguments which are supported, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right that we should be able to unify the Windows and Unix variants in send-to-helix-inner-step.yml. Sadly we cannot use the existing extension switch in xplat-setup though as the Windows variant uses a PowerShell, not a batch script (ps1 instead of cmd) but that's not rocket science to add to xplat-setup as another variant extension. There are also a few additional subtle differences (-ci vs. --ci) hitting the traditional annoyance that there's no easy way to write a conditional expression in YAML scripts and the "step" template doesn't support variable definitions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can just add support for single hyphen arguments in the msbuild.sh script as @ViktorHofer did a while ago for the build.sh script in arcade. This was the exact reason why we added that support, to be able to have commands that work on both Unix and Windows scripts:

opt="$(echo "${1/#--/-}" | awk '{print tolower($0)}')"

@safern
Copy link
Member Author

safern commented Dec 2, 2020

OSX checked coreclr failure is: #42899

@safern safern merged commit 267583e into dotnet:master Dec 2, 2020
@safern safern deleted the FixConditionSendHelixInner branch December 2, 2020 07:17
@DrewScoggins
Copy link
Member

This broke our performance runs. We are now just failing on this step, with no real info. Logs here https://dev.azure.com/dnceng/internal/_build/results?buildId=905878&view=logs&j=7cb0efdd-4e4f-50c9-6fc6-4cc567ebc460&t=be1d8659-89c1-5b20-81e7-f5dbf2b10a3d

@trylek
Copy link
Member

trylek commented Dec 2, 2020

@DrewScoggins - how is that possible? I was under the impression that perf runs use the other script under the Arcade-synchronized folder eng/common,

https://github.com/dotnet/runtime/blob/master/eng/common/templates/steps/perf-send-to-helix.yml

@DrewScoggins
Copy link
Member

In that file we use send-to-helix-inner-step.yml,

- template: /eng/pipelines/common/templates/runtimes/send-to-helix-inner-step.yml
.

So this change still effects us.

@trylek
Copy link
Member

trylek commented Dec 2, 2020

Hmm, that is surprising, I thought that, as my previous change was never accepted to Arcade, it would be overwritten by the next Arcade update - has it not happened in the last 2 months or so? I was under the impression that we're receiving Arcade updates on a regular basis, if it's not the case I'll send out a PR for selective rollback of my earlier cleanup, it doesn't make sense keeping a different version of the script in the runtime repo than what's in Arcade.

@safern
Copy link
Member Author

safern commented Dec 2, 2020

Well, the previous step is failing, why not fix that previous step or remove it if you don't need it at all? This step shouldn't run if previous steps are failing...

@DrewScoggins
Copy link
Member

Well, the previous step is failing, why not fix that previous step or remove it if you don't need it at all? This step shouldn't run if previous steps are failing...

I'm confused, the only step I see failing is the restore blob feeds task. How this worked before this change is that we were not even running this step, as the conditions were false for our runs. See here, https://dev.azure.com/dnceng/internal/_build/results?buildId=905436&view=logs&j=7cb0efdd-4e4f-50c9-6fc6-4cc567ebc460

Hmm, that is surprising, I thought that, as my previous change was never accepted to Arcade, it would be overwritten by the next Arcade update - has it not happened in the last 2 months or so? I was under the impression that we're receiving Arcade updates on a regular basis, if it's not the case I'll send out a PR for selective rollback of my earlier cleanup, it doesn't make sense keeping a different version of the script in the runtime repo than what's in Arcade.

It looks like @ooooolivia made this change about three months ago, https://github.com/dotnet/arcade/pull/6195/files

@trylek
Copy link
Member

trylek commented Dec 2, 2020

Hmm, so we ended up in an inconsistent state because the two Arcade scripts now reference the inner runtime-only script. I think it's probably reasonable to first figure out how to make the inner step work for both flavors of runs (functional / perf). I can then work with Santi on executing the cleanup he suggested on this PR thread (unifying the Windows and Unix code path) and, in doing so, put the unified & simplified code back into the three scripts and remove the inner script. Before that happens, the simplified scripts need to be merged into the Arcade repo otherwise a build break in the perf runs will occur upon the next Arcade update.

@safern
Copy link
Member Author

safern commented Dec 2, 2020

Sorry about this. I see what the break was. I'll send a PR quickly to fix it.

@safern
Copy link
Member Author

safern commented Dec 2, 2020

#45502

@ghost ghost locked as resolved and limited conversation to collaborators Jan 1, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants