-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix conditions to include suceeded on send to helix inner #45432
Conversation
I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing this!
/azp run runtime |
Azure Pipelines failed to run 1 pipeline(s). |
@@ -12,26 +12,26 @@ steps: | |||
# TODO: Remove and consolidate this when we move to arcade via init-tools.cmd. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unrelated, what is this about?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this is the original intent of the comment but send-to-helix-step.yml exists in three versions: I created this particular file by extracting the Helix triggering logic from the CoreCLR version of the script in order to unify the Windows and Unix version; two other versions come from Arcade, they reside at:
https://github.com/dotnet/runtime/blob/master/eng/common/templates/steps/perf-send-to-helix.yml
https://github.com/dotnet/runtime/blob/master/eng/common/templates/steps/send-to-helix.yml
There's not much difference among them and they could be easily unified but I believe that avoiding the code duplication between Windows and Unix requires using a subordinate script (like I did in CoreCLR) and my proposal to take the change to Arcade was not met with too much enthusiasm, see here:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me it seems like we can unify them already? We could use the script extension that we set in xplat-setup
and just use single hyphen arguments which are supported, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're right that we should be able to unify the Windows and Unix variants in send-to-helix-inner-step.yml
. Sadly we cannot use the existing extension switch in xplat-setup though as the Windows variant uses a PowerShell, not a batch script (ps1 instead of cmd) but that's not rocket science to add to xplat-setup as another variant extension. There are also a few additional subtle differences (-ci vs. --ci) hitting the traditional annoyance that there's no easy way to write a conditional expression in YAML scripts and the "step" template doesn't support variable definitions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can just add support for single hyphen arguments in the msbuild.sh
script as @ViktorHofer did a while ago for the build.sh
script in arcade. This was the exact reason why we added that support, to be able to have commands that work on both Unix and Windows scripts:
Line 84 in 8bc9834
opt="$(echo "${1/#--/-}" | awk '{print tolower($0)}')" |
OSX checked coreclr failure is: #42899 |
This broke our performance runs. We are now just failing on this step, with no real info. Logs here https://dev.azure.com/dnceng/internal/_build/results?buildId=905878&view=logs&j=7cb0efdd-4e4f-50c9-6fc6-4cc567ebc460&t=be1d8659-89c1-5b20-81e7-f5dbf2b10a3d |
@DrewScoggins - how is that possible? I was under the impression that perf runs use the other script under the Arcade-synchronized folder eng/common, |
In that file we use send-to-helix-inner-step.yml,
So this change still effects us. |
Hmm, that is surprising, I thought that, as my previous change was never accepted to Arcade, it would be overwritten by the next Arcade update - has it not happened in the last 2 months or so? I was under the impression that we're receiving Arcade updates on a regular basis, if it's not the case I'll send out a PR for selective rollback of my earlier cleanup, it doesn't make sense keeping a different version of the script in the runtime repo than what's in Arcade. |
Well, the previous step is failing, why not fix that previous step or remove it if you don't need it at all? This step shouldn't run if previous steps are failing... |
I'm confused, the only step I see failing is the restore blob feeds task. How this worked before this change is that we were not even running this step, as the conditions were false for our runs. See here, https://dev.azure.com/dnceng/internal/_build/results?buildId=905436&view=logs&j=7cb0efdd-4e4f-50c9-6fc6-4cc567ebc460
It looks like @ooooolivia made this change about three months ago, https://github.com/dotnet/arcade/pull/6195/files |
Hmm, so we ended up in an inconsistent state because the two Arcade scripts now reference the inner runtime-only script. I think it's probably reasonable to first figure out how to make the inner step work for both flavors of runs (functional / perf). I can then work with Santi on executing the cleanup he suggested on this PR thread (unifying the Windows and Unix code path) and, in doing so, put the unified & simplified code back into the three scripts and remove the inner script. Before that happens, the simplified scripts need to be merged into the Arcade repo otherwise a build break in the perf runs will occur upon the next Arcade update. |
Sorry about this. I see what the break was. I'll send a PR quickly to fix it. |
Since this doesn't include suceeded, this step runs regardless of previous steps result if the condition is true. This leads to harder diagnosis of why send to helix failed or miss leading on what the actual build failure was. i.e: https://dev.azure.com/dnceng/public/_build/results?buildId=901142&view=logs&j=545e4d1e-a16d-5a06-75f8-2e440306c0fa&t=844a804f-fe4a-5f11-cab4-78484b8cc7ab
That build, failed to download artifacts, but at a first sight it seems like send to helix failed, and so reading the logs might be miss leading.
cc: @dotnet/runtime-infrastructure