SDL timeouts on build #5219
Comments
/cc @sunandabalu
Right now ASP.NET Core is not using any binary artifact checks, but the SDL checks still download all artifacts produced. This took half an hour. The easiest fix is probably to disable asset download using the changes that were merged in #5158, which would save a lot of time (download and extraction of all NuGet packages and zips).
We're also seeing this. I'm escalating it to blocking, and 3 of our last 4 internal builds have failed due to SDL validation issues.
To unblock, it's probably best to turn off SDL until the root cause can be determined.
@markwilkie would setting https://github.com/dotnet/aspnetcore/blob/master/.azure/pipelines/ci.yml#L819 to |
Yes, it should.
Yes, that should turn off SDL runs.
Do we have an expectation of throughput from the SDL team for these phases? Even stronger, do we have a commitment from them that these phases can execute in a specific period of time? At this point SDL is tied into our official builds, which means it factors into our two-hour build time commitment. That means these phases need to be extremely fast for us to hit our goals: minutes at most. Anything approaching an hour will cause us to miss our build times. One item I think we should consider is moving this out to a separate build definition. It can run in parallel with official builds.
Based on the discussion here: dotnet/arcade#5219 Re-enabling tracked by #20690
I agree that we're going to have to address this, one way or another. We're already pulling all non-source SDL work out to post-build, but the rest remains. The decision we'll need to make is between a forced build break (so we don't build up debt) and build time and reliability.
Not sure what you're saying here. Can you elaborate?
It could be that SDL is not as fast as we'd like, and if we moved it out of the build, we'd build up (some) debt simply because we're human.
Looking at the last few failures in SDL, it seems to be consistently failing with #5220. While we fix this, you can turn off downloading and extracting of artifacts as @hoyosjs mentioned above. The timeout mentioned in this issue seems transient and odd: the execute SDL step usually takes ~19 minutes (to run both of the configured tools), but it was stuck in PoliCheck for 15 minutes.
This build: https://dev.azure.com/dnceng/internal/_build/results?buildId=591024&view=logs&j=7d9eef18-6720-5c1f-4d30-89d7b76728e9&t=34312c87-f0f7-51d1-6eae-738ee1c68839 took 32 minutes to download 8.2 GB. This smells like a throttling issue. That being said, 8 GB is a massive amount of artifacts to download. There is no way the SDL step needs binlogs or symbol packages.
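As a rough sanity check on the numbers in this thread, the sketch below works out the average download rate and how much of the one-hour timeout (reported in the issue description) is left over. The constants come from the comments here; the function names are made up for illustration, artifact extraction time is not counted, and it assumes the PoliCheck stall comes on top of the usual tool time.

```python
# Figures quoted in this thread; everything else is an assumption for illustration.
DOWNLOAD_GB = 8.2        # artifacts downloaded by the SDL step
DOWNLOAD_MINUTES = 32    # observed download time
SDL_TOOLS_MINUTES = 19   # usual time to run both configured tools
STALL_MINUTES = 15       # time the job sat in PoliCheck (assumed to be extra)
TIMEOUT_MINUTES = 60     # timeout reported in the issue description


def throughput_mb_per_s(gigabytes: float, minutes: float) -> float:
    """Average download rate in MB/s, treating 1 GB as 1000 MB."""
    return gigabytes * 1000 / (minutes * 60)


def headroom_minutes(*phases: float, timeout: float = TIMEOUT_MINUTES) -> float:
    """Minutes left before the timeout; a negative value means the job timed out."""
    return timeout - sum(phases)


if __name__ == "__main__":
    rate = throughput_mb_per_s(DOWNLOAD_GB, DOWNLOAD_MINUTES)
    print(f"average download rate: {rate:.1f} MB/s")
    print("headroom, normal run:", headroom_minutes(DOWNLOAD_MINUTES, SDL_TOOLS_MINUTES))
    print("headroom, with stall:",
          headroom_minutes(DOWNLOAD_MINUTES, SDL_TOOLS_MINUTES, STALL_MINUTES))
```

With these figures the download averages roughly 4.3 MB/s, a normal run leaves about 9 minutes of headroom, and a 15-minute stall overshoots the timeout by about 6 minutes.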
Agreed, hence the feature to turn off artifact download if needed was added.
So, the fix is to merge this PR? dotnet/aspnetcore#20691
No, that PR stops aspnetcore from running SDL at all until this is understood.
Ahh, okay. Then we just need to update to latest arcade and enable this feature.
Yes, that will turn off downloads and remove this pain, but we do need to address #5220 for those who want to keep downloading and extracting.
#5220 won't fix this problem for repos that need to keep downloading the artifacts. It will still take a large amount of time to download if Azure DevOps is throttling us. If we can't stop the throttling, then the SDL step needs a higher timeout.
There's a parameter to filter what gets downloaded. For the most part, a lot of the artifacts are unnecessary (think packages and blobs). Test assets and the like seem unnecessary too.
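To illustrate the idea of filtering, here is a minimal sketch of selecting artifacts by name before downloading them. This is not the arcade parameter mentioned above: the function name, the exclusion patterns, and the sample artifact names are all hypothetical, chosen only to mirror the kinds of artifacts this thread calls unnecessary.

```python
from fnmatch import fnmatchcase

# Hypothetical exclusion patterns; the thread only says that binlogs,
# symbol packages, and test assets are probably not needed by SDL.
EXCLUDE_PATTERNS = ["*.binlog", "*.symbols.nupkg", "*TestAssets*"]


def select_artifacts(names, exclude_patterns=EXCLUDE_PATTERNS):
    """Return the artifact names that match none of the exclusion patterns."""
    return [
        name
        for name in names
        if not any(fnmatchcase(name, pattern) for pattern in exclude_patterns)
    ]


if __name__ == "__main__":
    # Made-up artifact names, for illustration only.
    candidates = ["build.binlog", "Foo.symbols.nupkg", "TestAssets.zip", "sources.zip"]
    print(select_artifacts(candidates))
```

Matching is case-sensitive here (`fnmatchcase`) so the result does not depend on the operating system running the script.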
Yes, we do need to increase the timeout too.
Remember - to unblock, please turn off SDL.
We're trying, but running out of disk space on our Ubuntu builds 😢
Can you point me to one of the builds? I want to see whether this out-of-disk failure is related to our two other out-of-disk-space issues:
See also dotnet/aspnetcore#20704
If there is a chance your tests go over 10 GB, that could very easily cause issues. The Azure DevOps hosted pools are only guaranteed 10 GB. If you need more, this job should be switched to one of our managed pools.
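A job can check its own disk budget up front. The sketch below compares an estimated requirement against the 10 GB guarantee mentioned above and reports the free space actually available; the helper names are hypothetical, and reading "10 GB" as 10 GiB is an assumption.

```python
import shutil

# The 10 GB guarantee mentioned above, read here as 10 GiB (an assumption).
GUARANTEED_BYTES = 10 * 1024**3


def fits_in_guarantee(required_bytes: int, guaranteed_bytes: int = GUARANTEED_BYTES) -> bool:
    """True if the estimated requirement stays within the guaranteed disk space."""
    return required_bytes <= guaranteed_bytes


def free_bytes(path: str = ".") -> int:
    """Free space, in bytes, on the volume that contains `path`."""
    return shutil.disk_usage(path).free


if __name__ == "__main__":
    estimate = 12 * 1024**3  # made-up estimate of what the job needs
    print("fits in guarantee:", fits_in_guarantee(estimate))
    print("free right now (GiB):", free_bytes() / 1024**3)
```

Checking against the guarantee rather than the current free space is deliberate: a hosted agent may have more room on a given run, but only the guaranteed amount can be relied on.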
We're trying to determine that in dotnet/aspnetcore#20704.
@Pilchie is this still critical? From what I can tell the builds are unblocked, and the fix for this is just upgrading the SDK.
We are unblocked because we disabled the job in our builds.
BTW - we're planning on moving SDL out of the build entirely and to a promotion ring. cc/ @jcagme
Build is unblocked. Work is in progress to improve the SDL steps.
It seems like the timeout limit for the SDL step during build is set to 1 hour. We have been hitting this timeout limit: https://dev.azure.com/dnceng/internal/_build/results?buildId=591024&view=logs&j=7d9eef18-6720-5c1f-4d30-89d7b76728e9&t=f511b583-5060-5810-7549-865816347c8e. Is there an issue with the SDL tool or should we increase the limit?