HTTP 500s from AzDO Test Results API #10358
Comments
Should we turn this into a Known Issue?
I'd say only if we can make those grep through the Helix (non-console) logs; otherwise it's indistinguishable from a crash or other non-test-failure failure.
Which should be rolling out this week!
* Fix versioning errors in workloads * Disable TRX tests while reporting to AZDO is broken (#10358) (#10380) Co-authored-by: Matt Galbraith <[email protected]>
* Refactoring workload build tasks (#8645)
* Refactoring workload build tasks
* Fix source build and some random cleanup
* Updating tests, code cleanup
* Minor fixes, unit test conversion
* Mark tests as Windows only, fix missing content for Helix
* Hide WiX and test packages from Solution Explorer
* Fix duplicate publish items
* Fix link target for helix
* Fix link metadata for WiX
* Pass ICE suppressions to Light, more cleanup
* Fix file extraction for packs, add unit test for template pack MSI
* Pass ICE suppressions to Light (#9061)
* Create workload pack group installers (#9514)
* Remove duplicate PackageReference
* Create MSIs for workoad pack groups
* Build NuGet wrapper packages for workload pack group MSIs
* Generate WorkloadPackGroups.json in manifest MSIs
* Add swix authoring for workload pack groups
* De-duplicate workload pack group creation
* Put braces around ProductCode and UpgradeCode registry values
* Write registry keys for pack groups
* Fix swix dependencies for pack groups
* Use correct GUID format when setting candle variables
* Add test for creating pack group dependency in SWR file
* Support building with missing workload packs (#9628)
* Support building with missing workload packs
* Include extracted manifest files in manifest MSI payload nupkg
* Fix versioning errors in workloads (#10363)
* Fix versioning errors in workloads
* Disable TRX tests while reporting to AZDO is broken (#10358) (#10380) Co-authored-by: Matt Galbraith <[email protected]>
* clean up, api changes

Co-authored-by: Daniel Plaisted <[email protected]> Co-authored-by: Matt Galbraith <[email protected]>
No updates on the IcM; the problem continues to occur sporadically for scenarios outside of the Arcade artificial TRX scenario.
This problem started Aug 9 and has been happening 3,000 times a day. Hopefully we can get some traction there. Given how "big" PRs are, even a small incidence of this bubbles up into a lot of failed PRs.
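To put rough numbers on why a small per-job incidence turns into many failed PRs, here is a minimal sketch. It assumes each job in a PR hits the HTTP 500 independently with the same probability, which is a simplification; the rates and job counts below are hypothetical placeholders, not measurements from this issue, and `pr_failure_probability` is a name made up for illustration.

```python
def pr_failure_probability(per_job_failure_rate: float, jobs_per_pr: int) -> float:
    """Chance that at least one of `jobs_per_pr` jobs hits the error.

    Assumes jobs fail independently with the same rate (a simplification).
    """
    return 1.0 - (1.0 - per_job_failure_rate) ** jobs_per_pr


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the shape of the effect.
    for rate in (0.001, 0.005):
        for jobs in (50, 200):
            chance = pr_failure_probability(rate, jobs)
            print(f"rate={rate:.3f} jobs={jobs:3d} -> PR failure chance {chance:.1%}")
```

Under these assumptions the PR-level failure chance grows quickly with the number of jobs, which matches the point above about "big" PRs.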
Here's a chart showing how many jobs are impacted: [chart: Jobs impacted]. It's nearly 300 builds a day; this is unacceptable and needs to be elevated.
Update the description to reflect the current severity.
Shouldn't this be a sev2? cc/ @Chrisboh. Is it possible to have a known issue tracking the builds affected? cc/ @ulisesh. Thanks a ton @ChadNedzlek for figuring out the impact here.
Yeah, Chad got us the data last night to confirm this is sev 2, and Stu is raising that now and getting on the bridge.
Unfortunately, the error happens in the Helix client, and test known issues is designed to identify problems in the tests.
The team has evidence that the root cause is related to an incomplete fix for the problems described in #9865. They are rolling back the fix, which should resolve this issue but, unfortunately, bring back the original one. They will continue to treat this as Sev 2.
The rollback was successful; no hits on this over the weekend.
This came back on 9/1/2022 and we didn't notice. Reopening (@Chrisboh for visibility)
It's back in dnceng-public so they asked me to file a new IcM as they claim the root cause is different (we can't tell; we get 500s). Filed https://portal.microsofticm.com/imp/v3/incidents/details/335170304/home to track this
I think I see the actual issue; created #10916 to track this.
Chad is pursuing #10916; closing this one in favor of that as it's a new variation.
Tracking IcM: https://portal.microsofticm.com/imp/v3/incidents/details/326663396/home
This seems to be impacting approximately 1 in 3 builds:
Query