Add retry support #33230
/azp run
Azure Pipelines successfully started running 2 pipeline(s).
Failures seen so far (5 runs): BlazorWasmStandalonePwaTemplate_Works failed on macOS with 3 retries; 2 of the failures were CERT changed errors and one was a 30-second timeout.
/azp run
Azure Pipelines successfully started running 2 pipeline(s).
Summary of this PR:
/azp run
Azure Pipelines successfully started running 2 pipeline(s).
Works for me if this is what the team wants. My main concern is 3 retries may paper over problems getting twice as bad e.g. going from 1 failure in 3 tries to 2 failures in 3 tries without anyone noticing.
I'm also thinking the whole GetOr part of ProjectFactoryFixture.GetOrCreateProject(...) is a Bad Idea:tm:
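To put rough numbers on the masking concern, here is a small sketch. It assumes each attempt fails independently with the same probability, which real flaky tests may not satisfy, and `RetryMath` is a hypothetical helper name, not something in this PR.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class RetryMath
{
    // Probability that every one of `attempts` tries fails, assuming each try
    // fails independently with probability `perAttemptFailure`.
    public static double AllAttemptsFail(double perAttemptFailure, int attempts)
    {
        if (perAttemptFailure < 0.0 || perAttemptFailure > 1.0)
            throw new ArgumentOutOfRangeException(nameof(perAttemptFailure));
        if (attempts < 1)
            throw new ArgumentOutOfRangeException(nameof(attempts));

        return Math.Pow(perAttemptFailure, attempts);
    }
}
```

Under that assumption, a test that fails 1 time in 3 is reported red about 3.7% of the time with 3 tries (1/27), and one that fails 2 times in 3 about 29.6% of the time (8/27), so the reported rate understates the per-attempt rate in both cases.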
    if (_retryFailsUntil3 != 2) throw new Exception("NOOOOOOOO");
}

private static int _canOverrideRetries = 0;
nit: Move to the top of the class
I departed from the usual pattern here because these are doing a bad thing in that each test has its own static int, but I felt that was cleaner than adding some kind of internal context that we can look at since this is just test code. So the grouping is really a static field per test, it would be really bad if the tests mucked with another one of these fields.
src/ProjectTemplates/BlazorTemplates.Tests/BlazorServerTemplateTest.cs
src/ProjectTemplates/BlazorTemplates.Tests/BlazorWasmTemplateTest.cs
@@ -102,6 +102,21 @@ public class Project : IDisposable
try
{
    Output.WriteLine("Acquired DotNetNewLock");

    if (Directory.Exists(TemplateOutputDir))
Isn't TemplateOutputDir already very unique❔ If Path.GetRandomFileName() isn't unique enough, use Guid.NewGuid() in ProjectFactoryFixture.GetOrCreateProject(...). If instead this is now needed because the retries reuse existing projects, we may want to completely remove that aspect of the fixture, i.e. switch to ProjectFactoryFixture.CreateProject(...).
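The two options in this comment can be sketched as follows. This is only an illustration: `TemplateDirs`, `CreateUniqueOutputDir`, and `ResetOutputDir` are hypothetical names, not the fixture's real API, and the real fixture may lay out directories differently.

```csharp
using System;
using System.IO;

// Hypothetical helpers for illustration; not the actual ProjectFactoryFixture code.
public static class TemplateDirs
{
    // Option 1: never reuse a directory. A GUID suffix makes each path unique.
    public static string CreateUniqueOutputDir(string baseDir, string projectKey)
    {
        var path = Path.Combine(baseDir, $"{projectKey}_{Guid.NewGuid():N}");
        Directory.CreateDirectory(path);
        return path;
    }

    // Option 2: reuse the path across retries, but wipe leftovers from the
    // previous attempt before generating the template again.
    public static void ResetOutputDir(string path)
    {
        if (Directory.Exists(path))
        {
            Directory.Delete(path, recursive: true);
        }
        Directory.CreateDirectory(path);
    }
}
```

Option 1 avoids any shared state between attempts at the cost of more disk usage; option 2 keeps the path stable, which matters only if something else refers to it.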
I can file a separate issue to track changing that if you want, but I don't really want to do additional major surgery (prefer making this PR just about adding/using retries)
There are additional complications with moving away from reusing projects: our longer-term hope was to have one test create and build the template project, and a second (ordered) test run and verify it. We were hoping that would also reduce the flakiness, or at least help us isolate which part of the test is flaky. Unfortunately we haven't gotten to that point yet, but that was the longer-term goal.
{
    var attempts = 0;
    var timeTaken = 0.0M;
    for (attempts = 0; attempts < retryAttribute.MaxRetries; attempts++)
Please remind me: Will this loop retry an entire [Theory] even if only one data set hits a problem❔
This is per test case/fact (at least that's the intent). I'm pretty certain of that, since I was looking at a different extensibility point in an earlier iteration of implementing this.
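For readers following along, a per-test-case retry can be sketched like this. It is a simplified stand-in rather than the PR's runner: `RetryRunner` and `RunWithRetries` are hypothetical names, and the real implementation hooks into xunit instead of taking a delegate.

```csharp
using System;

// Hypothetical helper for illustration; the PR's actual runner plugs into xunit.
public static class RetryRunner
{
    // Runs one test case body up to maxRetries times and returns the number
    // of attempts used. Only the failing case is rerun, not its siblings.
    public static int RunWithRetries(Action testBody, int maxRetries)
    {
        if (testBody == null) throw new ArgumentNullException(nameof(testBody));
        if (maxRetries < 1) throw new ArgumentOutOfRangeException(nameof(maxRetries));

        for (var attempt = 1; ; attempt++)
        {
            try
            {
                testBody();
                return attempt;
            }
            catch (Exception) when (attempt < maxRetries)
            {
                // Swallow the failure and try again; the exception from the
                // final attempt is not caught here and reaches the caller.
            }
        }
    }
}
```

Because the loop wraps a single test body, applying it per data row of a [Theory] would rerun only the row that failed.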
@markwilkie @garath @ChadNedzlek @missymessa does this duplicate near-term work on Dev WF❔
When the helix retry stuff is ready, we should definitely switch to just using the schema/rules as it is much more powerful, but this is basically a way for us to do a simple retry at an xunit level (this is akin to local reruns only)
This is 100% duplicating work that dev WF should have a solution for in the next few weeks. I'd really rather we not invent a second mechanism for doing the same thing.
Is there a reason you don't want to use the retry mechanism arcade is going to be providing in a couple weeks? It's very unfortunate to have all this duplicated work already.
@ChadNedzlek this won't prevent us from using the arcade work at all, but this simple retry will work for our test jobs that aren't on helix. It wasn't clear to me whether the test-configuration.json stuff is going to be helix only. If so, we might still benefit from retry for some of our tests that aren't on helix (components, for example). But most importantly, right now 100% of our blazor template tests are in quarantine, and it's been that way for a few previews already. Given that this is a pretty cheap fix, I think it's worth temporarily using this to get some of our template tests back online (and switching to the test-configuration.json stuff once it's available).
It seems like an offline discussion about your needs around retry stuff would be fruitful.
Yea, outside of this PR it'd be really great to understand the delta (if any) between this retry stuff and what's being implemented as part of dev wf. Sounds like tests run outside of Helix might be one, but it'd be awesome to understand that better. Is this something you think would be useful to do @HaoK ?
@markwilkie @ChadNedzlek I don't think there are actually any gaps in the current retry plan. This PR is more of a point-in-time thing to try to get some blazor templates coverage back up as soon as possible; the dev wf plan currently proposed seems like a superset (on helix) and looks great. Our goal is to eventually move all our tests onto helix, and the xunit extensibility retry in this PR seems like it would map exactly onto using local reruns, so we could just turn this off once the dev wf retry stuff is available to try out.
Thanks @HaoK ! This makes sense. Let's be sure to continue working together, as we too would love to see more tests run in a consistent fashion across .NET. Cheers
/// Runs a test multiple times when it fails
/// This can be used on an assembly, class, or method name. Requires using the AspNetCore test framework.
/// </summary>
[EditorBrowsable(EditorBrowsableState.Never)]
Is this because this ships in a user visible package? Making it Never makes it really hard to type this in regular code and gives some people PTSD (@rynowak).
I just copied
[EditorBrowsable(EditorBrowsableState.Never)]
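As a rough illustration of the attribute shape described by the doc comment under review (usable on an assembly, class, or method, and exposing `MaxRetries`), a minimal sketch might look like the following. `SketchRetryAttribute` is a hypothetical name and the default of 3 is an assumption taken from the retry counts mentioned in this thread; it leaves off `[EditorBrowsable]` so the type stays visible to IntelliSense.

```csharp
using System;

// Hypothetical sketch; not the attribute added by this PR.
[AttributeUsage(
    AttributeTargets.Assembly | AttributeTargets.Class | AttributeTargets.Method,
    AllowMultiple = false)]
public sealed class SketchRetryAttribute : Attribute
{
    // Default of 3 is an assumption for illustration.
    public SketchRetryAttribute(int maxRetries = 3)
    {
        if (maxRetries < 1) throw new ArgumentOutOfRangeException(nameof(maxRetries));
        MaxRetries = maxRetries;
    }

    // Total number of attempts a runner should allow for the decorated test.
    public int MaxRetries { get; }
}
```

A runner would read `MaxRetries` from the most specific attribute it finds (method, then class, then assembly) and loop accordingly.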
Co-authored-by: Pranav K <[email protected]>
See if retries help with flaky template tests
Part of #30882