Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix flakyness of DnsGetHostAddresses_PostCancelledToken_Throws #70044

Merged
merged 2 commits into from
Jun 1, 2022

Conversation

rzikm
Copy link
Member

@rzikm rzikm commented May 31, 2022

Ignore cases when the call returns (throws/fails) before we manage to cancel it.

Fixes #69993.

@ghost
Copy link

ghost commented May 31, 2022

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

Ignore cases when the call returns (throws/fails) before we manage to cancel it.

Fixes #69993.

Author: rzikm
Assignees: -
Labels:

area-System.Net

Milestone: -

@rzikm rzikm merged commit a466063 into dotnet:main Jun 1, 2022
Comment on lines +193 to +194
if (ex is OperationCanceledException oce)
{
Copy link
Member

@antonfirsov antonfirsov Jun 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit late to the party, but the test will now no longer fail if the implementation lacks actual cancellation support, so it became mostly useless.

In theory, [Collection(nameof(DisableParallelization))] was here to somewhat mitigate the race. I wonder if there is something specific about the OS-es/machines/queues where this is failing?

I see Ubuntu failures in kusto, which doesn't make sense to me, since the test is marked with [ActiveIssue(***, TestPlatforms.AnyUnix)] which AFAIK should disable the test on Linux.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have re-enabled the tests in #70009, since the issue mentioned was already closed (and the feature presumably implemented)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if there is something specific about the OS-es/machines/queues where this is failing?

It is failing in jitstress, which may increase the chance of the cancelling thread losing the race.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but the test will now no longer fail if the implementation lacks actual cancellation support, so it became mostly useless.

I have missed that :/, do you have any suggestions how to bring it back without the flakyness?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is failing in jitstress, which may increase the chance of the cancelling thread losing the race.

Is there a way to detect if we are in a jitstress run, and skip the test in those cases only? It feels to me like jitstress is fundamentally incompatible with timing-critical networking tests, which we have many in fact.

I have re-enabled the tests in #70009, since the issue mentioned was already closed (and the feature presumably implemented)

#34633 has been reverted, there is still no async name resolution and cancellation support on Linux.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I have fallen to another trap... I will revert that PR then.

We should be able to detect jitstress by looking at envvars, I will take a look

Copy link
Member

@antonfirsov antonfirsov Jun 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was super-confusing. We may consider reopening #33378, but it's very unlikely that we'll address in the near future, so probably it's better to just skip the test on Unix with [PlatformSpecific] + a comment.

rzikm added a commit to rzikm/dotnet-runtime that referenced this pull request Jun 1, 2022
rzikm added a commit that referenced this pull request Jun 2, 2022
* Revert "Reenable GetHostAddressesTest_Cancellation on Unix (#70009)"

This reverts commit e8b9592.

* Revert "Fix flakyness of DnsGetHostAddresses_PostCancelledToken_Throws (#70044)"

This reverts commit a466063.

* Disable flaky tests on jitstress

* fixup! Disable flaky tests on jitstress
@ghost ghost locked as resolved and limited conversation to collaborators Jul 1, 2022
@karelz karelz added this to the 7.0.0 milestone Jul 19, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
4 participants