Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

Provide an (at least temporary) non-global ManagedHandler opt-in mechanism #24173

Merged
merged 3 commits into from
Sep 22, 2017

Conversation

stephentoub
Copy link
Member

Right now the only way to opt-in to using ManagedHandler is via an environment variable. That requires a process-wide setting which makes it impossible to create an individual HttpClient{Handler} that uses ManagedHandler; another thread could be instantiating one at the same time and would pick up the setting. This is why we had to disable parallel execution of our System.Net.Http test suite.

We still haven't yet figured out the final model via which we'll enable usage of ManagedHandler. For now, though, I'm adding a thread-local switch, to allow opting in a single HttpClientHandler construction. This makes it possible to re-enable our test parallelism, which this PR also does (and in the process drops outerloop execution time by 3x on my machine), but it also enables us to make progress on https://github.com/dotnet/corefx/issues/23152, which is blocked by having a global-only setting.

cc: @geoffkizer, @davidsh, @wfurt, @Priya91

Copy link
Contributor

@davidsh davidsh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@wfurt
Copy link
Member

wfurt commented Sep 20, 2017

looks good to me. It would be nice to have some API to enable managed handler via API call.

@stephentoub stephentoub force-pushed the managedhandler_optin branch 4 times, most recently from 41346b1 to 56a6f9e Compare September 20, 2017 21:41
@davidsh
Copy link
Contributor

davidsh commented Sep 20, 2017

I assume you'll run Outerloop on all this before merge?

@stephentoub
Copy link
Member Author

I assume you'll run Outerloop on all this before merge?

Yes. I was waiting for the other legs to pass before doing so.

@stephentoub
Copy link
Member Author

@dotnet-bot test Outerloop Linux x64 Debug Build please
@dotnet-bot test Outerloop Windows x64 Debug Build please
@dotnet-bot test Outerloop OSX x64 Debug Build please

And as a result re-enable parallelism of the test suite, which on my machine reduces the running time of the outerloop tests from 150s to 45s.
@stephentoub
Copy link
Member Author

@dotnet/dnceng, can you help me to understand why the Outerloop Windows leg here is showing as failing?
https://ci3.dot.net/job/dotnet_corefx/job/master/job/windows-TGroup_netcoreapp+CGroup_Debug+AGroup_x64+TestOuter_true_prtest/70/
When I click through to it, the only failures shown are on macOS. What am I missing?

@stephentoub
Copy link
Member Author

The macOS leg is hitting "Too many open files". @dotnet/dnceng, what is the ulimit for open file handles on the macOS machines used in these test runs? This change re-enables parallelism of tests, so it's possible various tests running at the same time are causing too many handles to be opened, but that should really only happen if the limit is set really low (or if there's a bug in the tests that's resulting in handles getting leaked incorrectly).

@mmitche
Copy link
Member

mmitche commented Sep 21, 2017

@stephentoub These are helix machines, so maybe the limits are set lower?
@MattGal May have an opinion on this. I think he was looking to have the tests set up properly to invoke the process with a higher files limit rather than changing the OS globally.

@wfurt
Copy link
Member

wfurt commented Sep 21, 2017

There are at least two limits - file descriptors per process and also system wide limit.
This one seems like the first case.

@MattGal
Copy link
Member

MattGal commented Sep 21, 2017

Please see https://github.com/dotnet/corefx/blob/4454822040671e10db4fc7e5670bbe8bc0342b0c/src/System.Console/tests/Performance/System.Console.Performance.Tests.csproj#L27-29 . This approach allows you to specify an increased limit for a single test, which prevents missing problems by artificially setting this very high system-wide.

@stephentoub
Copy link
Member Author

@MattGal, what is the default on macOS in helix when this isn't done?

Also, any answer to my Windows question?

Thanks.

@MattGal
Copy link
Member

MattGal commented Sep 21, 2017

Disclaimer: Macs are manually set up and may vary slightly
Ulimit -a on several machines from our OSX 10.12 queue shows the following:

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 256
pipe size            (512 bytes, -p) 1
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 709
virtual memory          (kbytes, -v) unlimited

The windows problem is an issue with the Jenkins plugin trying to talk to Helix. No tests actually failed, as you note, but the plugin got an HTTP 502 ("502 - Web server received an invalid response while acting as a gateway or proxy server. There is a problem with the page you are looking for, and it cannot be displayed. When the Web server (while acting as a gateway or proxy) contacted the upstream content server, it received an invalid response from the content server." ) and translated this to some sort of failure.

@ChadNedzlek may have some thoughts on this (from the Helix API perspective) but he's on vacation right this moment. In the mean time we should work to handle this response type (as it does appear to be transient) from the Jenkins plugin.

@danmoseley
Copy link
Member

Hopefully dotnet/coreclr#14054 will fix this at least for .NET Core code.

@stephentoub
Copy link
Member Author

Thanks, @MattGal.

This:

open files                      (-n) 256

is kind of ridiculously low for a runtime/libraries focused heavily on services. We previously saw tons of failures due to this macOS default in Jenkins; I'm surprised we haven't seen more in Helix. I will increase the value for this specific test project in order to unblock my change, but I think we should increase this limit across the board. I don't know of any case where having this low limit has helped us to catch bugs, and I expect we'll just find ourselves getting random failures here and there due to this, at which point we'd just increase the limit in that project, and eventually most projects will have the setting.

@stephentoub
Copy link
Member Author

@dotnet-bot test Outerloop OSX x64 Debug Build please

@stephentoub
Copy link
Member Author

Fixes https://github.com/dotnet/corefx/issues/24190 (which coincidentally was opened after this PR 😄)

@stephentoub stephentoub merged commit 1adc5b2 into dotnet:master Sep 22, 2017
@stephentoub stephentoub deleted the managedhandler_optin branch September 22, 2017 03:15
@MattGal
Copy link
Member

MattGal commented Sep 22, 2017

@stephentoub, as to that value being "ridiculously low for a runtime/libraries focused heavily on services", I totally agree but that's the default... we didn't set it there, we simply took macs and used them. We've done nothing to these machines to change that.

Thus, unless there's something I'm missing this is exactly what newbie OSX developers experience; if the change isn't made inside the test's execution script, then every new mac developer will have to learn this workaround before they can reliably run and pass the tests. It's one of those things like needing to open a range of firewall ports on Fedora; it makes the new-user experience not the greatest. As such, given we already have cheap and easy mechanisms in place to make the tests resilient in this way I strongly prefer this mechanism over a permanent machine change that the end user will have to just figure out via reading forum posts and experimentation. Feel free to open a core-eng issue if you feel otherwise and we can discuss it in the next triage session.

@stephentoub
Copy link
Member Author

I'd be fine if the answer is we enable this for all tests by putting 2bce06e#diff-03ce3895f0647a98768807ca9ec6382eR134 in some top-level targets file. My goal is simply that we enable this for all tests and not just piecemeal do so ad-hoc when tests randomly fail in this project or that.

@karelz karelz added this to the 2.1.0 milestone Oct 11, 2017
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
…r_optin

Provide an (at least temporary) non-global ManagedHandler opt-in mechanism

Commit migrated from dotnet/corefx@1adc5b2
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants