IIS InProcess-hosting not restarting the process after unhandled exception occurred #22507

Closed
Compufreak345 opened this issue Jun 3, 2020 · 8 comments
Labels
area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions

@Compufreak345
Compufreak345 commented Jun 3, 2020

Describe the bug

In .NET Core 2.x with CaptureStartupErrors(false), as well as in the full framework, IIS attempts to start my application again on the next request if it throws an unhandled exception on startup.

In .NET Core 3.1 (hosted InProcess), the application stops responding and does not start again, no matter what I set for CaptureStartupErrors (false should be the default anyway).

With the following boilerplate code, the application restarts as expected:

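public static void Main(string[] args)
{
    try
    {
        CreateWebHostBuilder(args).Build().Run();
    }
    catch (Exception)
    {
        // Terminate the process on a startup failure. With this in place,
        // IIS starts the application again on the next request.
        Environment.Exit(-1);
    }
}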

To Reproduce

If you publish this project to IIS (I used folder publish and converted the folder into an application in IIS), you can see the behavior:
https://github.com/Compufreak345/ExampleNetCore31App

Send a GET request to the published website. You will see an error page and a single log entry "Starting" in log.txt, no matter how many requests you send.

If you uncomment the code in "Program.cs" that catches the exception, you will see multiple "Starting" entries, one for each request (until IIS reaches its retry limit). This is the behavior I expected without this code as well.

Exceptions (if any)

The following error page is displayed:

HTTP Error 500.30 - ANCM In-Process Start Failure

Common solutions to this issue:

  • The application failed to start
  • The application started but then stopped
  • The application started but threw an exception during startup

Troubleshooting steps:

  • Check the system event log for error messages
  • Enable logging the application process' stdout messages
  • Attach a debugger to the application process and inspect

For more information visit: https://go.microsoft.com/fwlink/?LinkID=2028265

Further technical details

  • ASP.NET Core version 3.1
  • Include the output of dotnet --info
 dotnet --info
.NET Core SDK (reflecting any global.json):
 Version:   3.1.300
 Commit:    b2475c1295

Runtime Environment:
 OS Name:     Windows
 OS Version:  10.0.18363
 OS Platform: Windows
 RID:         win10-x64
 Base Path:   C:\Program Files\dotnet\sdk\3.1.300\

Host (useful for support):
  Version: 3.1.4
  Commit:  0c2e69caa6

.NET Core SDKs installed:
  2.1.201 [C:\Program Files\dotnet\sdk]
  2.1.202 [C:\Program Files\dotnet\sdk]
  2.1.302 [C:\Program Files\dotnet\sdk]
  2.1.400 [C:\Program Files\dotnet\sdk]
  2.1.401 [C:\Program Files\dotnet\sdk]
  2.1.402 [C:\Program Files\dotnet\sdk]
  2.1.403 [C:\Program Files\dotnet\sdk]
  2.1.500-preview-009404 [C:\Program Files\dotnet\sdk]
  2.1.500 [C:\Program Files\dotnet\sdk]
  2.1.502 [C:\Program Files\dotnet\sdk]
  2.1.503 [C:\Program Files\dotnet\sdk]
  2.1.504 [C:\Program Files\dotnet\sdk]
  2.1.505 [C:\Program Files\dotnet\sdk]
  2.1.509 [C:\Program Files\dotnet\sdk]
  2.1.511 [C:\Program Files\dotnet\sdk]
  2.1.512 [C:\Program Files\dotnet\sdk]
  2.1.514 [C:\Program Files\dotnet\sdk]
  2.1.602 [C:\Program Files\dotnet\sdk]
  2.1.701 [C:\Program Files\dotnet\sdk]
  2.1.801 [C:\Program Files\dotnet\sdk]
  2.2.101 [C:\Program Files\dotnet\sdk]
  2.2.102 [C:\Program Files\dotnet\sdk]
  2.2.202 [C:\Program Files\dotnet\sdk]
  2.2.301 [C:\Program Files\dotnet\sdk]
  2.2.401 [C:\Program Files\dotnet\sdk]
  3.1.100 [C:\Program Files\dotnet\sdk]
  3.1.300 [C:\Program Files\dotnet\sdk]

.NET Core runtimes installed:
  Microsoft.AspNetCore.All 2.1.2 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.1.4 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.1.5 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.1.6 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.1.7 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.1.8 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.1.9 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.1.12 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.1.13 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.1.15 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.1.16 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.1.18 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.2.0 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.2.1 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.2.3 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.2.6 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.2.8 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.App 2.1.2 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.1.4 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.1.5 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.1.6 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.1.7 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.1.8 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.1.9 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.1.12 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.1.13 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.1.15 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.1.16 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.1.18 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.2.0 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.2.1 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.2.3 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.2.6 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.2.8 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 3.1.0 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 3.1.4 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 2.0.7 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.0.9 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.2 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.3-servicing-26724-03 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.4 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.5 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.6 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.7 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.8 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.9 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.12 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.13 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.15 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.16 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.18 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.2.0 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.2.1 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.2.3 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.2.6 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.2.8 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 3.1.0 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 3.1.4 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.WindowsDesktop.App 3.1.0 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
  Microsoft.WindowsDesktop.App 3.1.4 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]

To install additional .NET Core runtimes or SDKs:
  https://aka.ms/dotnet-download
  • The IDE (VS / VS Code / VS4Mac) you're running on, and its version
    Visual Studio Enterprise 2019 16.6.0
@davidfowl
Member

cc @jkotalik

@jkotalik jkotalik added this to the Next sprint planning milestone Jun 3, 2020
@jkotalik jkotalik self-assigned this Jun 3, 2020
@jkotalik
Contributor

jkotalik commented Jun 5, 2020

I believe this behavior is expected and is a behavior difference between IIS Out-of-process and In-Process. However, we could consider trying to change the behavior for in-process.

In 2.2, it sounds like you were using IIS out-of-process, as you said you saw this behavior with Full Framework. ANCM will constantly try to restart the dotnet process if it crashes.

However, in 3.1/in-process, ANCM will not restart the process if the application crashes on startup. This is for a few reasons:

  • Because we are running in-process, we need to restart the w3wp.exe/iisexpress.exe process entirely, as we can't start the dotnet runtime twice without risking bad behavior.
  • Constantly restarting the w3wp process is usually not a good idea.
  • We made an assumption that if a process throws an unhandled exception on startup, we should require the app to be redeployed before trying again. This may not have been the right decision at the time, but it is a key reason for the behavior change.

AFAIK, the code you have that calls Environment.Exit(-1) may work for a few tries; however, I believe that after a few such exits, IIS will trigger Rapid-Fail Protection, which will prevent the site from starting again.

A few questions:

  • Is there a reason why the app is throwing on startup?
  • If you try catch without doing Environment.Exit(-1), what happens?

@davidfowl
Member

> We made an assumption that if a process throws an unhandled exception on startup, we should require the app to be redeployed before trying again. This may not have been the right decision at the time, but it is a key reason for the behavior change.

I'd like to understand the idea of a transient startup error. It's not something we've come across before, so it would be good to understand what kind of application fails on startup and then continues to work.

@Compufreak345
Author

Thanks for the explanation of the behavior. It was a bit surprising for us to see this change in behavior.

So let me answer your questions:

  • Is there a reason why the app is throwing on startup? (This should answer the question from @davidfowl as well.)
    Yes, we have a StartupFilter that subscribes to a Redis channel on startup, as this is needed for some cache invalidation logic. Our operations team reported that our service failed to start after 11 out of 100 system reboots. The reason is that we get a SocketFailure (ReadSocketError/ConnectionReset) on this initial subscription.
    On our classic .NET services that use the same logic, the problem resolves itself because the application restarts automatically as described. Rapid-Fail Protection is not an issue, as it works on the 2nd or 3rd try.
    I'm quite sure something network-related is not completely initialised after a fresh system boot. This error is hard for our operations team to diagnose; an automatic restart of the application would solve, or at least work around, it for now. Adding retry logic to the connection initialisation would be another option, but if IIS already provides such logic, we don't want to increase the complexity of our code for a feature that's already there.

  • If you try catch without doing Environment.Exit(-1), what happens?
    The same as if I do not catch. The application stops responding and IIS does not attempt to restart.

@jkotalik
Contributor

jkotalik commented Jun 8, 2020

> Adding retry logic to the connection initialisation would be another option, but if IIS already provides such logic, we don't want to increase the complexity of our code for a feature that's already there.

To me, it sounds like the retries should be centralized to retrying the subscription to the redis channel. I understand that restarting the process may "work", however it seems excessive to restart the process in this scenario.
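
A minimal sketch of what retrying at the source could look like, assuming the Redis subscription can be wrapped in a `Func<Task>`. `StartupRetry` and its parameters are hypothetical names, not part of ASP.NET Core or any Redis client, and a real implementation would probably catch only the specific socket exception rather than every `Exception`.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class StartupRetry
{
    // Runs `operation` up to `maxAttempts` times, doubling the wait after each failure.
    // The last failure is not caught, so it propagates to the caller.
    public static async Task RunAsync(
        Func<Task> operation,
        int maxAttempts = 5,
        TimeSpan? initialDelay = null,
        CancellationToken cancellationToken = default)
    {
        if (operation == null) throw new ArgumentNullException(nameof(operation));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        var delay = initialDelay ?? TimeSpan.FromSeconds(1);

        for (var attempt = 1; ; attempt++)
        {
            try
            {
                await operation();
                return; // success: stop retrying
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Assumed-transient failure: wait, then back off exponentially.
                await Task.Delay(delay, cancellationToken);
                delay = TimeSpan.FromTicks(delay.Ticks * 2);
            }
        }
    }
}
```

A call site might look like `await StartupRetry.RunAsync(() => SubscribeToChannelAsync());`, where `SubscribeToChannelAsync` stands in for the application's own subscription code. With the defaults, the worst case waits 1 + 2 + 4 + 8 = 15 seconds before the final failure surfaces.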

@jkotalik jkotalik removed their assignment Jun 9, 2020
@TylerReid

This has also been an issue for me. We use Consul for service discovery and as a configuration source, but when our EC2 instance comes up, ASP.NET can beat the Consul agent to start (a nice problem to have, btw).

I agree, @jkotalik, that the right thing to do is probably to retry at the source of the problem, but it would also be nice to have a built-in option to restart on failure in case something else happens, maybe with a parameter for how long to wait so as not to trigger Rapid-Fail Protection.

@davidfowl for this:

> I'd like to understand the idea of a transient startup error.

For us, this is caused by network requests in ConfigureServices, which also necessitates the use of .GetAwaiter().GetResult(), which is bad. Is there any guidance on how best to get configuration data over the network during startup, or is this not a directly supported pattern? In our use case with Consul, we get very fast configuration updates across all API instances for things like feature flags, so the benefits outweigh the bad async in startup and the very rare startup races.
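
One possible pattern, offered as a sketch and not as guidance confirmed anywhere in this thread, is to await the remote settings in an async `Main` before the host is built, so that nothing inside ConfigureServices has to block. `LoadRemoteSettingsAsync` and the `Features:NewCheckout` key are hypothetical stand-ins for a real Consul call and a real setting. The sketch assumes ASP.NET Core 3.x with the generic host.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Hosting;

public static class Program
{
    // Hypothetical loader: stands in for a real request to Consul or another remote source.
    public static async Task<Dictionary<string, string>> LoadRemoteSettingsAsync()
    {
        await Task.Delay(10); // placeholder for the network round trip
        return new Dictionary<string, string> { ["Features:NewCheckout"] = "true" };
    }

    public static async Task Main(string[] args)
    {
        // Await the remote settings before the host exists, so no
        // .GetAwaiter().GetResult() is needed during service configuration.
        var remoteSettings = await LoadRemoteSettingsAsync();

        using var host = Host.CreateDefaultBuilder(args)
            // Expose the fetched values through IConfiguration like any other source.
            .ConfigureAppConfiguration((_, config) => config.AddInMemoryCollection(remoteSettings))
            .ConfigureWebHostDefaults(web => web.Configure(app =>
                app.Run(context => context.Response.WriteAsync("ok"))))
            .Build();

        await host.RunAsync();
    }
}
```

This only moves the initial fetch. It does not by itself provide the fast configuration updates described above, and a failed fetch still fails startup unless it is retried.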

@ghost

ghost commented Nov 12, 2020

Thank you for contacting us. Due to a lack of activity on this discussion issue we're closing it in an effort to keep our backlog clean. If you believe there is a concern related to the ASP.NET Core framework, which hasn't been addressed yet, please file a new issue.

This issue will be locked after 30 more days of inactivity. If you still wish to discuss this subject after then, please create a new issue!

@ghost ghost closed this as completed Nov 12, 2020
@kkapuscinski

kkapuscinski commented Nov 19, 2020

@jkotalik @davidfowl
We recently had a problem with out-of-process hosting on ASP.NET Core 2.1 and wanted to move to 3.1 InProcess, but this behaviour is undesirable. We have 100+ applications and recycle the application pools (with overlapping) around 03:00 for all of them. This led to an OutOfMemoryException during startup in many processes (a transient problem, because on the first request a minute later there was plenty of memory).

Web applications written in .NET Framework handled this by restarting the AppDomain, but .NET Core "hung". For now, our solution is to set CaptureStartupErrors(false) or move to 3.1 InProcess using the suggested workaround.

Rapid-Fail Protection in 2.1 is not an issue, as far as I tested, because dotnet.exe fails, not w3wp.
Rapid-Fail Protection in 3.1 with InProcess can be desirable after many unsuccessful restarts, because a load balancer can automatically detect that it can't reach a server whose application pool is stopped and route requests to a different server.

Returning to .NET Framework: I found that if an exception happens during startup, there is a timeout of around 10 seconds, after which the AppDomain is torn down. Any request during this time receives a "cached" error. So maybe a similar solution would be applicable.
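
The "cached error, then retry after a timeout" idea can be sketched roughly as follows. This is an illustration only, not how .NET Framework or ANCM implements it: `CachedStartupFailure` is a hypothetical class, and because the in-process runtime cannot be started twice in one process, it could only guard application-level initialization.

```csharp
using System;

public sealed class CachedStartupFailure
{
    private readonly Action _initialize;
    private readonly TimeSpan _cooldown;
    private readonly Func<DateTimeOffset> _now;
    private readonly object _lock = new object();
    private bool _started;
    private Exception _lastError;
    private DateTimeOffset _retryAfter;

    // `now` is injectable so the cooldown can be exercised without real waiting.
    public CachedStartupFailure(Action initialize, TimeSpan cooldown, Func<DateTimeOffset> now = null)
    {
        _initialize = initialize ?? throw new ArgumentNullException(nameof(initialize));
        _cooldown = cooldown;
        _now = now ?? (() => DateTimeOffset.UtcNow);
    }

    // Returns null once initialization has succeeded. Otherwise returns the startup
    // error: the cached one while the cooldown runs, or a fresh one after a failed retry.
    public Exception EnsureStarted()
    {
        lock (_lock)
        {
            if (_started) return null;
            if (_lastError != null && _now() < _retryAfter) return _lastError;

            try
            {
                _initialize();
                _started = true;
                _lastError = null;
            }
            catch (Exception ex)
            {
                // Remember the failure and serve it until the cooldown elapses.
                _lastError = ex;
                _retryAfter = _now() + _cooldown;
            }

            return _lastError;
        }
    }
}
```

A request handler would call `EnsureStarted()` first and return an error response whenever the result is non-null. With a 10-second cooldown, requests arriving within that window see the cached failure and do not trigger another initialization attempt.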

@ghost ghost locked as resolved and limited conversation to collaborators Dec 27, 2020
@amcasey amcasey added area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions and removed area-runtime labels Aug 24, 2023
This issue was closed.