BackgroundService failed when starting dotnet-monitor in AKS environment #2118

Closed
bcarthic opened this issue Jul 13, 2022 · 2 comments
Labels
bug Something isn't working

Comments


bcarthic commented Jul 13, 2022

Description

We are hosting dotnet-monitor as a sidecar container in our pod. Over the last 2 days we saw a large spike of "BackgroundService failed" errors, and the container restarted many times, which caused high CPU usage. The issue has since been mitigated and the monitor now starts fine, but we are trying to understand the root cause of the failures that caused the CPU spike.

{
    "Timestamp": "2022-07-11T07:09:34.8375370Z",
    "EventId": 9,
    "LogLevel": "Error",
    "Category": "Microsoft.Extensions.Hosting.Internal.Host",
    "Message": "BackgroundService failed",
    "Exception": "System.Threading.Tasks.TaskCanceledException: A task was canceled. at Microsoft.Diagnostics.Monitoring.WebApi.MetricsService.ExecuteAsync(CancellationToken stoppingToken) in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/Metrics/MetricsService.cs:line 71    at Microsoft.Extensions.Hosting.Internal.Host.TryExecuteBackgroundServiceAsync(BackgroundService backgroundService)",
    "State": { "Message": "BackgroundService failed", "{OriginalFormat}": "BackgroundService failed" },
    "Scopes": []
}

{
    "Timestamp": "2022-07-11T07:09:34.8387837Z",
    "EventId": 10,
    "LogLevel": "Critical",
    "Category": "Microsoft.Extensions.Hosting.Internal.Host",
    "Message": "The HostOptions.BackgroundServiceExceptionBehavior is configured to StopHost. A BackgroundService has thrown an unhandled exception, and the IHost instance is stopping. To avoid this behavior, configure this to Ignore; however the BackgroundService will not be restarted.",
    "Exception": "System.Threading.Tasks.TaskCanceledException: A task was canceled.   at Microsoft.Diagnostics.Monitoring.WebApi.MetricsService.ExecuteAsync(CancellationToken stoppingToken) in /_/src/Microsoft.Diagnostics.Monitoring.WebApi/Metrics/MetricsService.cs:line 71    at Microsoft.Extensions.Hosting.Internal.Host.TryExecuteBackgroundServiceAsync(BackgroundService backgroundService)",
    "State": {
        "Message": "The HostOptions.BackgroundServiceExceptionBehavior is configured to StopHost. A BackgroundService has thrown an unhandled exception, and the IHost instance is stopping. To avoid this behavior, configure this to Ignore; however the BackgroundService will not be restarted.",
        "{OriginalFormat}": "The HostOptions.BackgroundServiceExceptionBehavior is configured to StopHost. A BackgroundService has thrown an unhandled exception, and the IHost instance is stopping. To avoid this behavior, configure this to Ignore; however the BackgroundService will not be restarted."
    },
    "Scopes": []
}
Unhandled exception: System.AggregateException: One or more errors occurred. (Address in use)
 ---> System.Net.Sockets.SocketException (98): Address in use
   at System.Net.Sockets.Socket.UpdateStatusAfterSocketErrorAndThrowException(SocketError error, String callerName)
   at System.Net.Sockets.Socket.DoBind(EndPoint endPointSnapshot, SocketAddress socketAddress)
   at System.Net.Sockets.Socket.Bind(EndPoint localEP)
   at Microsoft.Diagnostics.NETCore.Client.IpcUnixDomainSocket.Bind(IpcUnixDomainSocketEndPoint localEP)
   at Microsoft.Diagnostics.NETCore.Client.IpcUnixDomainSocketServerTransport.CreateNewSocketServer()
   at Microsoft.Diagnostics.NETCore.Client.IpcUnixDomainSocketServerTransport..ctor(String path, Int32 backlog, IIpcServerTransportCallbackInternal transportCallback)
   at Microsoft.Diagnostics.NETCore.Client.IpcServerTransport.Create(String address, Int32 maxConnections, Boolean enableTcpIpProtocol, IIpcServerTransportCallbackInternal transportCallback)
   at Microsoft.Diagnostics.NETCore.Client.ReversedDiagnosticsServer.ListenAsync(Int32 maxConnections, CancellationToken token)
   at Microsoft.Diagnostics.NETCore.Client.ReversedDiagnosticsServer.DisposeAsync()
   --- End of inner exception stack trace ---
   at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
   at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
   at System.Threading.Tasks.Task.Wait()
   at Microsoft.Diagnostics.NETCore.Client.ReversedDiagnosticsServer.Start(Int32 maxConnections)
   at Microsoft.Diagnostics.Tools.Monitor.ServerEndpointInfoSource.ExecuteAsync(CancellationToken stoppingToken) in /_/src/Tools/dotnet-monitor/EndpointInfo/ServerEndpointInfoSource.cs:line 87
   at Microsoft.Diagnostics.Tools.Monitor.ServerEndpointInfoSource.ExecuteAsync(CancellationToken stoppingToken) in /_/src/Tools/dotnet-monitor/EndpointInfo/ServerEndpointInfoSource.cs:line 89
   at Microsoft.Extensions.Hosting.Internal.Host.StartAsync(CancellationToken cancellationToken)
   at Microsoft.Diagnostics.Tools.Monitor.Commands.CollectCommandHandler.Invoke(CancellationToken token, String[] urls, String[] metricUrls, Boolean metrics, String diagnosticPort, Boolean noAuth, Boolean tempApiKey, Boolean noHttpEgress) in /_/src/Tools/dotnet-monitor/Commands/CollectCommandHandler.cs:line 35
   at Microsoft.Diagnostics.Tools.Monitor.Commands.CollectCommandHandler.Invoke(CancellationToken token, String[] urls, String[] metricUrls, Boolean metrics, String diagnosticPort, Boolean noAuth, Boolean tempApiKey, Boolean noHttpEgress) in /_/src/Tools/dotnet-monitor/Commands/CollectCommandHandler.cs:line 66
   at System.CommandLine.Invocation.CommandHandler.GetResultCodeAsync(Object value, InvocationContext context)
   at System.CommandLine.Invocation.ModelBindingCommandHandler.InvokeAsync(InvocationContext context)
   at System.CommandLine.Invocation.InvocationPipeline.<>c__DisplayClass4_0.<<BuildInvocationChain>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c.<<UseParseErrorReporting>b__21_0>d.MoveNext()
--- End of stack trace from previous location ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass16_0.<<UseHelp>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass25_0.<<UseVersionOption>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass23_0.<<UseTypoCorrections>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c.<<UseSuggestDirective>b__22_0>d.MoveNext()
--- End of stack trace from previous location ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c.<<UseParseDirective>b__20_0>d.MoveNext()
--- End of stack trace from previous location ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c.<<UseDebugDirective>b__11_0>d.MoveNext()
--- End of stack trace from previous location ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c.<<RegisterWithDotnetSuggest>b__10_0>d.MoveNext()
--- End of stack trace from previous location ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass14_0.<<UseExceptionHandler>b__0>d.MoveNext()
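
For readers unfamiliar with the `SocketException (98): Address in use` in the trace above: on Linux, binding a Unix domain socket to a path that already exists on disk fails with errno 98 (EADDRINUSE), even when no process is listening on it any more. The following is a minimal Python sketch of that general behavior, not dotnet-monitor code. It assumes Linux, and the file name `diag.sock` and the helper names are hypothetical. Whether a leftover socket file is what actually happened in this pod is not confirmed by the trace.

```python
import errno
import os
import socket
import tempfile


def bind_unix_socket(path, remove_stale=False):
    """Bind a listening Unix domain socket at `path` (hypothetical helper)."""
    if remove_stale and os.path.exists(path):
        # Remove a socket file left behind by a previous listener.
        os.unlink(path)
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        server.bind(path)
        server.listen(1)
    except OSError:
        server.close()
        raise
    return server


def demo():
    """Return (errno seen when rebinding a stale path, whether cleanup fixed it)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "diag.sock")  # hypothetical socket path

        first = bind_unix_socket(path)
        first.close()  # closing the socket does NOT delete the file on disk

        stale_errno = None
        try:
            bind_unix_socket(path)  # same path, file still present
        except OSError as exc:
            stale_errno = exc.errno

        # Unlinking the stale file first lets the bind succeed.
        second = bind_unix_socket(path, remove_stale=True)
        second.close()
        return stale_errno, True


if __name__ == "__main__":
    code, recovered = demo()
    print(code == errno.EADDRINUSE, recovered)
```

In a sidecar setup, a socket path on a volume that outlives the container could in principle hit the same condition after a restart. That is an assumption offered for illustration, not a diagnosis from this thread.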

Configuration

  • Is this related to a specific tool? No
  • What OS and version, and what distro if applicable? Linux, cbl-mariner2.0-amd64
  • Are you running in any particular type of environment? (e.g. Containers, a cloud scenario, app you are trying to target is a different user) Containers
  • dotnet-monitor image : mcr.microsoft.com/dotnet/monitor:6@sha256:910e67242bbf9a8918a86c202149f81b92fd3ba26c1cef0c7bc29cf4c3683f52

Regression?

  • Did this work in a previous build or release - either of the tool or of the .NET runtime being used? We didn't change the image, but this happened over the last 2 days and was mitigated automatically.
bcarthic added the bug label Jul 13, 2022
wiktork (Member) commented Jul 13, 2022

Likely #1827

jander-msft (Member) commented Aug 9, 2022

The "Address in use" issue should be fixed in the dotnet-monitor images that were released today: 6.2.2 and 7.0.0 Preview 7.
