Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error for small connection timeout #5775

Closed
wants to merge 2 commits into from

Conversation

Forgind
Copy link
Member

@Forgind Forgind commented Oct 5, 2020

Resolves AB#1224176 by giving a nice error for any small node connection timeout value.

@rainersigwald
Copy link
Member

I'd rather see the existing message changed to mention the env var if it's set. Then you could set a low value if it works (and no arbitrary cutoff), but it's easier to debug if it doesn't.

@Forgind
Copy link
Member Author

Forgind commented Oct 5, 2020

What do you mean by set a low value if it works? I see my two choices as either providing a more descriptive message:
ErrorUtilities.VerifyThrow(connected, "In-proc node failed to start up within {0}ms.{1}", connectionTimeout, connectionTimeout < 5 ? " Did you set MSBUILDNODECONNECTIONTIMEOUT to too small a value?" : string.Empty);
or trying to fix it without the user noticing:
int connectionTimeout = Math.Min(5, CommunicationsUtilities.NodeConnectionTimeout);

Are you suggesting doing both?

@rainersigwald
Copy link
Member

No, I'm saying "failed to connect in {}ms. That value is overridden from its default by the end var ..."

@@ -367,7 +367,7 @@ private bool InstantiateNode(INodePacketFactory factory)

int connectionTimeout = CommunicationsUtilities.NodeConnectionTimeout;
bool connected = _endpointConnectedEvent.WaitOne(connectionTimeout);
ErrorUtilities.VerifyThrow(connected, "In-proc node failed to start up within {0}ms", connectionTimeout);
ErrorUtilities.VerifyThrow(connected, "In-proc node failed to start up within {0}ms. That timeout is overriden from its default by the environment variable MSBUILDNODECONNECTIONTIMEOUT. Setting that to a higher value may help.", connectionTimeout);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More words in the exception is good, but I don't think this will fix the hang which is the main issue here. The build should fail, not hang.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the hang is MSBuild's fault, exactly. Building from the command line, I get:
image
even without attaching a debugger, which is pretty clear, I'd say. Is there something higher up the stack swallowing it?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is another vcxprojReader dump in the same feedback - for the hang. The callstacks are below:

Microsoft.Build.Execution.BuildManager.Build (on the main thread) never returns presumably waiting for InProcNode and InProcNode never gets the events it is waiting for (Microsoft.Build.BackEnd.InProcNode.Run on Thread 9 (Thread Id: 28828 (0x709c))

~*k
Callstack for Thread 1 (Thread Id: 26788 (0x68a4)):
Index Function


1 ntdll.dll!_ZwWaitForMultipleObjects@20()
2 KERNELBASE.dll!WaitForMultipleObjectsEx(unsigned long nCount, void * const * lpHandles, int bWaitAll, unsigned long dwMilliseconds, int bAlertable)
3 mscorlib.ni.dll!71d7d952()
4 [Managed to Native Transition]
5 mscorlib.dll!System.Threading.WaitHandle.InternalWaitOne(System.Runtime.InteropServices.SafeHandle waitableSafeHandle, long millisecondsTimeout, bool hasThreadAffinity, bool exitContext)
6 mscorlib.dll!System.Threading.WaitHandle.WaitOne(int millisecondsTimeout, bool exitContext)
7 mscorlib.dll!System.Threading.WaitHandle.WaitOne()
8 Microsoft.Build.dll!Microsoft.Build.Execution.BuildSubmission.Execute()
9 Microsoft.Build.dll!Microsoft.Build.Execution.BuildManager.BuildRequest(Microsoft.Build.Execution.BuildRequestData requestData)
10 Microsoft.Build.dll!Microsoft.Build.Execution.BuildManager.Build(Microsoft.Build.Execution.BuildParameters parameters, Microsoft.Build.Execution.BuildRequestData requestData)
11 VcxprojReader.exe!VcxprojReader.ProjectReader.RunDesignTimeBuild(VcxprojReader.ProjectReader.ProjectConfiguration config)
12 VcxprojReader.exe!VcxprojReader.ProjectReader.ProcessProject(string projectPath, VcxprojReader.Settings settings)
13 VcxprojReader.exe!VcxprojReader.Program.ProcessSolutionConfig(System.Collections.Generic.List projectFiles, string solutionConfig, VcxprojReader.Settings settings, System.IO.TextWriter timeLogger, System.Diagnostics.Stopwatch timer)
14 VcxprojReader.exe!VcxprojReader.Program.ProcessSolution(System.Collections.Generic.List projectFiles, VcxprojReader.Settings settings, System.IO.TextWriter timeLogger, System.Diagnostics.Stopwatch timer)
15 VcxprojReader.exe!VcxprojReader.Program.Main(string[] args)
16 [Native to Managed Transition]
17 mscorlib.ni.dll![Frames below may be incorrect and/or missing, native debugger attempting to walk managed call stack]
18 mscoreei.dll!_CorExeMain()
19 mscoree.dll!ShellShim__CorExeMain()
20 mscoree.dll!_CorExeMain_Exported()
21 ntdll.dll!__RtlUserThreadStart(long(__stdcall
)(void ) StartAddress, void * Argument)
22 ntdll.dll!_RtlUserThreadStart(long(__stdcall
)(void *) StartAddress, void * Argument)

Callstack for Thread 2 (Thread Id: 27144 (0x6a08)):
Index Function

1 ntdll.dll!_ZwWaitForMultipleObjects@20()
2 KERNELBASE.dll!WaitForMultipleObjectsEx(unsigned long nCount, void * const * lpHandles, int bWaitAll, unsigned long dwMilliseconds, int bAlertable)
3 kernel32.dll!762df989()
4 kernel32.dll![Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll]
5 ntdll.dll!__RtlUserThreadStart(long(__stdcall*)(void ) StartAddress, void * Argument)
6 ntdll.dll!_RtlUserThreadStart(long(__stdcall
)(void *) StartAddress, void * Argument)

Callstack for Thread 3 (Thread Id: 31296 (0x7a40)):
Index Function

1 ntdll.dll!_ZwWaitForMultipleObjects@20()
2 KERNELBASE.dll!WaitForMultipleObjectsEx(unsigned long nCount, void * const * lpHandles, int bWaitAll, unsigned long dwMilliseconds, int bAlertable)
3 kernel32.dll!762df989()
4 kernel32.dll![Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll]
5 ntdll.dll!__RtlUserThreadStart(long(__stdcall*)(void ) StartAddress, void * Argument)
6 ntdll.dll!_RtlUserThreadStart(long(__stdcall
)(void *) StartAddress, void * Argument)

Callstack for Thread 4 (Thread Id: 31088 (0x7970)):
Index Function

1 ntdll.dll!_ZwWaitForWorkViaWorkerFactory@20()
2 ntdll.dll!TppWorkerThread(void * Parameter)
3 kernel32.dll!762df989()
4 kernel32.dll![Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll]
5 ntdll.dll!__RtlUserThreadStart(long(__stdcall*)(void ) StartAddress, void * Argument)
6 ntdll.dll!_RtlUserThreadStart(long(__stdcall
)(void *) StartAddress, void * Argument)

Callstack for Thread 5 (Thread Id: 6372 (0x18e4)):
Index Function

1 ntdll.dll!_ZwWaitForWorkViaWorkerFactory@20()
2 ntdll.dll!TppWorkerThread(void * Parameter)
3 kernel32.dll!762df989()
4 kernel32.dll![Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll]
5 ntdll.dll!__RtlUserThreadStart(long(__stdcall*)(void ) StartAddress, void * Argument)
6 ntdll.dll!_RtlUserThreadStart(long(__stdcall
)(void *) StartAddress, void * Argument)

Callstack for Thread 6 (Thread Id: 31384 (0x7a98)):
Index Function

1 ntdll.dll!_ZwRemoveIoCompletion@20()
2 KERNELBASE.dll!GetQueuedCompletionStatus(void * CompletionPort, unsigned long * lpNumberOfBytesTransferred, unsigned long * lpCompletionKey, _OVERLAPPED * * lpOverlapped, unsigned long dwMilliseconds)
3 kernel32.dll!762df989()
4 kernel32.dll![Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll]
5 ntdll.dll!__RtlUserThreadStart(long(__stdcall*)(void ) StartAddress, void * Argument)
6 ntdll.dll!_RtlUserThreadStart(long(__stdcall
)(void *) StartAddress, void * Argument)

Callstack for Thread 7 (Thread Id: 10916 (0x2aa4)):
Index Function

1 ntdll.dll!_ZwWaitForWorkViaWorkerFactory@20()
2 ntdll.dll!TppWorkerThread(void * Parameter)
3 kernel32.dll!762df989()
4 kernel32.dll![Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll]
5 ntdll.dll!__RtlUserThreadStart(long(__stdcall*)(void ) StartAddress, void * Argument)
6 ntdll.dll!_RtlUserThreadStart(long(__stdcall
)(void *) StartAddress, void * Argument)

Callstack for Thread 8 (Thread Id: 27016 (0x6988)):
Index Function

1 ntdll.dll!_ZwWaitForWorkViaWorkerFactory@20()
2 ntdll.dll!TppWorkerThread(void * Parameter)
3 kernel32.dll!762df989()
4 kernel32.dll![Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll]
5 ntdll.dll!__RtlUserThreadStart(long(__stdcall*)(void ) StartAddress, void * Argument)
6 ntdll.dll!_RtlUserThreadStart(long(__stdcall
)(void *) StartAddress, void * Argument)

Callstack for Thread 9 (Thread Id: 28828 (0x709c)):
Index Function

1 ntdll.dll!_ZwWaitForMultipleObjects@20()
2 KERNELBASE.dll!WaitForMultipleObjectsEx(unsigned long nCount, void * const * lpHandles, int bWaitAll, unsigned long dwMilliseconds, int bAlertable)
3 mscorlib.ni.dll!71e08a86()
4 [Managed to Native Transition]
5 mscorlib.dll!System.Threading.WaitHandle.WaitAny(System.Threading.WaitHandle[] waitHandles, int millisecondsTimeout, bool exitContext)
6 Microsoft.Build.dll!Microsoft.Build.BackEnd.InProcNode.Run(out System.Exception shutdownException)
7 Microsoft.Build.dll!Microsoft.Build.BackEnd.NodeProviderInProc.InProcNodeThreadProc()
8 mscorlib.dll!System.Threading.ThreadHelper.ThreadStart_Context(object state)
9 mscorlib.dll!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx)
10 mscorlib.dll!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx)
11 mscorlib.dll!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state)
12 mscorlib.dll!System.Threading.ThreadHelper.ThreadStart()
13 [Native to Managed Transition]
14 mscorlib.ni.dll![Frames below may be incorrect and/or missing, native debugger attempting to walk managed call stack]
15 kernel32.dll!762df989()
16 ntdll.dll!__RtlUserThreadStart(long(__stdcall*)(void ) StartAddress, void * Argument)
17 ntdll.dll!_RtlUserThreadStart(long(__stdcall
)(void *) StartAddress, void * Argument)

Callstack for Thread 10 (Thread Id: 4660 (0x1234)):
Index Function

1 ntdll.dll!_ZwWaitForSingleObject@12()
2 KERNELBASE.dll!WaitForSingleObjectEx(void * hHandle, unsigned long dwMilliseconds, int bAlertable)
3 KERNELBASE.dll!WaitForSingleObject(void * hHandle, unsigned long dwMilliseconds)
4 kernel32.dll!762df989()
5 kernel32.dll![Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll]
6 ntdll.dll!__RtlUserThreadStart(long(__stdcall*)(void ) StartAddress, void * Argument)
7 ntdll.dll!_RtlUserThreadStart(long(__stdcall
)(void *) StartAddress, void * Argument)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to understand where the problem arises because it failed very visibly from the command line, but it just hangs in VS. Is it that VS is trying to make its version of the error box I saw, but since it isn't visible, I don't have the opportunity to see it or react to it, and everything is frozen until I do?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Forgind that dialog only happens for debug builds. Switch to Release and you might get closer to the other behavior

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is easier to repro the problem with vcxprojreader.exe (you can run it from Dev command prompt)
vcxprojReader /Solution:[full path to a solution containing Windows Desktop vcxproj] /debug

/debug will make it to pause so you can attach the debugger.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a surface level, it gets swallowed here:
https://github.com/dotnet/msbuild/blob/master/src/Build/BackEnd/BuildManager/BuildManager.cs#L1216-L1221
It explicitly says we can't just rethrow there, and rethrowing just moves it here:
https://github.com/dotnet/msbuild/blob/master/src/Build/BackEnd/BuildManager/BuildManager.cs#L1236-L1241

Rethrowing in both places takes us out of MSBuild, so I had trouble chasing it. It next gets caught here:
https://github.com/dotnet/runtime/blob/789845faadfb0153f624a04a592e52c5a61a7d02/src/libraries/System.Threading.Tasks.Dataflow/src/Internal/TargetCore.cs#L483-L487

Moving a little past that, my call stack looks like:

>	mscorlib.dll!System.Threading.Tasks.Task.Execute() Line 2498	C#
 	mscorlib.dll!System.Threading.Tasks.Task.ExecutionContextCallback(object obj) Line 2861	C#
 	mscorlib.dll!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx) Line 980	C#
 	mscorlib.dll!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx) Line 928	C#
 	mscorlib.dll!System.Threading.Tasks.Task.ExecuteWithThreadLocal(ref System.Threading.Tasks.Task currentTaskSlot) Line 2827	C#
 	mscorlib.dll!System.Threading.Tasks.Task.ExecuteEntry(bool bPreventDoubleExecution) Line 2767	C#
 	mscorlib.dll!System.Threading.Tasks.Task.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem() Line 2704	C#
 	mscorlib.dll!System.Threading.ThreadPoolWorkQueue.Dispatch() Line 820	C#
 	mscorlib.dll!System.Threading._ThreadPoolWaitCallback.PerformWaitCallback() Line 1161	C#

Continuing on, the call stack stops changing here:

>	mscorlib.dll!System.Threading.ThreadPoolWorkQueue.Dispatch() Line 858	C#
 	mscorlib.dll!System.Threading._ThreadPoolWaitCallback.PerformWaitCallback() Line 1161	C#

Any thoughts? I don't know how to make something visible to the user after that.

@@ -367,7 +367,7 @@ private bool InstantiateNode(INodePacketFactory factory)

int connectionTimeout = CommunicationsUtilities.NodeConnectionTimeout;
bool connected = _endpointConnectedEvent.WaitOne(connectionTimeout);
ErrorUtilities.VerifyThrow(connected, "In-proc node failed to start up within {0}ms", connectionTimeout);
ErrorUtilities.VerifyThrow(connected, "In-proc node failed to start up within {0}ms. That timeout is overriden from its default by the environment variable MSBUILDNODECONNECTIONTIMEOUT. Setting that to a higher value may help.", connectionTimeout);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: overriden -> overridden

Copy link
Member

@rainersigwald rainersigwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is worth doing, and I agree with Olga that the hang is more concerning. But if we can cheaply identify the specific error we got reported to us I think that makes sense.

@@ -366,6 +366,7 @@ private bool InstantiateNode(INodePacketFactory factory)
_inProcNodeEndpoint.Connect(this);

int connectionTimeout = CommunicationsUtilities.NodeConnectionTimeout;
ErrorUtilities.VerifyThrowArgument(connectionTimeout > 5, "InvalidConnectionTimeout", connectionTimeout);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did you pick 5ms?

@@ -367,7 +367,7 @@ private bool InstantiateNode(INodePacketFactory factory)

int connectionTimeout = CommunicationsUtilities.NodeConnectionTimeout;
bool connected = _endpointConnectedEvent.WaitOne(connectionTimeout);
ErrorUtilities.VerifyThrow(connected, "In-proc node failed to start up within {0}ms", connectionTimeout);
ErrorUtilities.VerifyThrow(connected, "In-proc node failed to start up within {0}ms. That timeout is overriden from its default by the environment variable MSBUILDNODECONNECTIONTIMEOUT. Setting that to a higher value may help.", connectionTimeout);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Forgind that dialog only happens for debug builds. Switch to Release and you might get closer to the other behavior

@@ -367,7 +367,7 @@ private bool InstantiateNode(INodePacketFactory factory)

int connectionTimeout = CommunicationsUtilities.NodeConnectionTimeout;
bool connected = _endpointConnectedEvent.WaitOne(connectionTimeout);
ErrorUtilities.VerifyThrow(connected, "In-proc node failed to start up within {0}ms", connectionTimeout);
ErrorUtilities.VerifyThrow(connected, "In-proc node failed to start up within {0}ms. That timeout is overriden from its default by the environment variable MSBUILDNODECONNECTIONTIMEOUT. Setting that to a higher value may help.", connectionTimeout);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you distinguish between the default and the overridden case? Now it will fire if the (normal) timeout is too short.

@Forgind
Copy link
Member Author

Forgind commented Nov 30, 2020

Closing in favor of #5917

@Forgind Forgind closed this Nov 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants