How to detect if test crashed from stack overflow by examining console runner output? #266

MikeTheGreat · 2018-05-08T23:45:46Z

Hello!

Let's say that I've got a method that I'm testing, and it's got an infinite recursion problem in it. I've got an NUnit test which calls this method. The method then crashes, and takes down NUnit with it (since user code can't catch a StackOverflowException in .NET Framework 2.0 and later).

The console runner doesn't seem to be able to detect that the test process crashed from the stack overflow (which is reasonable) and instead writes a single result to the output XML saying "An existing connection was forcibly closed by the remote host" (i.e., the process running NUnit disappeared).

I'd like to write some code that can programmatically detect this. It'll have access to the XML file (I assume I can get the return value from the console runner's command line invocation, too, if that's helpful).

It would be nice to diagnose the problem down to the 'infinite recursion' level of detail, but even being able to determine that some tested code caused a problem so bad that the CLR crashed is fine. I'm testing code written by my students and can reasonably expect that a catastrophic crash is being caused by recursion (since it's what we're covering now) :)

If y'all don't mind my asking, what's a good way to detect that NUnit has crashed (given the results.xml file)?

What I'm looking for is something to check in the file that will reliably indicate an error like this. Some possibilities:

- Is it enough to check for the string that indicates the client died ("An existing connection...")?
- When it crashes NUnit sets the 'testcasecount' to 1 (on the test-run element). I know that I'm running more than 1 test - would it be sufficient to check for
- Within the test-suite XML element there's an attribute named "runstate" that's set to NotRunnable - is that a good indicator?
- Would it be better to check for several things?

(Also: I haven't posted this on StackOverflow.com yet, but I'd be happy to if that's a better way to get help - I greatly appreciate all your help and want to be respectful of your time!)
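
For anyone trying the XML route, the three checks listed above can be sketched in Python roughly as follows. This is a heuristic sketch only: the function name `crash_signals`, the `expected_tests` parameter, and the returned keys are invented for illustration, and it assumes the result file has a `test-run` root with `test-suite` descendants as described in the question. A positive result suggests that the agent process died; it does not show that a stack overflow was the cause.

```python
import xml.etree.ElementTree as ET

# Message quoted in the question; written to the result when the agent vanishes.
CONNECTION_CLOSED = "An existing connection was forcibly closed by the remote host"


def crash_signals(xml_text, expected_tests):
    """Return the three heuristic checks from the question as booleans.

    expected_tests is the number of tests the caller knows it asked to run
    (a hypothetical parameter, not something NUnit supplies).
    """
    root = ET.fromstring(xml_text)

    # Check 1: the "connection closed" message appears in any element text
    # or attribute value.
    has_message = any(
        CONNECTION_CLOSED in value
        for element in root.iter()
        for value in [element.text or "", *element.attrib.values()]
    )

    # Check 2: testcasecount on the test-run element is lower than expected.
    test_run = root if root.tag == "test-run" else root.find(".//test-run")
    count = test_run.get("testcasecount") if test_run is not None else None
    count_too_low = (
        count is not None and count.isdigit() and int(count) < expected_tests
    )

    # Check 3: some test-suite element is marked NotRunnable.
    not_runnable = any(
        suite.get("runstate") == "NotRunnable" for suite in root.iter("test-suite")
    )

    return {
        "connection_closed_message": has_message,
        "testcasecount_too_low": count_too_low,
        "suite_not_runnable": not_runnable,
    }
```

Calling `crash_signals(open("results.xml").read(), expected_tests=12)` and requiring all three values to be true is stricter than relying on any single check, though it remains a guess about what the fallback result looks like.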

CharliePoole · 2018-05-09T04:41:11Z

Your objective is a good one but I think you have the wrong end of the stick in trying to resolve it by looking at the output created by the console runner. The console runner gets a top-element-only XML result from the engine because there is nothing better available. So any fields in that result were set without information to go on, and trying to make them mean something more specific is futile.

You have to look closer to the source in order to detect what causes the exception itself. Then you would try to figure out how to leave behind information for the runner.

In this case, the source would be the code in the framework that calls the method that overflows. In theory, it could be any of several cases where user code is invoked, but starting with the invocation of the test method probably makes sense.

ChrisMaddock · 2018-05-09T08:10:42Z

I do like this idea! A lot like something I posted at: nunit/nunit-console#391

So. It's possible to detect the agent crash, definitely, and I believe a StackOverflowException exits with a specific error code as detailed in the above issue, so it's theoretically possible to detect when a test agent has crashed specifically on Stack Overflow.

My plan was just to be lazy, and print that to console. The 'nice' solution however would be to update the testresult xml that's written, as you say. It should be possible to work out which test assembly was running on the agent that crashed, and modify the xml accordingly - @CharliePoole may be better able to advise on the architecture there.

Your points:

> Is it enough to check for the string that indicates the client died ("An existing connection...")?

Not if you need to be sure it was a StackOverflow - this crash can also have other causes. It would be good to handle 'agent crashed for unknown reasons' in the same way, however.

> When it crashes NUnit sets the 'testcasecount' to 1 (on the test-run element). I know that I'm running more than 1 test - would it be sufficient to check for

I'm not sure you finished this question! But I think it relates to "Terminated Agent Process leads to lost Test Results" (nunit-console#413) - some of the background there may explain why this is.

> Within the test-suite XML element there's an attribute named "runstate" that's set to NotRunnable - is that a good indicator?
> Would it be better to check for several things?

Personally - if it's stack overflow specifically you'd want to track - I think it's best to look at the specific agent exit code, from inside the NUnit console source. I wrote a little about how we can currently handle different exit codes in the above issue - let us know how it works out! 😄
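
As a rough illustration of the exit-code idea from outside the console source, here is a minimal Python sketch that runs a child process and classifies its exit code. The names `classify_exit_code` and `run_and_classify` are hypothetical, and the sketch assumes the process being launched is the one that actually overflowed. Whether the console runner passes an agent's exit code through to its own caller is not established in this thread.

```python
import subprocess
import sys

# Windows NTSTATUS code for a stack overflow; the same 32 bits read as a
# signed integer give -1073741571.
STATUS_STACK_OVERFLOW = 0xC00000FD


def classify_exit_code(code):
    """Map a process exit code to a coarse label."""
    unsigned = code & 0xFFFFFFFF  # accept both signed and unsigned forms
    if unsigned == STATUS_STACK_OVERFLOW:
        return "stack-overflow"
    if unsigned == 0:
        return "ok"
    return "other"


def run_and_classify(command):
    """Run a command (a list of arguments), wait for it, classify the result."""
    completed = subprocess.run(command, check=False)
    return classify_exit_code(completed.returncode)


if __name__ == "__main__":
    # Example: classify whatever command is given on the command line.
    print(run_and_classify(sys.argv[1:]))
```

The "other" label matters here: as noted above, an agent can die for reasons other than a stack overflow, so anything non-zero that is not the stack overflow code is reported separately rather than guessed at.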

MikeTheGreat · 2018-05-09T22:55:36Z

Huh - so the OS (yes?) notices that the console runner exited from a stack overflow, and then uses -1073741571 as the command-line return value?
I had assumed that since user code couldn't catch it then it wasn't going to be directly detectable (i.e., uncatchable means we could only infer that the console runner had crashed catastrophically)

I'm a bit strapped for time right now, but if I can find some time in the future I'll see about looking into this.
Thanks!
--Mike
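
For reference, -1073741571 is the Windows status code 0xC00000FD (STATUS_STACK_OVERFLOW) read as a signed 32-bit integer. The short Python sketch below shows the conversion in both directions; the helper names `as_unsigned32` and `as_signed32` are made up for illustration.

```python
STATUS_STACK_OVERFLOW = 0xC00000FD  # unsigned form of the Windows status code


def as_unsigned32(code):
    """Reinterpret a (possibly negative) exit code as an unsigned 32-bit value."""
    return code & 0xFFFFFFFF


def as_signed32(code):
    """Reinterpret an exit code as a signed 32-bit value (two's complement)."""
    code &= 0xFFFFFFFF
    return code - 0x100000000 if code & 0x80000000 else code


# The value from the comment above and the hex status code are the same bits.
print(hex(as_unsigned32(-1073741571)))
print(as_signed32(STATUS_STACK_OVERFLOW))
```

Which form shows up depends on the tool reading the exit code, so normalising to one representation before comparing avoids a mismatch.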

ChrisMaddock · 2018-05-10T07:18:13Z

I think that’s the case - please check my working however!

MikeTheGreat changed the title ~~How to detect if test crashed from stack overflow by examining console runner output? is:question~~ How to detect if test crashed from stack overflow by examining console runner output? May 8, 2018

MikeTheGreat closed this as completed May 9, 2018
