Undetermined runtime of tests due to large non-matching datasets #113

FuegoArtificial · 2015-08-28T11:39:46Z

Hi Seddryck,

I would like to report an issue where NBi can run for an unbounded time (6+ hours for a single test). It might be an NUnit issue, though.

If a dataset of 500,000 rows with columns like key (text), value (text), value (text), value (text), value (text) is compared to a dataset of the same structure and size where no row results in a match, then the test can run seemingly indefinitely.

I know that this kind of test is far from best practice and does not make much sense. However, it occurred during the development of a test. The test suite run should not "crash". Because no result is produced after the "crash", it can be hard to figure out which test triggers the issue, and no notification is possible.

Can you reproduce the issue or do you need more information?

Ideas:

1. Setting a timeout limit for tests (in the config?).
2. Optimizing the runtime when comparing the two datasets. The results only show the first 10 differing rows. Once that limit is reached, is there a way to just increment the counters as soon as a difference is detected? No further comparison logic would then be needed for that row, because it is not shown anywhere and only counts toward the number of differing rows.

Both ideas probably aren't ideal. Perhaps there is a better workaround or approach? :)

Have a great day!
Tilo

Seddryck · 2015-08-28T13:44:21Z

Interesting case, thanks for reporting it. I knew that NBi was fast at comparing sets with around 36k ... but indeed I've never tested the case where none of them has a match.

About your option 2: nothing beyond asserting the comparison is done for each row. And the display shouldn't really use more than the needed rows (you can override the limit of 10), so at first look I can't really optimize this part.

Implementing the timeout was planned, but unfortunately NUnit 2.X doesn't expose this parameter in its "API". So I'd need to implement my own timeout, which could be tricky. I'll check how I can manage this, but I will probably need time to find the root cause of this issue and a workaround.
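To illustrate what a hand-rolled timeout around a single test could look like, here is a minimal sketch in Python (not NBi's code, and not an NUnit API). The names `run_with_timeout` and `ComparisonTimeout` are hypothetical. It assumes the slow comparison can be left running on a background thread once the limit is hit, which is part of why this is tricky: the work is abandoned, not actually stopped.

```python
import threading


class ComparisonTimeout(Exception):
    """Hypothetical error raised when a test gives no result in time."""


def run_with_timeout(func, seconds):
    """Run func() on a worker thread and wait at most `seconds` for it."""
    outcome = {}

    def worker():
        try:
            outcome["value"] = func()
        except BaseException as exc:  # hand any failure back to the caller
            outcome["error"] = exc

    # daemon=True so an abandoned worker cannot keep the process alive.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(seconds)

    if thread.is_alive():
        # The worker is NOT killed here; it is only abandoned.
        raise ComparisonTimeout(f"no result after {seconds} s")
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]
```

With something like this, a runaway test would be reported as a named failure instead of blocking the whole suite, so it would at least be clear which test triggered the issue.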

Seddryck · 2015-09-14T20:06:29Z

Hi FuegoArtificial, I've tried to reproduce it but can't (at the moment). Comparing two datasets with 1,000,000 rows takes 5 seconds on my machine. But I'm not sure I've really understood your case.

Do the rows in the two datasets have matching keys or not?

FuegoArtificial · 2015-09-15T11:45:09Z

I'll try to get an answer with specifics from my colleague next week. Sorry for the delay!

Seddryck · 2015-09-15T11:59:27Z

I think I got it, but I need further investigations.

Seddryck · 2015-09-17T21:49:04Z

Well, the problem occurs when you need to compare the values of a large set of data, and it does not depend on the count of differences. So basically all I could do is put a timeout on the test, and for this I'll wait for NUnit 3.0.

Anyway, there is room for improving the performance of the method (caching keys, avoiding the use of contains when not needed). I'll try to work on this.
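As an illustration of the general idea behind caching keys, here is a minimal Python sketch (not NBi's implementation). It assumes rows are tuples whose first element is the key and that keys are unique within each dataset; `compare_by_key` is a hypothetical name. The key index is built once, so each lookup is a hash lookup and not a scan of the other dataset.

```python
def compare_by_key(expected, actual):
    """Compare two lists of rows; each row is a tuple with the key first.

    Returns (missing, unexpected, different). Assumes unique keys:
    with duplicates, the last row seen for a key wins in the index.
    """
    # Build the index once. A dict lookup is O(1) on average, whereas
    # checking membership in a list rescans the list for every row.
    actual_by_key = {row[0]: row for row in actual}

    expected_keys = set()
    missing, different = [], []
    for row in expected:
        key = row[0]
        expected_keys.add(key)
        other = actual_by_key.get(key)
        if other is None:
            missing.append(row)               # key absent from `actual`
        elif other != row:
            different.append((row, other))    # same key, other values

    # Rows of `actual` whose key never appeared in `expected`.
    unexpected = [row for row in actual if row[0] not in expected_keys]
    return missing, unexpected, different


expected = [("a", "1"), ("b", "2"), ("c", "3")]
actual = [("a", "1"), ("b", "9"), ("d", "4")]
missing, unexpected, different = compare_by_key(expected, actual)
```

With two datasets of n rows each, scanning a list for every key costs on the order of n * n checks, while the indexed version stays close to linear in n.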

FuegoArtificial · 2015-09-18T12:00:48Z

That's great!
We have to quickly test the tests during development anyway, so this would be a very rare issue :)
Thanks for looking into it!

Seddryck · 2015-10-29T16:05:48Z

Just to be clear, it's not part of release v1.11, but I've already worked on this and got a nice improvement. It's included in the beta for v1.12 (with all the features of v1.11). https://github.com/Seddryck/NBi/releases/v1.12-beta

Seddryck added started feature-request labels Sep 14, 2015

Seddryck self-assigned this Sep 14, 2015

Seddryck added this to the 1.12 milestone Sep 18, 2015

Seddryck closed this as completed Oct 29, 2015

Seddryck added preview-available and removed started labels Oct 29, 2015
