Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing flaky Health_Connect test #354

Merged
merged 13 commits into from
Jul 19, 2024
Merged

Conversation

marcin-krystianc
Copy link
Contributor

Health_Connect failed from time to time,
It seems that the "connect" related information is not immediately updated. Hence we need to refer to blocking queries (i.e. set the waitindex >0).

@IvanKolchanov
Copy link
Contributor

IvanKolchanov commented Jul 18, 2024

@marcin-krystianc
Everything looks great, I just have two questions/suggestions.
First, the original Go code was setting up the service through a proxy and checking that only the proxy will show up in the result and that the port will be correct, which is not exactly crucial, and I am not completely sure what their intention was, but might be a thing we want to do/test too, as they probably put it in for a reason. I had the same approach when doing Ingress() method.
https://github.com/hashicorp/consul/blob/v1.9.17/api/health_test.go#L488

Second, couldn't the while loop theoretically be running forever if server gets silly for some reason? Should we put some kind of fail check for that, a timeout or something. Or maybe just not use a while loop, but make a normal (not blocking query request) and if it fails just make another one, but a blocking query, using the LastIndex we got. That should probably be enough, and if the information is not coming in after 5 minutes of default blocking query, then something is definetely wrong.

I might be misunderstanding/missing something there, feel free to correct me! :)

@marcin-krystianc
Copy link
Contributor Author

@IvanKolchanov
Thanks for taking a look.
I've updated the test assertions, so we also check that there is only a single response and that the port number is correct.

Regarding the possibility of falling into an infinite loop, I don't think it is possible with the current code.
We either make progress (wait index increments Assert.True(checks.LastIndex > q.WaitIndex);) or we fail due to a timeout (new CancellationTokenSource(TimeSpan.FromMinutes(1))).

With your suggestion:

Or maybe just not use a while loop, but make a normal (not blocking query request), and if it fails just make another one, but a blocking query, using the LastIndex we got.

We could still have a flaky test, as it is very unlikely (but still possible) that the other tests running concurrently will change the state of the consul server, and the blocking query will return, although the information we are interested in is still not there.
Hence repeating the blocking query in a while loop seems necessary.

@IvanKolchanov
Copy link
Contributor

@IvanKolchanov Thanks for taking a look. I've updated the test assertions, so we also check that there is only a single response and that the port number is correct.

Regarding the possibility of falling into an infinite loop, I don't think it is possible with the current code. We either make progress (wait index increments Assert.True(checks.LastIndex > q.WaitIndex);) or we fail due to a timeout (new CancellationTokenSource(TimeSpan.FromMinutes(1))).

With your suggestion:

Or maybe just not use a while loop, but make a normal (not blocking query request), and if it fails just make another one, but a blocking query, using the LastIndex we got.

We could still have a flaky test, as it is very unlikely (but still possible) that the other tests running concurrently will change the state of the consul server, and the blocking query will return, although the information we are interested in is still not there. Hence repeating the blocking query in a while loop seems necessary.

I kinda missed that the timeout in cancellation token would cause the test to fail, so yep, it makes sense to me now!
Glad I was able to help!

@marcin-krystianc marcin-krystianc merged commit 1c27d01 into master Jul 19, 2024
249 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants