fix flaky integration tests in xample-domain #2931

Merged
lisac merged 30 commits into develop from lisac/integration-tests-xample-app
May 3, 2024

Conversation

lisac
Contributor

@lisac lisac commented Apr 30, 2024

What was the problem?

The continuous integration (CI) tests for xample-domain are flaky: they fail intermittently with no apparent cause.

As reported in #2345, this error is consistent in the logs of those failed tests:

o.s.a.r.l.SimpleMessageListenerContainer : Failed to check/redeclare auto-delete queue(s). [...] Unexpected ERROR logs in svc-xample-j container.

Theory: 1) the domain-xample integration test code (Java) occasionally fails to connect to the RabbitMQ service, even when the service is observed to be running*; 2) initial connection failures are not uncommon; and 3) given a modest number of retry attempts, the connection would succeed.

Associated tickets or Slack threads:

How does this fix it?

  1. Introduce a retry policy on RabbitMQ actions, limited to the scope of the integration test.
  2. Refine error interpretation: rather than report an integration test failure whenever the word "ERROR" is found in a log file, ignore occurrences of "ERROR" that are associated with AMQP's "SimpleMessageListenerContainer", on the theory that initial connection failures are not uncommon.
  3. To keep the previous item from being too lax, also report an integration failure if the text "Test failures:" is found in the log file. (In practice, test failures cause the log file assessment to be skipped.)
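As a rough illustration of the retry idea in item 1, here is a minimal, library-free Java sketch. It is not the code in this PR, which wires a Spring `RetryTemplate` into `rabbitTemplate`. The class `RetrySketch`, the method `withRetry`, and the attempt and pause values are all hypothetical names and numbers chosen for the example.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper, not part of this PR: bounded retry with a fixed pause. */
final class RetrySketch {

    /**
     * Runs the action and retries on any exception, up to maxAttempts in total,
     * sleeping pauseMillis between attempts. Rethrows the last failure.
     */
    static <T> T withRetry(int maxAttempts, long pauseMillis, Callable<T> action) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated broker action that fails twice before succeeding.
        String result = withRetry(5, 10, () -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("connection refused");
            }
            return "connected";
        });
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Capping the number of attempts keeps a genuinely unavailable broker visible as a failure instead of hiding it behind endless retries.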

This PR also includes one additional change to the integration test, which I believe is minor: the removal of an action I thought was redundant.

How to test this PR

  • in particular, please assess whether the change to the error interpretation could be more refined
  • Step 21
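To make the error-interpretation rule easier to review, here is a hedged Java sketch of the decision it describes. The actual check in this PR lives in the CI log handling and may be implemented quite differently. The class `LogAssessmentSketch`, the method `indicatesFailure`, and the sample log lines are hypothetical; only the marker strings come from the description above.

```java
import java.util.List;

/** Hypothetical illustration of the log-assessment rule; not this PR's actual code. */
final class LogAssessmentSketch {

    // Marker strings taken from the PR description.
    private static final String TOLERATED_SOURCE = "SimpleMessageListenerContainer";
    private static final String TEST_FAILURE_MARKER = "Test failures:";

    /** Returns true when the given log lines should fail the integration run. */
    static boolean indicatesFailure(List<String> logLines) {
        for (String line : logLines) {
            // An explicit test-failure marker always counts as a failure.
            if (line.contains(TEST_FAILURE_MARKER)) {
                return true;
            }
            // An ERROR counts unless it comes from the tolerated listener container.
            if (line.contains("ERROR") && !line.contains(TOLERATED_SOURCE)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Made-up log lines, for illustration only.
        List<String> tolerated = List.of(
                "ERROR o.s.a.r.l.SimpleMessageListenerContainer : Failed to check/redeclare auto-delete queue(s).");
        List<String> real = List.of("ERROR c.e.SomeService : unexpected failure");
        System.out.println(indicatesFailure(tolerated));
        System.out.println(indicatesFailure(real));
    }
}
```

One limitation of a line-by-line rule like this: if a log entry spans several lines, an "ERROR" on one line and the class name on another would not be matched together, so a stricter version might need to group lines into entries first.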

*the CI workflow verifies the rabbitMQ service is ready before moving on to executing the integration tests.

- name: 'Check for RabbitMQ to be ready'
  uses: department-of-veterans-affairs/[email protected]
  with:
    url: 'http://localhost:15672/api/vhosts'
    method: 'GET'

Footnotes

  1. To check if a PR will succeed in the SecRel workflow, test PRs in the SecRel pipeline.

This comment was marked as off-topic.

@lisac
Contributor Author

lisac commented May 3, 2024

Verified that the CI fails (the failure was intentional, per 93cb964). However, I wasn't able to get insight into the log file handling: that step was skipped because the failure was detected at the integration test step.

Will rely on code review for assessing whether the change to log file handling is acceptable.

@lisac lisac self-assigned this May 3, 2024
rabbitAdmin.purgeQueue(queueName, true);
@BeforeEach
private void setUp() {
rabbitTemplate.setRetryTemplate(retryTemplate);
Contributor Author

This placement of setRetryTemplate() is not ideal, for two reasons:

  1. It should be executed just once, such as in a @BeforeAll method. However, @BeforeAll methods must be static by default, which is incompatible with the rabbitTemplate instance field.
  2. It should be configured in Spring (or whatever it is that sets up rabbitTemplate).

... and I'm stumped on how to implement either of these. Very open to pointers. Thanks.

Comment on lines -41 to -42
private void tearDown() {
rabbitAdmin.purgeQueue(queueName, true);
Contributor Author

I removed this method because I thought it was redundant with the purgeQueue() call performed in setUp().

This comment was marked as outdated.

@lisac lisac changed the title [draft] fix flaky integration tests in xample-domain fix flaky integration tests in xample-domain May 3, 2024
@lisac lisac marked this pull request as ready for review May 3, 2024 03:20
@lisac lisac requested a review from a team as a code owner May 3, 2024 03:20
@lisac lisac requested review from tejans24 and msnwatson May 3, 2024 03:20
lisac and others added 14 commits May 3, 2024 09:19
… retries) initial connection failures for rabbitmq being interpreted as test failures?
although add condition that we will discard those related to AMQP's SimpleMessageListenerContainer
and remove standalone unit test, which itself feels inappropriate (it's testing rabbitMQ functionality, which seems inappropriate for an integration test)
why: verify that it's exposed by the log file handling
the intent had been to catch integration test failures. however, if there are any integration test failures, this step won't be executed.
@lisac lisac force-pushed the lisac/integration-tests-xample-app branch from 3dfab3c to f04146e Compare May 3, 2024 13:19
@msnwatson
Contributor

Great questions in this PR about implementation. I've bumped into the static method issue a lot and don't really have an answer, and unfortunately I'm not much of a Spring expert either.

@lisac
Contributor Author

lisac commented May 3, 2024

Proceeding with merging. The implementation could be better. On the other hand, it's scoped to a test class, so it wouldn't affect production, and it is effective at reducing flakiness. Given my limited comfort with Spring Boot, I'm not confident that spending more time on this would be worthwhile.

Forgive me, whoever next handles this code.

@lisac lisac merged commit a3058cc into develop May 3, 2024
1 check passed
@lisac lisac deleted the lisac/integration-tests-xample-app branch May 3, 2024 21:15