Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add mqtt docker integration test #4556

Merged
merged 10 commits into from
Mar 17, 2022
Merged

feat: add mqtt docker integration test #4556

merged 10 commits into from
Mar 17, 2022

Conversation

onelson
Copy link
Contributor

@onelson onelson commented Mar 12, 2022

Fixes #4518

The heft of this diff:

  • configures an mqtt dialer dependency for test
  • adds a new integration test for mqtt.publish
  • adds the new test to the group of "write" tests listed in Makefile so that the test can be included/excluded by various test make targets
  • adds an mqtt broker service to the script we use to launch docker containers for the integration tests

Additionally, there are fixes for the bash script which had previously undergone some refactors that left it working only for some cases (ref: #4518 (comment))

Another set of PRs will be needed to update the skip lists for various CI jobs outside of the flux repo.

Done checklist

  • docs/SPEC.md updated N/A
  • Test cases written

@onelson onelson marked this pull request as ready for review March 14, 2022 17:18
@onelson onelson requested review from a team as code owners March 14, 2022 17:18
@onelson onelson requested review from nathanielc and noramullen1 and removed request for a team March 14, 2022 17:18
qos: 0,
retain: false,
clientid: "fluxtest",
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty straightforward, re our conversation in slack, did you end up playing any games with retain etc to ensure the message was published? Reading this test looks like we are relying on the client to report success/failure instead of validating through the server. Is that correct?

This is likely good enough because we are using a standard client and we don't need to test if the server itself is working rather just that our client can successfully communicate with it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue I ran into is we have no API in flux to subscribe or otherwise read from MQTT, so there's no reasonable way to verify the outcome from within fluxtest alone. You'd need to be subscribes/watching in another process, or... something else.

Proving the dialer pooling behavior for that other issue might be a challenge, but as a manual exercise we can at least watch the logs on the server to look at client connects/disconnects.

For an automated test, the best I could come up with was to assert we got the true back from the client.

This test does show an error if MQTT isn't running, but I don't know what the rest of the failure space looks like 🤷

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addendum: I did use the spawner script to launch the broker and was able to watch (via a cli subscriber) messages hit the queue while using mqtt.publish from the REPL. (video in slack)

@onelson
Copy link
Contributor Author

onelson commented Mar 15, 2022

@nathanielc any other questions I can answer or changes to be made here?

Owen Nelson added 9 commits March 17, 2022 09:10
Looks like when the script was refactored to allow for individual
services to be launched, the "all" case was not well tested. As such,
when we didn't name any services to launch nothing would launch.

After fixing this issue, it seems the hdb container doesn't launch
properly in the current state of the script. Something about having
those docker commands in a function or the function being sent to the
background disconnects it from the main tty for the script causing it to
fail to perform the readiness check properly.

In practice it seemed this meant hdb would report "ready" too soon and
the seed data insert would fail, as well as the tests.

For now, I'm skipping hdb. We'll have to loop back to try and find a
good alternative approach.
This test currently fails because the client dependency isn't injected.
Sometimes when we run `docker stop` the engine can take a beat before
the container is removed from the active list. The follow-up `docker rm`
can therefore try to destroy the container while it's stopping (and
`rm`'ing itself).

In such a case, ignore the failure. If the container truly sticks around
beyond these 2 commands, we'll see a different failure when we try to
relaunch the container with an identical name.
@onelson
Copy link
Contributor Author

onelson commented Mar 17, 2022

@jsternberg wanted to ping you (for when you get back from holiday). We ran into some test failures in nightly (and here) with the new "closer" dialer.

#4568 was our best guess for what should happen to make the dialer safe to use in test, but you might have a different idea for where to handle this. Seemed like this call needs to happen after the rest of the deps have been injected while also as close to the top of the call stack as possible.

@onelson onelson merged commit a9ed037 into master Mar 17, 2022
@onelson onelson deleted the onelson/mqtt-docker branch March 17, 2022 18:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add MQTT integration test(s) w/ Docker
3 participants