👩🌾 get_node_names test failing on macOS #750
Comments
So, this looks like cross-talk between this test and previous tests that ran but didn't properly clean up after themselves. It also looks like a flake, since we've had no failures in the 4 nights since. There are 2 things I can think about doing here:
@ros2/team any thoughts?
A while back one of the node names tests was failing because the code to set ROS_DOMAIN_ID on the CI machines wasn't working correctly, and this test's strictness is what caught that. I would be hesitant to relax the restriction, though I can see why that's attractive if there's a hard-to-locate cleanup issue. Since this test failed on an OSX machine and those are all on the same network, is it possible this is showing a regression in the code setting ROS_DOMAIN_ID?
Imo we should definitely do this.
If we don't want to relax the test (which we could do too) we could also set the localhost-only option and use the domain coordinator to pick a different domain ID (which is not the one set in the environment variable).
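As a rough illustration of that suggestion, the sketch below builds a test environment whose ROS_DOMAIN_ID differs from the one already set and that enables localhost-only discovery. It is not the actual domain coordinator: the helper name `isolated_ros_env` and the 1..100 candidate range are hypothetical, and it assumes the localhost-only option corresponds to the `ROS_LOCALHOST_ONLY` environment variable. It also does not coordinate between concurrent jobs, which a real coordinator would need to do.

```python
import os
import random


def isolated_ros_env(base_env, low=1, high=100, rng=random):
    """Return a copy of base_env with a different ROS_DOMAIN_ID and localhost-only set.

    Hypothetical helper for illustration; the ID range is an assumption.
    """
    # Treat an unset ROS_DOMAIN_ID as the default domain, 0.
    current = base_env.get("ROS_DOMAIN_ID", "0")
    # Every ID in the range except the one the environment already uses.
    candidates = [str(i) for i in range(low, high + 1) if str(i) != current]
    env = dict(base_env)
    env["ROS_DOMAIN_ID"] = rng.choice(candidates)
    # Assumed to restrict discovery to the local machine.
    env["ROS_LOCALHOST_ONLY"] = "1"
    return env


if __name__ == "__main__":
    env = isolated_ros_env(os.environ)
    print(env["ROS_DOMAIN_ID"], env["ROS_LOCALHOST_ONLY"])
```

The returned dictionary could be passed as the `env` argument when launching the test process, so the CI machine's own ROS_DOMAIN_ID is left untouched.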
I'm pretty sure this is my fault. I ran CI for some broken tests related to composable nodes that left some nodes running (hence the node name "mock_component_container"). AFAIK, this only affected our macOS machines since they are not containerized. I noticed these zombie nodes about a day after running the buggy test and have since killed them. It's still possible to run into a similar issue in the future if a buggy test is run and leaves stray nodes behind. So, it's probably still worth relaxing the test or changing the domain ID as @dirk-thomas suggests.
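For what "relaxing the test" could mean in practice, here is a minimal sketch that checks only that every expected node was discovered and tolerates extras such as a stray "mock_component_container". The function `missing_nodes` and the node names `node1` and `node2` are hypothetical; the real test is written against the rcl C API and may be structured differently.

```python
def missing_nodes(expected, discovered):
    """Return the expected (name, namespace) pairs that were not discovered.

    Extra discovered nodes are ignored, which is the relaxation.
    """
    found = set(discovered)
    return [node for node in expected if node not in found]


# Hypothetical node names, with one stray node left behind by another test.
expected = [("node1", "/"), ("node2", "/")]
discovered = [
    ("node1", "/"),
    ("mock_component_container", "/"),
    ("node2", "/"),
]

# A strict equality check would fail here; the relaxed check passes.
assert missing_nodes(expected, discovered) == []
```

The trade-off raised earlier in the thread still applies: a check this lenient would not have caught the earlier ROS_DOMAIN_ID misconfiguration, because it ignores unexpected nodes by design.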
We haven't seen this in a while, and we know the root cause. It can still happen in the future, but for now I'm going to close this bug out.
Bug report
Required Info:
master
since 2020-08-16
Steps to reproduce issue
Do the same setup as CI:
https://ci.ros2.org/job/nightly_osx_release/1754/consoleFull
Expected behavior
All tests pass
Actual behavior
Three get_node_names tests fail:
rcl.TestGetNodeNames__rmw_fastrtps_cpp.test_rcl_get_node_names_with_enclave
rcl.TestGetNodeNames__rmw_fastrtps_cpp.test_rcl_get_node_names
projectroot.test.test_get_node_names__rmw_fastrtps_cpp
Additional information
These tests started failing on CI 2 days ago:
The 2 PRs merged to this repository since then don't seem to be related to the failure (#746, #734).