Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable rosbridge_library test in CI #686

Merged
merged 2 commits into from
Dec 10, 2021
Merged

Conversation

jtbandes
Copy link
Member

Another test issue was encountered on the ros build farm, and it turns out the test wasn't even running with colcon test, nor in this repo's CI job. I don't know why it's necessary, but adding find_package(ament_cmake_ros REQUIRED) to the package's CMakeLists seems to fix the issue.

I've been testing locally with:

colcon build && colcon test --packages-select rosbridge_library --event-handlers console_direct+

@jtbandes
Copy link
Member Author

@DomenicP the test_services.py is now hanging. I think the problem was introduced in #669. Any ideas for how to fix the test?

@jtbandes jtbandes added the ros2 ros2 label Nov 29, 2021
@DomenicP
Copy link
Contributor

DomenicP commented Dec 6, 2021

@jtbandes I think the test might be stuck in a deadlock on this line:

json_ret = services.call_service(self.node, self.node.get_name() + "/list_parameters")

In the websocket server all calls to services.call_service() are run in a background thread, but in this test it looks like it's being called from the main thread which blocks and causes the timeout. It looks like the rest of the tests use the ServiceTester class in this same file which is using a rosbridge_library.internal.services.ServiceCaller which inherits from threading.Thread.

A short term fix might be to modify the test that is timed out to run its service call in a background thread instead of the main thread. Longer term this relates back to the discussion from #669 regarding a better architecture for this code using ROS2 async patterns instead of threads.

@zflat
Copy link
Contributor

zflat commented Dec 8, 2021

@jtbandes I think the test might be stuck in a deadlock on this line:

json_ret = services.call_service(self.node, self.node.get_name() + "/list_parameters")

It is not quite in a deadlock. The issue is that nothing is calling spin and the service being called is on the same node that is making the call. Without calling spin the service handler does not execute and so the call to the service just waits until the timeout.

Here is a quick way to get test_service_call passing:

diff --git a/rosbridge_library/test/internal/test_services.py b/rosbridge_library/test/internal/test_services.py
index d96638b..42a2dea 100755
--- a/rosbridge_library/test/internal/test_services.py
+++ b/rosbridge_library/test/internal/test_services.py
@@ -2,6 +2,7 @@
 import random
 import time
 import unittest
+import threading
 
 import numpy as np
 import rclpy
@@ -146,6 +147,10 @@ class TestServices(unittest.TestCase):
         ret = p.call_async(ListParameters.Request())
         rclpy.spin_until_future_complete(self.node, ret)
 
+        # Spin in a separate thread
+        thread = threading.Thread(target=rclpy.spin, args=(self.node, ), daemon=True)
+        thread.start()
+
         # Now, call using the services
         json_ret = services.call_service(self.node, self.node.get_name() + "/list_parameters")
         for x, y in zip(ret.result().result.names, json_ret["result"]["names"]):

Apply that patch and comment out the other blocking tests and you will see that it runs.

For the purpose of fixing this test, it may be possible to do something like this:
https://answers.ros.org/question/358343/rate-and-sleep-function-in-rclpy-library-for-ros2/?answer=358386#post-id-358386
Or as mentioned in the above answer, look to this as an example: https://github.com/ros2/rclpy/blob/75c51c945d6029bea61571ced43c1d1b9c3c6703/rclpy/test/test_rate.py

Also, using a multithreaded executor may fix the issue but I have not tried that out.


A short term fix might be to modify the test that is timed out to run its service call in a background thread instead of the main thread. Longer term this relates back to the discussion from #669 regarding a better architecture for this code using ROS2 async patterns instead of threads.

Agreed

@jtbandes
Copy link
Member Author

Going to merge the fix for now and we can work on enabling the test in a follow-up PR.

cc @kenji-miyake who worked on this test previously

@jtbandes jtbandes merged commit b5f7e3d into ros2 Dec 10, 2021
@jtbandes jtbandes deleted the jacob/rosbridge_library-test branch December 10, 2021 02:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ros2 ros2
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants