
executors should be able to operate on callback groups rather than nodes #519

Closed
wjwwood opened this issue Jul 19, 2018 · 11 comments · Fixed by #1218

@wjwwood
Member

wjwwood commented Jul 19, 2018

Right now, you can only add whole nodes to an executor, but it has come up in a few scenarios that you'd rather be able to add selected callback groups to an executor instead of the whole node. This would allow you to do things like run a separate executor for one subscription while the main executor handles everything else in the node. This is also closer to how callback queues worked in ROS 1.

Some things to consider:

  • currently the executor assumes it doesn't share nodes with other executors
  • the guard condition that interrupts an executor when new things are added (e.g. new timer) is owned by the node, but now it would need to be owned by the callback group (potentially increasing the total number of guard conditions)
  • callback groups should still only be associated with a single executor, and that logic is currently in nodes (so it will need to be moved)
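To make the proposal concrete, here is a minimal sketch of how this could look from the user side (hypothetical at the time of writing: the add_callback_group call mirrors what eventually landed via #1218, and the node, topic, and group names are made up for illustration):

```cpp
#include <memory>
#include <thread>

#include "rclcpp/rclcpp.hpp"
#include "std_msgs/msg/string.hpp"

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  auto node = std::make_shared<rclcpp::Node>("my_node");

  // Dedicated callback group for one latency-sensitive subscription.
  auto fast_group = node->create_callback_group(
    rclcpp::CallbackGroupType::MutuallyExclusive);
  rclcpp::SubscriptionOptions options;
  options.callback_group = fast_group;
  auto sub = node->create_subscription<std_msgs::msg::String>(
    "fast_topic", 10,
    [](std_msgs::msg::String::SharedPtr /*msg*/) {/* handle message */},
    options);
  (void)sub;  // keep the subscription alive

  // A separate executor spins only this callback group in its own thread ...
  rclcpp::executors::SingleThreadedExecutor fast_executor;
  fast_executor.add_callback_group(fast_group, node->get_node_base_interface());
  std::thread fast_thread([&]() {fast_executor.spin();});

  // ... while the main executor handles everything else in the node.
  rclcpp::executors::SingleThreadedExecutor main_executor;
  main_executor.add_node(node);
  main_executor.spin();

  fast_thread.join();
  rclcpp::shutdown();
  return 0;
}
```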
@wjwwood wjwwood added the enhancement New feature or request label Jul 19, 2018
yechun1 added a commit to yechun1/geometry2 that referenced this issue Oct 31, 2018
TODO:
   - Could not use CallbackGroups as a replacement for CallbackQueues in ROS 2; this requires ros2/rclcpp#519, after which callback groups plus executors could be used like a callback queue in ROS 1.
   - Could not replace boost::signals2::signal in ROS 2 as there is no std equivalent; the design needs to be changed to not use boost::signals2::signal.

Workaround:
   - C++11 does not support std::shared_lock; std::unique_lock is used as a workaround.
   - There is no replacement for ros::WallTime::now in ROS 2; tf2::get_now is used instead.
   - C++11 has no std::upgrade_lock or upgrade_to_unique_lock; std::unique_lock is used as a workaround.

Signed-off-by: Chris Ye <[email protected]>
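As a side note on the std::shared_lock workaround mentioned in the commit message above, here is a minimal, self-contained sketch of the pattern (the FrameCache class is hypothetical, not taken from geometry2): under C++11, readers fall back to std::unique_lock on a plain std::mutex, trading reader concurrency for portability.

```cpp
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical cache used only to illustrate the workaround; not from tf2.
class FrameCache
{
public:
  void insert(const std::string & key, double value)
  {
    std::unique_lock<std::mutex> lock(mutex_);  // writer: exclusive lock
    data_[key] = value;
  }

  double lookup(const std::string & key) const
  {
    // With C++14/17 one would take a std::shared_lock on a shared mutex here;
    // C++11 has neither, so readers also take a unique_lock (the workaround).
    std::unique_lock<std::mutex> lock(mutex_);
    auto it = data_.find(key);
    return it == data_.end() ? 0.0 : it->second;
  }

private:
  mutable std::mutex mutex_;
  std::unordered_map<std::string, double> data_;
};
```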
@ralph-lange
Contributor

I would be happy to make a PR for the callback-group-level Executor presented at ROSCon 2018. Recently, I updated the implementation to Crystal, cf. https://github.com/boschresearch/ros2_rclcpp/commits/cbg-executor-0.6.1

However, this is only a prototypical implementation, i.e. such a PR could only be the basis for a discussion of the necessary refactoring. Also, in the last few weeks, I have come to the conclusion that, as a first step, it is even more important to improve the predictability of the current implementation than to refine the Executor API. The behavior of the Executor deviates substantially from the ROS 1 queueing mechanism, which is basically FIFO (with some exceptions in case of buffer overflows). In the ROS 2 Executor class, the get_next_ready_executable function (https://github.com/ros2/rclcpp/blob/master/rclcpp/src/rclcpp/executor.cpp#L534) prioritizes timers over subscriptions over service invocations and so on. Inside the collect_entities function of the memory strategy (https://github.com/ros2/rclcpp/blob/master/rclcpp/include/rclcpp/strategies/allocator_memory_strategy.hpp#L153), the order seems to depend on the insertion order of the nodes rather than on the arrival order at the middleware layer. A colleague of mine recently developed a formal scheduling model for the current mechanism; I'll include him in the discussion in the coming days.

In my opinion, once we have a clear understanding of the current mechanism (which goes down to the middleware), we should first develop a design for the intended model(s) before we discuss an improved Executor API and granularity.
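To make the described prioritization concrete, here is a simplified toy model (illustrative code only, not the actual rclcpp implementation): ready work is taken in a fixed order by entity type, so e.g. a ready timer is always served before a ready subscription, regardless of arrival order.

```cpp
#include <string>
#include <vector>

// Toy model: each vector holds the names of the currently ready entities
// of one type, in arrival order.
struct ReadySets
{
  std::vector<std::string> timers, subscriptions, services, clients, waitables;
};

// Returns the next executable to run, or an empty string if nothing is ready.
// Fixed priority by type: timers > subscriptions > services > clients > waitables.
std::string get_next_ready_executable(ReadySets & ready)
{
  for (auto * set : {&ready.timers, &ready.subscriptions, &ready.services,
      &ready.clients, &ready.waitables})
  {
    if (!set->empty()) {
      std::string next = set->front();
      set->erase(set->begin());
      return next;
    }
  }
  return "";  // nothing ready; note this is not a FIFO over all types as in ROS 1
}
```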

@peterpena
Contributor

@ralph-lange I would like to work on your callback-group-level Executor in the cbg-executor-foxy branch so that it can be merged upstream, if that is fine with you.

@ralph-lange
Contributor

That's of course fine with me, and I'm happy to support you.

@ralph-lange
Contributor

@peterpena, please note that the current port to Foxy is not yet working properly. Somewhere in the rclcpp executor or in my callback-group extension of it, a very subtle issue must have crept in:
In my example, I have two nodes (Ping and Pong), each with two callback groups "RT" and "BE" (for real-time and best-effort). These are assigned to two different executor instances, executor_be and executor_rt, which run together on one core. In the current version, the higher-priority thread for executor_rt takes up 100% of the CPU, so executor_be does not get any CPU time at all.
The high CPU load occurs when not all callback groups of a node are assigned to the same executor instance. In my example, for the ping node this includes the (actually unused) default callback group. Therefore, I added it explicitly: https://github.com/boschresearch/ros2_examples/commit/06d1f5960a112255a622d655dbdbc62e7d0f94ad
Actually, this is needed in the ping node only, so I removed it from the pong node: https://github.com/boschresearch/ros2_examples/commit/281f7eb9f8d8f61bfb0c9cd73a449eded152e42b

You can try it out yourself by playing around with the five lines starting at https://github.com/boschresearch/ros2_examples/blob/cbg-executor-foxy/rclcpp/cbg-executor_ping-pong/main.cpp#L124.
If you use only one executor in lines 124 to 126 and 139 to 140, then it works. If you use different executors (as in the current implementation) or comment out a line, the thread of that executor will suddenly consume all available CPU power.
For example, if you consistently use executor_rt in lines 124 to 126 and comment out line 139, the best-effort thread for executor_be will consume almost 100% of the processing power, while the higher-priority thread for executor_rt functions normally and consumes only a few milliseconds of CPU time in a 10-second experiment.
To start the example application with the patched executor, please use

ros2 run cbg-executor_ping-pong_cpp ping-pong io 100000 100000 1 1

My thoughts on this behavior:

  1. That is a pretty subtle behavior. So far, I have no idea how my changes to the executor can cause this. My gut feeling tells me that the cause is to be found in the rclcpp executor itself (and not in the stack below), because in the rclc executor we work with individual callback registrations without problems.
  2. It is not clear to me why the problem with the default callback group occurs only with the ping node. After all, I do not use the default callback group there either.
  3. With tracing tools, it should be possible to find out quite quickly where an executor running at 100% CPU spends its time. So far, I simply have not had the time to perform such an analysis.
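For orientation, a rough sketch of the kind of two-executor setup described above (simplified and with illustrative names; the real code lives in the cbg-executor_ping-pong example, the add_callback_group call follows the API from #1218, and the thread-priority and core-affinity handling via OS-specific calls is omitted here):

```cpp
#include <memory>
#include <thread>

#include "rclcpp/rclcpp.hpp"

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  auto ping_node = std::make_shared<rclcpp::Node>("ping");

  // Two callback groups: "RT" for real-time work, "BE" for best-effort work.
  auto cg_rt = ping_node->create_callback_group(
    rclcpp::CallbackGroupType::MutuallyExclusive);
  auto cg_be = ping_node->create_callback_group(
    rclcpp::CallbackGroupType::MutuallyExclusive);

  // ... create the ping timers / subscriptions here, assigning each one to
  //     either cg_rt or cg_be via the respective options struct ...

  // Each group goes to its own executor; each executor spins in its own thread.
  rclcpp::executors::SingleThreadedExecutor executor_rt;
  rclcpp::executors::SingleThreadedExecutor executor_be;
  executor_rt.add_callback_group(cg_rt, ping_node->get_node_base_interface());
  executor_be.add_callback_group(cg_be, ping_node->get_node_base_interface());

  std::thread thread_rt([&]() {executor_rt.spin();});
  std::thread thread_be([&]() {executor_be.spin();});
  // In the real example the RT thread additionally gets a higher scheduling
  // priority and both threads are pinned to one core (OS-specific calls).

  thread_rt.join();
  thread_be.join();
  rclcpp::shutdown();
  return 0;
}
```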

@peterpena
Contributor

peterpena commented Jun 25, 2020

Hello @ralph-lange, thank you for the thorough explanation of how to test the executor/callback-group feature. Based on your comment and suggestion, I have been looking carefully into the behavior of add_callback_group. There is one thing that might be a bug and might be related to the behavior you are describing:

With Executor::add_callback_group, the callback group is added to a map weak_groups_to_nodes together with its corresponding node. When allocator_memory_strategy::collect_entities is called, it uses this map to iterate through all the groups of the nodes in the map and checks that they are not nullptr.

This is all fine, but afterwards it calls node->get_callback_groups() on the node found in the map and adds the corresponding handles. The problem with this is that if a callback group was not added to the executor, get_callback_groups might still return a callback group that belongs to another executor. Essentially, an executor might beat another executor in collect_entities and add a handle from a callback group that does not belong to it. In the case of the bug you are describing, the higher-priority thread might be collecting the handles from the other executor and executing them, while the lower-priority thread keeps waking up and finding that it is not ready. I think that rather than iterating through get_callback_groups for each node, we only want to iterate through the groups found in weak_groups_to_nodes. I am not sure whether I am missing something or misunderstanding, but this seems like behavior we do not want. What do you think?
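For illustration, a simplified model of the suspected issue and the proposed fix (the types are illustrative stand-ins, not the actual rclcpp classes): collecting from the groups registered in weak_groups_to_nodes cannot pick up groups that belong to another executor, whereas walking each node's full group list can.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

struct CallbackGroup {std::string name;};

struct Node
{
  // All groups the node knows about, regardless of which executor owns them.
  std::vector<std::shared_ptr<CallbackGroup>> all_groups;
};

struct Executor
{
  // Groups explicitly added to *this* executor, keyed by weak pointer.
  std::map<std::weak_ptr<CallbackGroup>, std::shared_ptr<Node>,
    std::owner_less<std::weak_ptr<CallbackGroup>>> weak_groups_to_nodes;

  // Proposed behavior: iterate only the registered groups.
  std::vector<std::string> collect_entities() const
  {
    std::vector<std::string> handles;
    for (const auto & entry : weak_groups_to_nodes) {
      if (auto group = entry.first.lock()) {
        handles.push_back(group->name);
      }
      // Buggy variant (as described above): additionally walking
      // entry.second->all_groups here could also collect handles from
      // groups that were added to a different executor.
    }
    return handles;
  }
};
```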

I am still looking through the code and working on the branch. I will keep you updated with other findings.

@ralph-lange
Contributor

Thank you for reporting this finding, @peterpena, and for looking even deeper into my code. I always wanted to compare the changes for Crystal (i.e. boschresearch@08e9a1e) with the changes for Foxy (i.e. boschresearch@807b06c) side by side, but simply have not had the time so far.

@ralph-lange
Contributor

@peterpena, you were absolutely right. Removing the node->get_callback_groups() call from collect_entities fixed the problem. Thank you very much for pointing out this bug! I have just rebased the whole implementation of the callback-group-level Executor, including this fix, onto v2.0.1 of rclcpp in the branch https://github.com/boschresearch/ros2_rclcpp/tree/cbg-executor-foxy.

@ralph-lange
Contributor

Relates to PR #1218.

@peterpena
Contributor

@ralph-lange That is great to hear. Thank you for mentioning the PR here.

@wjwwood
Member Author

wjwwood commented Sep 2, 2020

Follow up issue: #1287

nnmm pushed a commit to ApexAI/rclcpp that referenced this issue Jul 9, 2022
rcl_action_process_cancel_request() will never return RCL_RET_ACTION_SERVER_TAKE_FAILED.

Signed-off-by: Jacob Perron <[email protected]>
DensoADAS pushed a commit to DensoADAS/rclcpp that referenced this issue Aug 5, 2022
Signed-off-by: Emerson Knapp <[email protected]>
@ros-discourse

This issue has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/threaded-callback-with-priority-affinity-and-overrun-handler/14977/7
