
executors should be able to operate on callback groups rather than nodes #519

Closed
wjwwood opened this issue Jul 19, 2018 · 11 comments · Fixed by #1218

@wjwwood
Member

wjwwood commented Jul 19, 2018

Right now, you can only add whole nodes to an executor, but it has come up in a few scenarios that you'd rather be able to add selected callback groups to an executor instead of the whole node. This would allow you to do things like run a separate executor for one subscription while the main executor handles everything else in the node. This is also closer to how callback queues worked in ROS 1.

Some things to consider:

  • currently the executor assumes it doesn't share nodes with other executors
  • the guard condition that interrupts an executor when new things are added (e.g. new timer) is owned by the node, but now it would need to be owned by the callback group (potentially increasing the total number of guard conditions)
  • callback groups should still only be associated with a single executor, and that logic is currently in nodes (so it will need to be moved)
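To make the proposal concrete, here is a minimal sketch of how this could look from the user side (hypothetical at the time of writing: the add_callback_group call mirrors what eventually landed via #1218, and the node, topic, and group names are made up for illustration):

```cpp
#include <memory>
#include <thread>

#include "rclcpp/rclcpp.hpp"
#include "std_msgs/msg/string.hpp"

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  auto node = std::make_shared<rclcpp::Node>("my_node");

  // Dedicated callback group for one latency-sensitive subscription.
  auto fast_group = node->create_callback_group(
    rclcpp::CallbackGroupType::MutuallyExclusive);
  rclcpp::SubscriptionOptions options;
  options.callback_group = fast_group;
  auto sub = node->create_subscription<std_msgs::msg::String>(
    "fast_topic", 10,
    [](std_msgs::msg::String::SharedPtr /*msg*/) {/* handle message */},
    options);
  (void)sub;  // keep the subscription alive

  // A separate executor spins only this callback group in its own thread ...
  rclcpp::executors::SingleThreadedExecutor fast_executor;
  fast_executor.add_callback_group(fast_group, node->get_node_base_interface());
  std::thread fast_thread([&]() {fast_executor.spin();});

  // ... while the main executor handles everything else in the node.
  rclcpp::executors::SingleThreadedExecutor main_executor;
  main_executor.add_node(node);
  main_executor.spin();

  fast_thread.join();
  rclcpp::shutdown();
  return 0;
}
```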
@wjwwood wjwwood added the enhancement New feature or request label Jul 19, 2018
yechun1 added a commit to yechun1/geometry2 that referenced this issue Oct 31, 2018
TODO:
   - Could not use CallbackGroups as a replacement for CallbackQueues in ROS 2; this requires ros2/rclcpp#519, after which callback groups plus executors could be used like a callback queue in ROS 1.
   - Could not replace boost::signals2::signal in ROS 2 as there is no std equivalent; the design needs to be changed to not use boost::signals2::signal.

Workaround:
   - C++11 does not support std::shared_lock; std::unique_lock is used as a workaround.
   - There is no replacement for ros::WallTime::now in ROS 2; tf2::get_now is used instead.
   - C++11 has no std::upgrade_lock or upgrade_to_unique_lock; std::unique_lock is used as a workaround.

Signed-off-by: Chris Ye <[email protected]>
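As a side note on the std::shared_lock workaround mentioned in the commit message above, here is a minimal, self-contained sketch of the pattern (the FrameCache class is hypothetical, not taken from geometry2): under C++11, readers fall back to std::unique_lock on a plain std::mutex, trading reader concurrency for portability.

```cpp
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical cache used only to illustrate the workaround; not from tf2.
class FrameCache
{
public:
  void insert(const std::string & key, double value)
  {
    std::unique_lock<std::mutex> lock(mutex_);  // writer: exclusive lock
    data_[key] = value;
  }

  double lookup(const std::string & key) const
  {
    // With C++14/17 one would take a std::shared_lock on a shared mutex here;
    // C++11 has neither, so readers also take a unique_lock (the workaround).
    std::unique_lock<std::mutex> lock(mutex_);
    auto it = data_.find(key);
    return it == data_.end() ? 0.0 : it->second;
  }

private:
  mutable std::mutex mutex_;
  std::unordered_map<std::string, double> data_;
};
```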
@ralph-lange
Contributor

I would be happy to make a PR for the callback-group-level Executor presented at ROSCon 2018. Recently, I updated the implementation to Crystal, cf. https://github.com/boschresearch/ros2_rclcpp/commits/cbg-executor-0.6.1

However, this is only a prototypical implementation, i.e. such a PR could only be the basis for a discussion of the necessary refactoring. Also, in the last few weeks, I have come to the conclusion that, as a first step, it is even more important to improve the predictability of the current implementation than to refine the Executor API. The behavior of the Executor deviates substantially from the ROS 1 queueing mechanism, which is basically FIFO (with some exceptions in case of buffer overflows). In the ROS 2 Executor class, the get_next_ready_executable function (https://github.com/ros2/rclcpp/blob/master/rclcpp/src/rclcpp/executor.cpp#L534) prioritizes timers over subscriptions over service invocations and so on. Inside the collect_entities function of the memory strategy (https://github.com/ros2/rclcpp/blob/master/rclcpp/include/rclcpp/strategies/allocator_memory_strategy.hpp#L153), the order seems to depend on the insertion order of the nodes rather than on the arrival order at the middleware layer. A colleague of mine recently developed a formal scheduling model for the current mechanism; I'll include him in the discussion in the coming days.

In my opinion, once we have a clear understanding of the current mechanism (which goes down to the middleware), we should first develop a design for the intended model(s) before we discuss an improved Executor API and granularity.
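To make the described prioritization concrete, here is a simplified toy model (illustrative code only, not the actual rclcpp implementation): ready work is taken in a fixed order by entity type, so e.g. a ready timer is always served before a ready subscription, regardless of arrival order.

```cpp
#include <string>
#include <vector>

// Toy model: each vector holds the names of the currently ready entities
// of one type, in arrival order.
struct ReadySets
{
  std::vector<std::string> timers, subscriptions, services, clients, waitables;
};

// Returns the next executable to run, or an empty string if nothing is ready.
// Fixed priority by type: timers > subscriptions > services > clients > waitables.
std::string get_next_ready_executable(ReadySets & ready)
{
  for (auto * set : {&ready.timers, &ready.subscriptions, &ready.services,
      &ready.clients, &ready.waitables})
  {
    if (!set->empty()) {
      std::string next = set->front();
      set->erase(set->begin());
      return next;
    }
  }
  return "";  // nothing ready; note this is not a FIFO over all types as in ROS 1
}
```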

@peterpena
Contributor

@ralph-lange I would like to work on your callback-group-level Executor in the cbg-executor-foxy branch so that it can be merged upstream, if that is fine with you.

@ralph-lange
Contributor

That's of course fine with me, and I'm happy to support you.

@ralph-lange
Contributor

@peterpena, please note that the current port to Foxy is not yet working properly. Somewhere in the rclcpp executor or in my callback-group extension of it, a very subtle issue must have crept in:
In my example, I have two nodes (Ping and Pong), each with two callback groups "RT" and "BE" (for real-time and best-effort). These are assigned to two different executor instances, executor_be and executor_rt, which run together on one core. In the current version, the higher-priority thread for executor_rt takes up 100% of the CPU, so executor_be does not get any CPU time at all.
The high CPU load occurs when not all callback groups of a node are assigned to the same executor instance. In my example, for the ping node this includes the (actually unused) default callback group. Therefore, I added it explicitly: https://github.com/boschresearch/ros2_examples/commit/06d1f5960a112255a622d655dbdbc62e7d0f94ad
Actually, this is needed in the ping node only, so I removed it from the pong node: https://github.com/boschresearch/ros2_examples/commit/281f7eb9f8d8f61bfb0c9cd73a449eded152e42b

You can try it out yourself by playing around with the five lines starting at https://github.com/boschresearch/ros2_examples/blob/cbg-executor-foxy/rclcpp/cbg-executor_ping-pong/main.cpp#L124.
If you use only one executor in lines 124 to 126 and 139 to 140, then it works. If you use different executors (as in the current implementation) or comment out a line, the thread of that executor will suddenly consume all available CPU power.
For example, if you consistently use executor_rt in lines 124 to 126 and comment out line 139, the best-effort thread for executor_be will consume almost 100% of the processing power, while the higher-priority thread for executor_rt functions normally and consumes only a few milliseconds of CPU time in a 10-second experiment.
To start the example application with the patched executor, please use

ros2 run cbg-executor_ping-pong_cpp ping-pong io 100000 100000 1 1

My thoughts on this behavior:

  1. That is a pretty subtle behavior. So far, I have no idea how my changes to the executor can cause this. My gut feeling tells me that the cause is to be found in the rclcpp executor itself (and not in the stack below), because in the rclc executor we work with individual callback registrations without problems.
  2. It is not clear to me why the problem with the default callback group occurs only with the ping node. After all, I do not use the default callback group there either.
  3. With tracing tools, it should be possible to find out quite quickly where an executor running at 100% CPU spends its time. So far, I simply have not had the time to perform such an analysis.
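For orientation, a rough sketch of the kind of two-executor setup described above (simplified and with illustrative names; the real code lives in the cbg-executor_ping-pong example, the add_callback_group call follows the API from #1218, and the thread-priority and core-affinity handling via OS-specific calls is omitted here):

```cpp
#include <memory>
#include <thread>

#include "rclcpp/rclcpp.hpp"

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  auto ping_node = std::make_shared<rclcpp::Node>("ping");

  // Two callback groups: "RT" for real-time work, "BE" for best-effort work.
  auto cg_rt = ping_node->create_callback_group(
    rclcpp::CallbackGroupType::MutuallyExclusive);
  auto cg_be = ping_node->create_callback_group(
    rclcpp::CallbackGroupType::MutuallyExclusive);

  // ... create the ping timers / subscriptions here, assigning each one to
  //     either cg_rt or cg_be via the respective options struct ...

  // Each group goes to its own executor; each executor spins in its own thread.
  rclcpp::executors::SingleThreadedExecutor executor_rt;
  rclcpp::executors::SingleThreadedExecutor executor_be;
  executor_rt.add_callback_group(cg_rt, ping_node->get_node_base_interface());
  executor_be.add_callback_group(cg_be, ping_node->get_node_base_interface());

  std::thread thread_rt([&]() {executor_rt.spin();});
  std::thread thread_be([&]() {executor_be.spin();});
  // In the real example the RT thread additionally gets a higher scheduling
  // priority and both threads are pinned to one core (OS-specific calls).

  thread_rt.join();
  thread_be.join();
  rclcpp::shutdown();
  return 0;
}
```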

@peterpena
Contributor

peterpena commented Jun 25, 2020

Hello @ralph-lange, thank you for the thorough explanation of how to test the executor/callback-group feature. Based on your comment and suggestion, I have been looking carefully into the behavior of add_callback_group. There is one thing that might be a bug and might be related to the behavior you are describing:

With Executor::add_callback_group, the callback group is added to a map weak_groups_to_nodes together with its corresponding node. When allocator_memory_strategy::collect_entities is called, it uses this map to iterate through all the groups of the nodes in the map and checks that they are not nullptr.

This is all fine, but afterwards it calls node->get_callback_groups() on the node found in the map and adds the corresponding handles. The problem with this is that if a callback group was not added to the executor, get_callback_groups might still return a callback group that belongs to another executor. Essentially, an executor might beat another executor in collect_entities and add a handle from a callback group that does not belong to it. In the case of the bug you are describing, the higher-priority thread might be collecting the handles from the other executor and executing them, while the lower-priority thread keeps waking up and finding that it is not ready. I think that rather than iterating through get_callback_groups for each node, we only want to iterate through the groups found in weak_groups_to_nodes. I am not sure whether I am missing something or misunderstanding, but this seems like behavior we do not want. What do you think?
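For illustration, a simplified model of the suspected issue and the proposed fix (the types are illustrative stand-ins, not the actual rclcpp classes): collecting from the groups registered in weak_groups_to_nodes cannot pick up groups that belong to another executor, whereas walking each node's full group list can.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

struct CallbackGroup {std::string name;};

struct Node
{
  // All groups the node knows about, regardless of which executor owns them.
  std::vector<std::shared_ptr<CallbackGroup>> all_groups;
};

struct Executor
{
  // Groups explicitly added to *this* executor, keyed by weak pointer.
  std::map<std::weak_ptr<CallbackGroup>, std::shared_ptr<Node>,
    std::owner_less<std::weak_ptr<CallbackGroup>>> weak_groups_to_nodes;

  // Proposed behavior: iterate only the registered groups.
  std::vector<std::string> collect_entities() const
  {
    std::vector<std::string> handles;
    for (const auto & entry : weak_groups_to_nodes) {
      if (auto group = entry.first.lock()) {
        handles.push_back(group->name);
      }
      // Buggy variant (as described above): additionally walking
      // entry.second->all_groups here could also collect handles from
      // groups that were added to a different executor.
    }
    return handles;
  }
};
```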

I am still looking through the code and working on the branch. I will keep you updated with other findings.

@ralph-lange
Contributor

Thank you for reporting this finding, @peterpena, and for looking even deeper into my code. I always wanted to compare the changes for Crystal (i.e. boschresearch@08e9a1e) with the changes for Foxy (i.e. boschresearch@807b06c) side by side, but simply have not had the time so far.

@ralph-lange
Contributor

@peterpena, you were absolutely right. Removing the node->get_callback_groups() call from collect_entities fixed the problem. Thank you very much for pointing out this bug! I have just rebased the whole implementation of the callback-group-level Executor, including this fix, onto v2.0.1 of rclcpp in the branch https://github.com/boschresearch/ros2_rclcpp/tree/cbg-executor-foxy.

@ralph-lange
Contributor

Relates to PR #1218.

@peterpena
Contributor

@ralph-lange That is great to hear. Thank you for mentioning the PR here.

@wjwwood
Member Author

wjwwood commented Sep 2, 2020

Follow up issue: #1287

nnmm pushed a commit to ApexAI/rclcpp that referenced this issue Jul 9, 2022
rcl_action_process_cancel_request() will never return RCL_RET_ACTION_SERVER_TAKE_FAILED.

Signed-off-by: Jacob Perron <[email protected]>
DensoADAS pushed a commit to DensoADAS/rclcpp that referenced this issue Aug 5, 2022
Signed-off-by: Emerson Knapp <[email protected]>
@ros-discourse

This issue has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/threaded-callback-with-priority-affinity-and-overrun-handler/14977/7
