Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Guard conditions: remove node from constructor, fix header #33

Merged
merged 12 commits into from
Apr 1, 2016

Conversation

jacquelinekay
Copy link
Contributor

While refactoring rclcpp to use rcl, I decided the guard conditions API had the following problems:

We shouldn't need to associate a guard condition with a node in its constructor or destructor.

The trigger function had a different name in the source and header files, causing a link error.

I'm also removing fixed guard conditions from the rmw API, so I'm changing the rmw functions called in the rcl code to reflect this.

@jacquelinekay jacquelinekay added the in progress Actively being worked on (Kanban column) label Mar 11, 2016
@jacquelinekay jacquelinekay self-assigned this Mar 11, 2016
@jacquelinekay jacquelinekay added in review Waiting for review (Kanban column) and removed in progress Actively being worked on (Kanban column) labels Mar 11, 2016
@@ -129,7 +129,7 @@ TEST_F(CLASSNAME(TestSubscriptionFixture, RMW_IMPLEMENTATION), test_subscription
// Process : test_subscription__rmw_opensplice_cpp <23524>
// Thread : main thread 7fff7342d000
// Internals : V6.4.140407OSS///v_topicNew/v_topic.c/448/21/1455157023.781423000
const char * topic = "chatter_int64";
const char * topic = "rcl_test_subscription_nominal_chatter_int64";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change in test topic? same in L188

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the topic name to be more specific to avoid cross-talk. Most tests in the test_rclcpp package also follow this convention.

@jacquelinekay
Copy link
Contributor Author

I rolled in another important change in 7bb5987: guard_condition.h declares a function "rcl_guard_condition_trigger", while guard_condition.c provides an implementation called "rcl_trigger_guard_condition".

@dirk-thomas
Copy link
Member

I think rcl_trigger_guard_condition is the better name. It starts with the namespace (rcl) followed by the verb get.

@jacquelinekay
Copy link
Contributor Author

Either one is fine with me. rcl_trigger_guard_condition also follows the rmw API call with the same name, and it also sounds more like plain English.

I thought that class name first might be a convention in rcl (many init and fini functions follow the convention of rcl_node_init and rcl_node_fini), but there's also many functions without class names (rcl_take and rcl_publish). So rcl_trigger_guard_condition should be fine.

@wjwwood
Copy link
Member

wjwwood commented Mar 13, 2016

Thanks for fixing that @jacquelinekay.

I think the headers name is better, since it can only be called on a guard condition. It is the equivalent of a class method and so I think it is appropriate to namespace it as such. Also it follows the convention of the rest of the functions like init and fini.

@wjwwood
Copy link
Member

wjwwood commented Mar 13, 2016

Well you've already changed it to match the c file, that's fine. Either is fine by me if only because it's otherwise so inconsistent.

@codebot
Copy link
Member

codebot commented Mar 15, 2016

TODO: investigate if adding/removing conditions to waitset is necessary, and adding/removing conditions

@jacquelinekay jacquelinekay changed the title Fixed guard conditions Remove fixed guard conditions Mar 17, 2016
@jacquelinekay jacquelinekay changed the title Remove fixed guard conditions Guard conditions: remove node from constructor, fix header Mar 17, 2016
@jacquelinekay
Copy link
Contributor Author

After some testing and discussion we've concluded that we don't actually need a distinction between fixed and non-fixed guard conditions.

The set of PRs associated with this one does the following:

Remove references to fixed guard conditions.

Make sure all guard conditions get reset (trigger state set back to untriggered) after rmw_wait, so that they can retrigger on each call to wait.

In rclcpp, add a guard condition to each node to wake up the waitset whenever a new waitable entity is added (this is for ros2/rclcpp#209).

Currently guard conditions are attached to the waitset before wait and detached after wait. I believe that this is not necessary and the guard conditions could be attached the DDS waitset when associated with a waitset and detached on destruction (for example, when a node is removed from an executor, its notify guard condition should be detached from the waitset for that executor). I think the same may apply for the read conditions for subscriptions, services, and clients. However, in order to keep my changes modular, I will ticket this and address it later, possibly after the rcl/rclcpp refactor.

@jacquelinekay
Copy link
Contributor Author

As a historical note, I initially added the idea of fixed guard conditions here, to fix a segfault due to shared ownership of the global ctrl-C guard condition:

ros2/rclcpp#166 (comment)

But after running the same test that originally motivated the fixed guard conditions over 200 times using these branches, I've concluded that this segfault is no longer present.

@jacquelinekay
Copy link
Contributor Author

ROS 2 Jenkins is pretty quiet right now, so I'm running a job with --repeat-until-fail 20 as a ctest arg, to check that we are avoiding making any tests flaky.
http://ci.ros2.org/job/ci_osx/884/

@dirk-thomas
Copy link
Member

If all guard conditions are always reset how will the following use case continue to work:

  • two subscribers, each provides a guard condition
  • both trigger due to a received message
  • wait wakes up and signals both ready
  • rcl* handles only one of the two incoming messages
  • goes to sleep again
  • wake doesn't wake up anymore?

@wjwwood
Copy link
Member

wjwwood commented Mar 18, 2016

Subscriptions don't have guard conditions? They are read conditions. Only guard conditions are reset after each wake up. Subscriptions can only be "reset" when you take from them.

@dirk-thomas
Copy link
Member

I see. That makes sense then. Sorry for the noise.

@wjwwood
Copy link
Member

wjwwood commented Mar 18, 2016

Apparently, I never hit send on this... wrote it like two days ago.


So @jacquelinekay and I discussed this off-line and came to some conclusions.

tl;dr We think the distinction between fixed and non-fixed guard conditions are not necessary anymore, but we're still a bit uncertain if the distinction is required in order to avoid attaching and detaching of guard conditions each loop of spin.

Fixed Guard Conditions

Fixed guard conditions exist to differentiate between guard conditions that should always be reset after waking up from rmw_wait and those that should not.

Some guard conditions only purpose is to interrupt a blocking call to rmw_wait, like the ctrl-c guard condition and the executor.cancel() guard condition. Since these guard conditions do not result in any other actions (like calling a callback), there is no good opportunity to reset them except immediately after waking up from a blocking wait. We call these "fixed" guard conditions and list them separately from all other guard conditions. We also do not detach these guard conditions from the wait set each iteration. See:

https://github.com/ros2/rmw_opensplice/blob/master/rmw_opensplice_cpp/src/rmw_wait.cpp#L96-L120

"Normal" Guard Conditions

Other guard conditions are never reset unless done so by the code outside of rmw. The are also detached and reattached in each call to rmw_wait as you can see in the above link as well. The question now is: do we ever use non-fixed guard conditions?

We use to use these normal guard conditions to implement timers, such that each timer had a thread and a guard condition. In the thread the timers would wait for a period and then trigger the guard condition. This would prevent rmw_wait from blocking until the guard condition was reset. The guard condition was reset when the executor "triggered the timer", which means it called a callback provided by the timer which would call the user's callback for the timer and then reset the guard condition, once again allowing rmw_wait to block. However, we no longer do it this way, and instead we use the timeout option to rmw_wait in combination with some bookkeeping about when the soonest timer is.

We would also probably need these normal guard conditions if we ever provided a general purpose guard condition to users via the client library API, e.g. in rclcpp. However, it's not clear to me if they should behave like fixed guard conditions (they get reset after each time rmw_wait wakes up) or as "normal" guard conditions where they remain triggered, preventing rmw_wait from blocking, until the user resets them.

@jacquelinekay and I talked about how a user might use this functionality and decided that providing both executor.cancel() and something like executor.call_later(callable) would cover any use case the user might have for a generic guard condition. The user can use executor.cancel() to interrupt a blocking call to spin_once or to cause a blocking call to spin to at least wake up and loop internally. The user can use executor.call_later(callable) to execute an arbitrary function on demand and as soon as possible (as soon as the spin call wakes up). Between these two general use cases, we couldn't come up with any other compelling reasons to have a general purpose guard condition. So if anyone has ideas that are different we'd like to hear them.

What to do

Based on the facts that we don't currently use any "normal" guard conditions in rclcpp and because we don't see an immediate need for them, even for a generic guard condition given to users, we think that we should not differentiate between different kinds of guard conditions and reset all guard conditions after waking up from spin. So basically treat all guard conditions passed through rcl and rmw as we currently treat "fixed" guard conditions.

However, we still need to look at when and why we are attaching and detaching conditions of all kinds to the DDS wait sets, and so there might be a reason hiding there that may encourage us to instead keep the distinction between fixed and non-fixed guard conditions.

Sorry for the long brain dump.

@jacquelinekay
Copy link
Contributor Author

@jacquelinekay jacquelinekay added in progress Actively being worked on (Kanban column) in review Waiting for review (Kanban column) and removed in review Waiting for review (Kanban column) in progress Actively being worked on (Kanban column) labels Mar 25, 2016
@jacquelinekay
Copy link
Contributor Author

hm. I think this might introduce some flaky tests (gtest_executor for connext and fastrtps).

@jacquelinekay jacquelinekay force-pushed the fixed_guard_conditions branch 2 times, most recently from 8525402 to b50ee33 Compare April 1, 2016 01:37
@jacquelinekay jacquelinekay merged commit 087315b into master Apr 1, 2016
@jacquelinekay jacquelinekay deleted the fixed_guard_conditions branch April 1, 2016 21:06
@jacquelinekay jacquelinekay removed the in review Waiting for review (Kanban column) label Apr 1, 2016
emersonknapp pushed a commit to aws-ros-dev/rcl that referenced this pull request Jun 3, 2019
* add calloc and zero initialized allocator function

* change string array to have proper init, hold allocator

* fix rcutils_split's use of string array, for now

* gitignore
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants