-
Notifications
You must be signed in to change notification settings - Fork 422
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lifecyle node and C state_machine #265
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be architecturally nicer if the state machine itself could be used to trigger the appropriate callbacks in the node. Then the external API would just have to trigger a transition from the current state, and leave the rest up to the machinery of the (generic) state machine implementation. This would simplify the implementation of the managed lifecycle node class.
* Pure virtual functions as defined in | ||
* http://design.ros2.org/articles/node_lifecycle.html | ||
*/ | ||
class NodeInterface |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not give the interface a decent name rather than putting it in a namespace? How about something like "ManagedLifecycleNodeInterface"?
/*static*/ const rcl_state_t LIFECYCLE_EXPORT rcl_state_inactive = {/*.state = */1, /*.label = */"inactive"}; | ||
/*static*/ const rcl_state_t LIFECYCLE_EXPORT rcl_state_active = {/*.state = */2, /*.label = */"active"}; | ||
/*static*/ const rcl_state_t LIFECYCLE_EXPORT rcl_state_finalized = {/*.state = */3, /*.label = */"finalized"}; | ||
/*static*/ const rcl_state_t LIFECYCLE_EXPORT rcl_state_error = {/*.state = */4, /*.label = */"error"}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where are the following states:
CleaningUp
Configuring
Activating
Deactivating
ShuttingDown
ErrorProcessing
appears to have been renamed to error
. Not a problem, but we should update the design document to match.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only capture the 4 primary states. But yes, you're right. The remaining transition states will be added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you could argue that the transition states don't need to be actual states. The implementation of the state machine just needs to make sure the correct functions are called as it transitions between the "main" states.
However, having them as actual states would be useful for debugging, and it might make it easier to understand the execution time usage of a managed node that has real-time characteristics.
typedef struct LIFECYCLE_EXPORT _index | ||
{ | ||
unsigned int index; | ||
const char* label; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Label of what? The state? If we have the state index why do we need the label as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
label
here was meant to be a human readable string such as "active" or "inactive".
* @brief LifecycleNode as child class of rclcpp Node | ||
* has lifecycle nodeinterface for configuring this node. | ||
*/ | ||
class LifecycleNode : public rclcpp::node::Node, public lifecycle_interface::NodeInterface |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is starting to show the problem of not having a Node
interface. "Favor 'object composition' over 'class inheritance'" is often thrown around and abused, but C++ best practice generally says to not inherit implementations except for minor tweaks to behaviour. It may work as is for this case, but for a major entity like Node, having an abstract interface would be more flexible, and would make things more testable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. The node
implementation needs a refreshing. @wjwwood is currently working on a refactoring which addresses the things you mentioned such as composition vs. inheritance.
} | ||
|
||
LIFECYCLE_EXPORT | ||
~LifecycleNode(){} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
= default
?
{}; | ||
|
||
template<class T> | ||
struct is_manageable_node<T, typename std::enable_if< has_on_activate<T>::value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Big yes to doing things this way!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That will work on compile time. However, a base class is needed at some point for dynamic runtime loading of nodes. See ros2/demos#84 as an example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The two ideas are not incompatible.
|
||
LIFECYCLE_EXPORT | ||
virtual bool | ||
on_configure() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not the right name for the external API for controlling the state machine.
This implementation mixes life cycle control (when to configure) and node-specific life cycle functionality (such as what to do in the Configuring
transition state/onConfigure()
callback) into a single function.
This function should be called configure
and there should be a separate method, implemented by users who inherit from this class to define their own managed nodes, called onConfigure
. In that function will go all their configuring stuff. In this function will go the life cycle management stuff (which is here already) and any configuration that should be done for every node.
This comment applies to the other transition functions as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I had in mind here is that these on_*
functions can be overridden by users in order to take in place their own implementation. What's generally missing here is an external entry point (such as a LifecycleManager
or similar) which handles commands such as configure
and then calls on_configure
on the respective lifecycle node.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the external API is what is missing. Your prototype's example gives the impression that you are treating the external and internal APIs the same.
I think that there should be a node interface that defines the external API. When the node is registered with a LIfecycleManager
or something like that, then it would work with that API. A user could also manage their node's lifecycle directly from their main function, if they wish. Simultaneously, there needs to be a base class with virtual void methods (on_activate
etc) for the implementer of a specific node to override to provide their own functionality.
If you like I can write up a short example, but I think you probably know what I mean and are already working in that direction.
|
||
lc_node->print_state_machine(); | ||
|
||
if (!lc_node->on_configure()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The external API of a managed lifecycle node for controlling transitions between these states is not the callbacks. There should be an API containing the 7 transitions mentioned in the design document (create
, configure
, etc.) When, for example, configure()
is called on the node, it should execute the on_configure()
callback and shift its state to the appropriate successor state based on the result.
See also my comment in the managed lifecycle node implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no external API at this point. I put the callback calls in the talker example for demonstration. I see the talker code here as a lifecycle manager or similar which itself then has a public external API for triggering configure, activate, etc. (also via service calls)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, perhaps "talker" is not a good name for the demonstratation class.
|
||
lc_node->print_state_machine(); | ||
|
||
if (!lc_node->on_configure()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good if we can minimise how much the user has to manually control the node's life cycle. Possibly that will come through another API layered on top, which this talker example should be modified to use when it's done.
* possible transitions registered with this | ||
* state machine. | ||
*/ | ||
typedef struct LIFECYCLE_EXPORT _rcl_state_machine_t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you intend to implement managed lifecycle nodes in rcl as well?
If not, what is the purpose of writing the state machine in C rather than C++, or rather than using something like Boost's state machine library?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I indeed wrote this in C for multilanguage purposes such that every rcl* has the same set of states and valid transitions.
I believe ROS2 generally tries not to rely on boost.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no opinion on using or avoiding Boost, and I haven't actually looked at the Boost state machine library in quite a while. I just thought I would mention it as an option.
Anyway, I agree with having it in C so other languages can re-use the managed lifecycle as-is.
276a1c6
to
d092184
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the way this is going!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the value in the state machine implementation differentiating between states with on-going actions and those once-only states that occur between the on-going states? From a FSM point of view, there is no difference (one requires a trigger to leave, the other leaves as soon as it has done some particular processing). I think it would simplify the implementation if we did not differentiate between the two, except where we define the triggers that allow the machinery to move to the next state.
So "active" would wait for the triggers "error" or "deactivate" before leaving, but "deactivating" would run the on_deactivate
function and then immediately transition to "error" or "inactive" based on the result.
) | ||
|
||
macro(targets) | ||
if(NOT target_suffix STREQUAL "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This condition is not necessary and doesn't do what you would expect. The condition is basically always true since target_suffix
is never an empty string but undefined in that case. Just always calling get_rclcpp_information
is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
direct copy from https://github.com/ros2/examples/blob/master/rclcpp_examples/CMakeLists.txt#L30
we should generally define a CMakeLists.txt as an example.
install(TARGETS rcl_lifecycle | ||
ARCHIVE DESTINATION lib | ||
LIBRARY DESTINATION lib | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs to install to bin
on Windows / for RUNTIME DESTINATION
.
Maybe we should define the ROS-level interface of the managed node / the lifecycle manager. Similar to what is defined in rcl_interfaces for the parameters. |
@gbiggs I try to replicate basically what's depicted here: http://design.ros2.org/articles/node_lifecycle.html @dirk-thomas Do you propose to generate the c-style structs via message generation? |
I was thinking about the services to control the state machine as well as the events / messages when the state changes. Kind of the external API used for introspection / orchestration. I am not sure if it makes sense for the structs. |
//#define bool int; | ||
// #ifdef __cplusplus | ||
// #error WRONG COMPILER | ||
// #endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about these comments?
//loop_rate.sleep(); | ||
++i; | ||
} | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about these comments?
The managed nodes design document only has an ambiguous description of the management interface. I had done some work on a more detailed description but it got put on hold when Tully said he was going to do the node lifecycle implementation soon (because I wanted to see if anything changed). So I dug that work up and finished it off: ros2/design#99 |
@wjwwood I'd like to bring your attention to the last commit 58ae127. Does this go along with what we discussed? Therefore, I'd separate the node into a Then please note that in the current |
58ae127
to
e86473e
Compare
e86473e looks like the right direction to me. |
See lifecycle_talker.cpp as a demo on how a lifecycle orchestration would look like. Status so far:
TODO:
Any thoughts on the latter? |
23783e4
to
4f07abc
Compare
This PR requires #279 to be merged.
|
as discussed offline, we need to figure out why the Could you run CI on this to see if the same behavior shows up on Connext? |
2b42d73
to
b371044
Compare
I refactored this initial PR and extract 3 more dependent PRs on it. lifecycle_msgs: ros2/common_interfaces#24 further depends on two fixes: |
9f4cb84
to
456baa0
Compare
change exception message for windows ci bug
436514a
to
49e9ed3
Compare
* [rcl_lifecycle] remove rosidl deps as this package doesnt generate any messages * depend on rosidl_generator_c
* Update build and test workflow * Update `setup-ros` to 0.0.13 * Update `action-ros-ci` to 0.0.13 Signed-off-by: Zachary Michaels <[email protected]>
up for design review!
Not crustified yet. Not tested yet (except the demo). First prototype of a lifecycle node demo.
rcl_lifecycle
contains a C based state machine with primary states and transitions as depicted in design.ros2.org.rclcpp_lifecycle
contains the higher level c++ implementation of a managed node. A managed node has arcl_state_machine_t
object as member for keeping track of the state and possible transitions. Further, it has the same API as arclcpp::Node
for creating publishers etc. However, each managed node has a handle to all pub/sub/srv/clients in order to activate/deactivate them.lifecycle_talker.cpp
is a first demo application. Missing here is any kind of container which opens services for calling the transition functions such ason_configure()
oron_activate()
, etc.First commits are equal to #258