I recently realized that there is a bug in the current implementation of lifecycle transitions.
However, it's unclear to me what the expected behavior should be, so I'm opening a ticket to discuss.
The problem is the following.
Let's assume that the node is currently in the active state and there is a "deactivate" transition request.

- A lifecycle transition is requested, thus calling the `change_state` function https://github.com/ros2/rclcpp/blob/master/rclcpp_lifecycle/src/lifecycle_node_interface_impl.hpp#L379-L448
- The code will then execute `rcl_lifecycle_trigger_transition_by_id`, trying to transition to the `deactivating` state.
- In the RCL function `_trigger_transition` https://github.com/ros2/rcl/blob/master/rcl_lifecycle/src/rcl_lifecycle.c#L349-L371 we update the current state `state_machine->current_state = transition->goal;` before publishing notifications.
- If the call to `rcl_publish` fails (and it can fail), the function will immediately return with an error code, and this will result in immediately aborting the `change_state` function call https://github.com/ros2/rclcpp/blob/master/rclcpp_lifecycle/src/lifecycle_node_interface_impl.hpp#L396-L401 with an error along the lines of `Unable to start transition %u from current state %s: Failed to publish`.

The problem is that, although the transition failed to start, we already updated the state machine so the node is now in the `deactivating` state.
The `on_deactivate` user callback has not been invoked.
Future requests either to activate or to deactivate will be rejected because they are not valid transitions from this state.
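
To make the "stuck" condition concrete, here is a minimal sketch of the validity check that rejects later requests. It is a simplified model, not the rcl_lifecycle code: the enum, the string-based request names, and `is_valid_request` are all hypothetical stand-ins for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified subset of the lifecycle states. */
typedef enum { STATE_ACTIVE, STATE_DEACTIVATING, STATE_INACTIVE } state_id_t;

/* Returns true if an externally requested transition is allowed from the
 * given state. Only the two requests discussed in this issue are modeled. */
bool is_valid_request(state_id_t current, const char *request)
{
  if (strcmp(request, "deactivate") == 0) {
    return current == STATE_ACTIVE;
  }
  if (strcmp(request, "activate") == 0) {
    return current == STATE_INACTIVE;
  }
  return false;
}
```

In this model, a node left in `STATE_DEACTIVATING` accepts neither request, which matches the behavior described above.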
The first easy solution I thought of was to just move the `state_machine->current_state = transition->goal;` line after publishing the notification.
In this way, if the transition really failed to start, the node would remain in the original state.
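
A rough sketch of the two orderings, side by side, might look like this. It assumes a stripped-down state machine: the types, `publish_notification` (a stand-in for `rcl_publish` with an injected failure flag), and both trigger functions are hypothetical and only mirror the shape of `_trigger_transition`.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the rcl_lifecycle types. */
typedef enum { STATE_ACTIVE, STATE_DEACTIVATING, STATE_INACTIVE } state_id_t;
typedef struct { state_id_t goal; } transition_t;
typedef struct { state_id_t current_state; } state_machine_t;

/* Stand-in for rcl_publish: succeeds unless a failure is injected. */
bool publish_notification(state_id_t from, state_id_t to, bool simulate_failure)
{
  (void)from;
  (void)to;
  return !simulate_failure;
}

/* Ordering described in this issue: the state is updated first, so a
 * failed publish leaves the machine in the goal state. */
bool trigger_update_first(state_machine_t *sm, const transition_t *t, bool fail)
{
  state_id_t start = sm->current_state;
  sm->current_state = t->goal;
  return publish_notification(start, t->goal, fail);
}

/* Proposed ordering: publish first, update the state only on success. */
bool trigger_publish_first(state_machine_t *sm, const transition_t *t, bool fail)
{
  if (!publish_notification(sm->current_state, t->goal, fail)) {
    return false;  /* current_state is left untouched */
  }
  sm->current_state = t->goal;
  return true;
}
```

With an injected publish failure, the first variant ends up in `STATE_DEACTIVATING` while the second stays in `STATE_ACTIVE`. One trade-off of the second ordering is that subscribers may see a notification slightly before the state machine reflects it.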
The same situation can also happen while finalizing the transition.
However, here we can't stay in the current state (which would be "deactivating"), and we can't even go back to the initial one, as we already invoked the user callback.
What are your thoughts on this problem?
Given the fact that transitions are made of multiple stages and that they all can fail, how should a user deal with this?
Should we always bring the node to the error state?
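
To help the discussion, here is a sketch of one possible policy, not a settled design: a failure while starting leaves the node where it was, while a failure while finalizing (after the user callback has run) sends it to an error state. All names here, including `STATE_ERROR_PROCESSING`, `stage_t`, and `deactivate`, are hypothetical and only model the three stages described above.

```c
#include <stdbool.h>

/* Hypothetical, simplified states and stages for illustration only. */
typedef enum {
  STATE_ACTIVE, STATE_DEACTIVATING, STATE_INACTIVE, STATE_ERROR_PROCESSING
} state_id_t;
typedef enum { STAGE_NONE, STAGE_START, STAGE_FINISH } stage_t;

typedef struct {
  state_id_t state;
  bool callback_invoked;  /* stands in for on_deactivate having run */
} node_t;

/* Runs a "deactivate" transition. fail_at selects the stage whose
 * notification publish is simulated to fail. Returns true only if the
 * whole transition completed. */
bool deactivate(node_t *n, stage_t fail_at)
{
  if (n->state != STATE_ACTIVE) {
    return false;  /* not a valid transition from this state */
  }
  /* Stage 1: start. Nothing has happened yet, so remain in place. */
  if (fail_at == STAGE_START) {
    return false;
  }
  n->state = STATE_DEACTIVATING;

  /* Stage 2: user callback. */
  n->callback_invoked = true;

  /* Stage 3: finish. The callback already ran, so rolling back is not
   * possible; this policy moves the node to the error state instead. */
  if (fail_at == STAGE_FINISH) {
    n->state = STATE_ERROR_PROCESSING;
    return false;
  }
  n->state = STATE_INACTIVE;
  return true;
}
```

Under this policy the node never remains in `STATE_DEACTIVATING` after a failure, so a user always has a defined state to recover from.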