Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

destroyed nodes can cause Executor to crash #272

Closed
codebot opened this issue Nov 15, 2016 · 3 comments · Fixed by #741
Closed

destroyed nodes can cause Executor to crash #272

codebot opened this issue Nov 15, 2016 · 3 comments · Fixed by #741
Assignees
Labels
bug Something isn't working in review Waiting for review (Kanban column)

Comments

@codebot
Copy link
Member

codebot commented Nov 15, 2016

User error alert! This is probably because I'm still learning C++11 🤕 but I suspect I might not be the only person who tries to do this:

rclcpp::executors::SingleThreadedExecutor executor;
executor.add_node(std::make_shared<MyNodeSubclass>());
executor.spin();

which results in this:

terminate called after throwing an instance of 'std::system_error'
  what():  Invalid argument

with this backtrace:

Thread 1 "minimal_talker" received signal SIGABRT, Aborted.
0x00007ffff6647428 in __GI_raise (sig=sig@entry=6)
    at ../sysdeps/unix/sysv/linux/raise.c:54
54	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x00007ffff6647428 in __GI_raise (sig=sig@entry=6)
    at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007ffff664902a in __GI_abort () at abort.c:89
#2  0x00007ffff6c8084d in __gnu_cxx::__verbose_terminate_handler() ()
   from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007ffff6c7e6b6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007ffff6c7e701 in std::terminate() ()
   from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007ffff6c7e919 in __cxa_throw ()
   from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007ffff6ca77fe in std::__throw_system_error(int) ()
   from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x0000000000417d03 in std::mutex::lock() ()
#8  0x0000000000418a36 in std::lock_guard<std::mutex>::lock_guard(std::mutex&)
    ()
#9  0x00007ffff79a4ea8 in GuardCondition::attachCondition(std::mutex*, std::condition_variable*) () from /home/mquigley/ros2/install/lib/librmw_fastrtps_cpp.so
#10 0x00007ffff799d7bc in rmw_wait ()
   from /home/mquigley/ros2/install/lib/librmw_fastrtps_cpp.so
#11 0x00007ffff7767026 in rcl_wait ()
   from /home/mquigley/ros2/install/lib/librcl.so
#12 0x00007ffff7439b3e in rclcpp::executor::Executor::wait_for_work(std::chrono::duration<long, std::ratio<1l, 1000000000l> >) ()
   from /home/mquigley/ros2/install/lib/librclcpp.so
#13 0x00007ffff743ad61 in rclcpp::executor::Executor::get_next_executable(std::chrono::duration<long, std::ratio<1l, 1000000000l> >) ()
   from /home/mquigley/ros2/install/lib/librclcpp.so
#14 0x00007ffff7441f45 in rclcpp::executors::single_threaded_executor::SingleThreadedExecutor::spin() () from /home/mquigley/ros2/install/lib/librclcpp.so
#15 0x0000000000417927 in main ()

I'm not sure what the right answer is. If it's possible to somehow generate a warning and say something like "hey there! I hope you're having a gr8 day. You should probably hold a shared_ptr to that instance on your stack." But that might be hard to generate (actually I have no idea how one could generate that).

@wjwwood wjwwood added enhancement New feature or request help wanted Extra attention is needed labels Feb 9, 2017
@sagniknitr
Copy link
Contributor

Is this issue till open by now ?

@chapulina
Copy link
Contributor

While debugging ros-simulation/gazebo_ros_pkgs#860 with @wjwwood we noticed that the original problem may be another symptom from this same issue. In that case, when the node is destroyed, the executor hangs while detaching a guard condition. Here's a backtrace:

#0  0x00007ffff793610d in __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135                                                                            
#1  0x00007ffff792f023 in __GI___pthread_mutex_lock (mutex=0x555555b411d0) at ../nptl/pthread_mutex_lock.c:78                                                                   
#2  0x00007fffc51335a8 in __gthread_mutex_lock (__mutex=0x555555b411d0) at /usr/include/x86_64-linux-gnu/c++/7/bits/gthr-default.h:748                                          
#3  0x00007fffc51335a8 in std::mutex::lock() (this=0x555555b411d0) at /usr/include/c++/7/bits/std_mutex.h:103                                                                   
#4  0x00007fffc51335a8 in std::lock_guard<std::mutex>::lock_guard(std::mutex&) (__m=..., this=<synthetic pointer>) at /usr/include/c++/7/bits/std_mutex.h:162                   
#5  0x00007fffc51335a8 in GuardCondition::detachCondition() (this=0x555555b411d0) at ./src/types/guard_condition.hpp:61                                                         
#6  0x00007fffc51335a8 in rmw_fastrtps_shared_cpp::__rmw_wait(rmw_subscriptions_t*, rmw_guard_conditions_t*, rmw_services_t*, rmw_clients_t*, rmw_wait_set_t*, rmw_time_t const*
) (subscriptions=<optimized out>, guard_conditions=0x555556559110, services=<optimized out>, clients=<optimized out>, wait_set=<optimized out>, wait_timeout=<optimized out>)   
    at ./src/rmw_wait.cpp:210                                                                                                                                                   
#7  0x00007ffff55d37d9 in rcl_wait (wait_set=wait_set@entry=0x555556558ef0, timeout=timeout@entry=-1) at ./src/rcl/wait.c:549                                                   
#8  0x00007ffff71d9d07 in rclcpp::executor::Executor::wait_for_work(std::chrono::duration<long, std::ratio<1l, 1000000000l> >) (this=this@entry=0x555556558ed0, timeout=...,    
    timeout@entry=...) at ./src/rclcpp/executor.cpp:448                                                                                                                         
#9  0x00007ffff71da5c5 in rclcpp::executor::Executor::get_next_executable(rclcpp::executor::AnyExecutable&, std::chrono::duration<long, std::ratio<1l, 1000000000l> >) (this=thi
s@entry=0x555556558ed0, any_executable=..., timeout=timeout@entry=...) at ./src/rclcpp/executor.cpp:576                                                                         
#10 0x00007ffff71ddb0f in rclcpp::executors::MultiThreadedExecutor::run(unsigned long) (this=this@entry=0x555556558ed0)                                                         
    at ./src/rclcpp/executors/multi_threaded_executor.cpp:80                                                                                                                    
#11 0x00007ffff71de16d in rclcpp::executors::MultiThreadedExecutor::spin() (this=0x555556558ed0) at ./src/rclcpp/executors/multi_threaded_executor.cpp:58                       
#12 0x00007ffff7b9e07c in gazebo_ros::Executor::run() (this=0x555556558ed0) at /home/developer/gazebo_ros_pkgs/gazebo_ros/src/executor.cpp:45                                   
#13 0x00007ffff7ba36e9 in std::__invoke_impl<void, void (gazebo_ros::Executor::*&)(), gazebo_ros::Executor*&>(std::__invoke_memfun_deref, void (gazebo_ros::Executor::*&)(), gaz
ebo_ros::Executor*&) (__f=@0x555557592b08: (void (gazebo_ros::Executor::*)(gazebo_ros::Executor * const)) 0x7ffff7b9e02e <gazebo_ros::Executor::run()>, __t=@0x555557592b18: 0x5
55556558ed0) at /usr/include/c++/7/bits/invoke.h:73                                                                                                                             
#14 0x00007ffff7ba2e66 in std::__invoke<void (gazebo_ros::Executor::*&)(), gazebo_ros::Executor*&>(void (gazebo_ros::Executor::*&)(), gazebo_ros::Executor*&) (__fn=@0x555557592
b08: (void (gazebo_ros::Executor::*)(gazebo_ros::Executor * const)) 0x7ffff7b9e02e <gazebo_ros::Executor::run()>, __args#0=@0x555557592b18: 0x555556558ed0)                     
    at /usr/include/c++/7/bits/invoke.h:95                                                                                                                                      
#15 0x00007ffff7ba2661 in std::_Bind<void (gazebo_ros::Executor::*(gazebo_ros::Executor*))()>::__call<void, , 0ul>(std::tuple<>&&, std::_Index_tuple<0ul>) (this=0x555557592b08,
---Type <return> to continue, or q <return> to quit---                                                                                                                          
 __args=...) at /usr/include/c++/7/functional:467                                                                                                                               
#16 0x00007ffff7ba1a39 in std::_Bind<void (gazebo_ros::Executor::*(gazebo_ros::Executor*))()>::operator()<, void>() (this=0x555557592b08) at /usr/include/c++/7/functional:551  
#17 0x00007ffff7ba11a5 in std::__invoke_impl<void, std::_Bind<void (gazebo_ros::Executor::*(gazebo_ros::Executor*))()>>(std::__invoke_other, std::_Bind<void (gazebo_ros::Execut
or::*(gazebo_ros::Executor*))()>&&) (__f=...) at /usr/include/c++/7/bits/invoke.h:60                                                                                            
#18 0x00007ffff7ba03f0 in std::__invoke<std::_Bind<void (gazebo_ros::Executor::*(gazebo_ros::Executor*))()>>(std::_Bind<void (gazebo_ros::Executor::*(gazebo_ros::Executor*))()>
&&) (__fn=...) at /usr/include/c++/7/bits/invoke.h:95                                                                                                                           
#19 0x00007ffff7ba4ba4 in std::thread::_Invoker<std::tuple<std::_Bind<void (gazebo_ros::Executor::*(gazebo_ros::Executor*))()> > >::_M_invoke<0ul>(std::_Index_tuple<0ul>) (this
=0x555557592b08) at /usr/include/c++/7/thread:234                                                                                                                               
#20 0x00007ffff7ba49e4 in std::thread::_Invoker<std::tuple<std::_Bind<void (gazebo_ros::Executor::*(gazebo_ros::Executor*))()> > >::operator()() (this=0x555557592b08)          
    at /usr/include/c++/7/thread:243                                                                                                                                            
#21 0x00007ffff7ba472a in std::thread::_State_impl<std::thread::_Invoker<std::tuple<std::_Bind<void (gazebo_ros::Executor::*(gazebo_ros::Executor*))()> > > >::_M_run() (this=0x
555557592b00) at /usr/include/c++/7/thread:186                                                                                                                                  
#22 0x00007ffff6c7a57f in  () at /usr/lib/x86_64-linux-gnu/libstdc++.so.6                                                                                                       
#23 0x00007ffff792c6db in start_thread (arg=0x7fffc2e4c700) at pthread_create.c:463                                                                                             
#24 0x00007ffff633788f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 

@dirk-thomas dirk-thomas added bug Something isn't working and removed enhancement New feature or request help wanted Extra attention is needed labels May 24, 2019
@dirk-thomas dirk-thomas self-assigned this May 24, 2019
@dirk-thomas dirk-thomas added the in progress Actively being worked on (Kanban column) label May 24, 2019
@dirk-thomas
Copy link
Member

While this logic takes care of removing expired weak pointers from weak_nodes_:

// Clean up any invalid nodes, if they were detected
if (has_invalid_weak_nodes) {
weak_nodes_.erase(
remove_if(
weak_nodes_.begin(), weak_nodes_.end(),
[](rclcpp::node_interfaces::NodeBaseInterface::WeakPtr i)
{
return i.expired();
}
)
);
}

it doesn't remove the guard condition associated with the node though. In the remove_node method that is taken care of in this line:

memory_strategy_->remove_guard_condition(node_ptr->get_notify_guard_condition());

As a consequence the wait set allocated in rcl_wait_set_resize contains an extra guard condition which makes add_handles_to_wait_set return false.

@dirk-thomas dirk-thomas added in review Waiting for review (Kanban column) and removed in progress Actively being worked on (Kanban column) labels May 24, 2019
nnmm pushed a commit to ApexAI/rclcpp that referenced this issue Jul 9, 2022
* initialize timer with clock

* use rcl_clock_get_now

* call rcl_clock_fini at the end of each test

* modify rcl_clock_get_now to take a rcl_time_point_value_t

* update docblock

* update to pass time_point_value

* add check for NULL

* add rcl_timer_clock()

* fix style

* doc fixes

* fini clock
DensoADAS pushed a commit to DensoADAS/rclcpp that referenced this issue Aug 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working in review Waiting for review (Kanban column)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants