Fix nodes spins rather than blocking and waiting, using 100% CPU [kinetic] #1557
…tion_variable (fix ros#1343)
ros#1014 and ros#1250 introduced a backported version of boost::condition_variable, where support for steady (monotonic) clocks has been added in version 1.61. But the namespace of the backported version was not changed and the symbol names might clash with the original version. This matters because the underlying clock used for the condition_variable is set in the constructor and must be consistent with the expectations within the member functions. The compiler might choose to inline one or the other or both, and is more likely to do so for optimized Release builds. But if it does not, the symbol ends up in the symbol table of roscpp, and depending on which other libraries are linked into the process it is unpredictable which of the two versions will actually be called in the end. In case the constructor defined in `/usr/include/boost/thread/pthread/condition_variable.hpp` was called and did not configure the internal pthread condition variable for the monotonic clock, each call to the backported do_wait_until() method with a monotonic timestamp will return immediately and hence cause `CallbackQueue::callOne(timeout)` or `CallbackQueue::callAvailable(timeout)` to return immediately. This patch changes the namespace of the backported condition_variable implementation to boost_161. This removes the ambiguity with the original definition if both are used in the same process.
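The clock mismatch described in this commit message can be reproduced without Boost. The following is a minimal sketch using plain POSIX calls, assuming Linux (where `pthread_condattr_setclock()` is available) and a realtime clock that is far ahead of the monotonic clock, as is normally the case. The helper name `timed_wait_duration_ms` is hypothetical and not part of roscpp or Boost.

```cpp
#include <pthread.h>
#include <time.h>

#include <chrono>

// Hypothetical helper (not part of roscpp or Boost): waits on a pthread
// condition variable with a deadline 200 ms ahead on CLOCK_MONOTONIC and
// returns the time actually spent waiting, in milliseconds.
// If monotonic_condvar is false, the condition variable keeps its default
// clock (CLOCK_REALTIME), which mimics the mismatched constructor.
long timed_wait_duration_ms(bool monotonic_condvar) {
  pthread_condattr_t attr;
  pthread_condattr_init(&attr);
  if (monotonic_condvar) {
    pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
  }
  pthread_cond_t cond;
  pthread_cond_init(&cond, &attr);
  pthread_condattr_destroy(&attr);
  pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

  // Absolute deadline measured on the monotonic clock.
  timespec deadline;
  clock_gettime(CLOCK_MONOTONIC, &deadline);
  deadline.tv_nsec += 200 * 1000000L;
  if (deadline.tv_nsec >= 1000000000L) {
    deadline.tv_sec += 1;
    deadline.tv_nsec -= 1000000000L;
  }

  const auto start = std::chrono::steady_clock::now();
  pthread_mutex_lock(&mutex);
  int rc = 0;
  // Nobody signals the condition, so loop over spurious wakeups until the
  // wait reports a timeout (or an error).
  while (rc == 0) {
    rc = pthread_cond_timedwait(&cond, &mutex, &deadline);
  }
  pthread_mutex_unlock(&mutex);
  pthread_cond_destroy(&cond);
  pthread_mutex_destroy(&mutex);

  const auto elapsed = std::chrono::steady_clock::now() - start;
  return static_cast<long>(
      std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count());
}
```

With a matching clock, `timed_wait_duration_ms(true)` should block for roughly 200 ms. With the default realtime clock, `timed_wait_duration_ms(false)` sees a deadline that appears to lie far in the past and times out at once, which is the behaviour that turns a polling loop into a busy-wait.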
…ed timed_wait() This fixes ROS timers in combination with 2c18b9f. The timer callbacks were not called because the TimerManager's thread function blocked indefinitely on boost::condition_variable::timed_wait(). Relative timed_wait() uses the system clock (boost::get_system_time()) unconditionally to calculate the absolute timestamp for do_wait_until(). If the condition variable has been initialized with BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC, it compares this timestamp with the monotonic clock and therefore blocks. This issue has been reported in https://svn.boost.org/trac10/ticket/12728 and will not be fixed. The timed_wait interface is apparently deprecated.
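The fix relies on passing an absolute `time_point` of a known clock instead of a relative timeout that is silently converted with the system clock. A rough sketch of that idea follows, using the `std::` counterparts as a stand-in for the Boost types so that it compiles without Boost. The helper `wait_for_flag` is hypothetical and does not appear in the patch.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper, using std:: types as a stand-in for the Boost ones.
// Waits until `ready` becomes true or `timeout` has elapsed and returns the
// final value of `ready`.
bool wait_for_flag(std::condition_variable& cv, std::mutex& mutex,
                   const bool& ready, std::chrono::milliseconds timeout) {
  // Compute the absolute deadline on the steady clock...
  const std::chrono::steady_clock::time_point deadline =
      std::chrono::steady_clock::now() + timeout;
  std::unique_lock<std::mutex> lock(mutex);
  // ...and hand the time_point itself to wait_until(), so the clock it was
  // measured on is part of its type and cannot be confused with another one.
  return cv.wait_until(lock, deadline, [&ready] { return ready; });
}
```

Because the clock is encoded in the type of the deadline, the library can convert between clocks where necessary, rather than comparing a system-clock timestamp against a monotonic clock.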
Confirmed, #1343 is fixed with this patch.
Thanks a lot for digging into this!
   */
  void setPeriod(const WallDuration& period, bool reset=true);

  bool hasStarted() const { return impl_->hasStarted(); }
The corresponding `Impl::hasStarted()` still seems to be missing in steady_timer.cpp.
Thanks, fixed in c37a75b.
Maybe I never ran into any issues like this, because I'm always using Release builds and then the SteadyTimer works as expected.
Melodic (or rather Ubuntu bionic) already has Boost 1.65, so at least the problem with the backported headers should not arise there.
I overlooked the specialization of However, the code duplication is not required anymore now that the general definition of |
I guess it would actually make sense to create a separate PR for the added |
…eadFunc() in steady_timer.cpp The updated generic definition in timer_manager.h should do the same with a minor update. In all cases we can call boost::condition_variable::wait_until() with an absolute time_point of the respective clock. The conversion from system_clock to steady_clock for Time and WallTime is done internally in boost::condition_variable::wait_until(lock_type& lock, const chrono::time_point<Clock, Duration>& t).
Done: #1565
Result of running abi-compliance-checker on As expected, the type of two condition variable members has been changed, but because the memory layout did not, this should not be a problem. |
Regarding the failing unit test:
(http://build.ros.org/job/Kpr__ros_comm__ubuntu_xenial_amd64/464) I could reproduce the same failure locally when running with Ah, kinetic is missing the patch from 507299f. Are there plans to backport bug fixes from lunar or melodic to kinetic or even indigo? |
Please retarget the @meyerj Can you please check #1608 and comment how this patch compares to the other one. |
@meyerj Friendly ping.
Does not seem to like the PR with a c++14 library (unique_lock from boost/std)
This PR is against ROS Kinetic, which does not officially support C++14, if I'm not mistaken.
C++11 causes the same problems. The compatibility issue is identical.
I removed all the using namespace, and now it builds fine. |
@meyerj Another friendly ping. If this should be included in an upcoming release please consider rebasing this against |
fix namespaces
I merged the patch from @ahoarau in meyerj#1, which seems to solve the C++11/C++14 compatibility issues according to #1557 (comment). Thanks for the patch! Unfortunately I do not have the time at the moment to test this myself or to compare with the solution proposed in #1608.
I opened another PR against However, if melodic normally demands at least Boost 1.62 according to REP-3, it might not be necessary to apply the patch there and we could rather remove the Boost 1.61 backports only required for older versions completely. See my comment in the new PR. |
…n timer_manager.h Since Boost 1.67 BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC became the default if the platform supports it and the macro is not defined anymore. Instead, check for BOOST_THREAD_INTERNAL_CLOCK_IS_MONO.
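The macro check described in this commit could look roughly like the following sketch. Only the two Boost macro names come from the commit message; the constant `kInternalClockIsMonotonic` is a hypothetical name. In a real build a Boost.Thread header would have to be included before the check. It is left out here so that the sketch compiles without Boost, in which case the fallback branch is taken.

```cpp
// Assumption: in a real build, a Boost.Thread header such as
// <boost/thread/condition_variable.hpp> is included before this check.

#if defined(BOOST_THREAD_INTERNAL_CLOCK_IS_MONO)
// Boost 1.67 and later: the monotonic clock is the default where supported.
constexpr bool kInternalClockIsMonotonic = true;
#elif defined(BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC)
// Older Boost versions (or the 1.61 backport) with the opt-in macro defined.
constexpr bool kInternalClockIsMonotonic = true;
#else
// Neither macro is visible: the condition variable uses the system clock.
constexpr bool kInternalClockIsMonotonic = false;
#endif
```

Code that needs a steady wait can then branch on this single constant instead of repeating the version-dependent macro test in several places.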
…ITION_VARIABLE_HEADER macros by a typedef in internal_condition_variable.h
We can close this in favor of #1651
See #2011 for a newer patch targeting |
Fixes #1343, and other potential issues.

Most likely `SteadyTimer` never has been steady.

As commented in the original issue, there is no guarantee that the compiler will inline the calls to the `boost::condition_variable` constructor, `boost::condition_variable::timed_wait()` and `boost::condition_variable::wait_for()`, and it might or might not use the backported version from Boost 1.61 introduced in #1014. The namespace has not been changed and even within roscpp some compilation units used the patched and some used the unpatched implementation (by including or not including `boost_161_condition_variable.h`). If those methods are mixed, because one gets inlined and the other not, or because the same instance is used in different compilation units with different includes and preprocessor macros defined, the underlying pthread condition variable gets initialized with `CLOCK_MONOTONIC` but is called with system clock timestamps or vice versa. This situation ultimately leads to the 100% busy-wait problem in `CallbackQueue::callAvailable()` as reported in #1343, if the `CallbackQueue` constructor for whatever reason calls the `boost::condition_variable` constructor from a compilation unit where `BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC` was not defined.

The first patch resolves this link ambiguity by changing the namespace of the included `condition_variable` and `condition_variable_any` implementations to `boost_161`. New preprocessor macros `ROSCPP_BOOST_CONDITION_VARIABLE_HEADER` and `ROSCPP_BOOST_CONDITION_VARIABLE` have been introduced and exposed in `ros/common.h` to select the implementation when `roscpp` is compiled. Because the memory layout does not change, I would carefully claim that the patch is fully ABI compatible. I copied `libroscpp.so` compiled from a patched workspace to `/opt/ros/kinetic/lib` and used it as a drop-in replacement over the past days, without having to recompile any other package installed from debians.

The second patch, 56ecdfa, fixes an issue with ROS timers that popped up in combination with the first. The `boost::condition_variable::timed_wait()` methods are considered deprecated and use `boost::get_system_time()` (system clock) unconditionally to calculate the absolute timestamp, as called by `TimerManager::threadFunc()`. If the condition variable has been constructed as expected, with the implementation from Boost 1.61 or above and `BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC` defined, this call blocks the thread almost indefinitely and breaks timers (failing unit tests in `roscpp_tests`), which is the inverse problem. It has already been reported (https://svn.boost.org/trac10/ticket/12728) and marked as wontfix. The solution is to switch to the Boost Chrono interface as already done in #1250 for `ros::CallbackQueue`. I have no good explanation why this problem has not been triggered before. It only makes sense if the backported constructor of `boost::condition_variable` had never been called because of the order of source files in the `roscpp` library and hence `SteadyTimer` never has been fully steady (#1014). Any better ideas?

Last but not least, fdca164 applies the patch from #1464 to the other timer types and adds `WallTimer::hasStarted()` and `SteadyTimer::hasStarted()`, just for completeness and to match the three interfaces except for the time, duration and event types. (replaced by #1565)

To be absolutely certain that `TimerManager<T, D, E>` will never be instantiated outside of a compilation unit that defines `BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC` (see timer_manager.h:215), I suggest moving the implementation of the constructor and member functions to a source file `timer_manager.cpp` with explicit template instantiation for the three timer types. I tested this approach to exclude that `TimerManager<T, D, E>` and the internal condition variable are constructed elsewhere, without the definition of `BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC`. But in the end I reverted that patch because this was not the case. An alternative, but more intrusive solution would be to move the `timer_manager.h` header to the `src` folder and to not expose it at all.

Sorry for targeting `kinetic-devel` although it should probably be `melodic-devel`, but this is the version I have been working on. In general, the patch applies to all versions since kinetic (not tested yet).
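For reference, the explicit template instantiation suggested above for `TimerManager<T, D, E>` could be sketched as follows. This is a simplified illustration and not the reverted patch itself. The class `TimerManagerSketch`, its single template parameter and its members are hypothetical, and header and source file are shown in one listing so that the example is self-contained.

```cpp
#include <cstddef>
#include <vector>

// --- would live in the public header (hypothetical timer_manager_sketch.h) ---
// Only declarations are visible to users, so no other translation unit can
// instantiate the member functions with its own set of preprocessor macros.
template <class T>
class TimerManagerSketch {
 public:
  TimerManagerSketch();
  void add(const T& period);
  std::size_t size() const;

 private:
  std::vector<T> periods_;
};

// --- would live in the source file (hypothetical timer_manager_sketch.cpp) ---
template <class T>
TimerManagerSketch<T>::TimerManagerSketch() {}

template <class T>
void TimerManagerSketch<T>::add(const T& period) {
  periods_.push_back(period);
}

template <class T>
std::size_t TimerManagerSketch<T>::size() const {
  return periods_.size();
}

// Explicit instantiation definitions: all member code is generated here, in
// one translation unit, for exactly the supported types.
template class TimerManagerSketch<int>;
template class TimerManagerSketch<double>;
```

The trade-off is that users can no longer instantiate the template for their own types, which is acceptable when the set of timer types is fixed.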