Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use an internal implementation of boost::condition_variable with monotonic clock #1932

Merged
merged 3 commits into from
Jul 28, 2020

Conversation

meyerj
Copy link
Contributor

@meyerj meyerj commented Apr 15, 2020

Follow-up on and partially revert #1878:

Unfortunately @ahoarau was right in #1878 (comment): The patch did not solve the issue with busy-wait spinning or non-functional timers completely, because the definition of boost::condition_variable() provided by different compilation units might still have been compiled with non-matching settings of BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC. So even with the patch the eventually conflicting definitions of the symbol is an ODR violation and can cause undefined behavior, depending on the compiler internals and flags.

For example, libtf2_ros.so from latest ros-melodic-tf2-ros provides a weak symbol boost::condition_variable() which was compiled without the macro BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC. Therefore, as soon as libtf2_ros.so is linked into the process all condition variables are constructed with the standard system clock as their internal clock. The wait calls within roscpp, e.g. for timers, which expect that condition variables are configured with CLOCK_MONOTONIC internally, become no-ops then.

While the problem has finally been solved in Boost 1.67 and the remaining checks for older Boost versions have been completely removed in #1903 for noetic, I propose to partially revert #1878 for melodic and older, and to re-add a custom stripped down version of boost::condition_variable as ros::internal::condition_variable_monotonic. This is more or less what I proposed in my earlier pull request #1651, but rebased on the latest melodic-devel branch and simplified. The copy uses a different namespace and does not respect BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC. It uses the monotonic clock unconditionally. This custom implementation is only selected for platforms that support it and if BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC or BOOST_THREAD_INTERNAL_CLOCK_IS_MONO are not already defined by Boost itself (and therefore would be effective for all compilation units). Otherwise ros::internal::condition_variable_monotonic is an alias for boost::condition_variable:

#if !defined(BOOST_THREAD_PLATFORM_PTHREAD) || \
    defined(BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC) || \
    defined(BOOST_THREAD_INTERNAL_CLOCK_IS_MONO)
using condition_variable_monotonic = boost::condition_variable;

#else

class condition_variable_monotonic {
// ...
};
#endif

@dirk-thomas
Copy link
Member

@ros-pull-request-builder retest this please

@dirk-thomas
Copy link
Member

@meyerj I am fine merging this patch. But I wanted to double check how it relates to #1684 and if both are trying to address the same problem? If yes, is merging one of the patches sufficient and if that is the case which one should be preferred and why?

@dirk-thomas dirk-thomas merged commit f781def into ros:melodic-devel Jul 28, 2020
@dirk-thomas dirk-thomas mentioned this pull request Jul 28, 2020
meyerj added a commit to meyerj/ros_comm that referenced this pull request Jul 31, 2020
…tonic clock (ros#1932)

* Use an internal implementation of condition_variable with monotonic clock to avoid ODR violation

* Fix build of ros/internal/condition_variable.h for Boost <1.65

* Fix "static_assert with no message is a C++17 extension" warning in ros/internal/condition_variable.h
dirk-thomas added a commit that referenced this pull request Aug 3, 2020
…tonic clock [kinetic-devel] (#2011)

* Drop custom implementation of boost::condition_variable to fix busy-wait spinning (#1878)

* roscpp: fix potential busy-wait loop caused by backported Boost condition_variable (fix #1343)

#1014 and #1250 introduced a backported
version of boost::condition_variable, where support for steady (monotonic) clocks has been added in version 1.61.
But the namespace of the backported version was not changed and the symbol names might clash with the original
version.

Because the underlying clock used for the condition_variable is set in the constructor and must be
consistent with the the expectations within member variables. The compiler might choose to inline one or the
other or both, and is more likely to do so for optimized Release builds. But if it does not, the symbol ends
up in the symbol table of roscpp and depending on which other libraries will be linked into the process it
is unpredictable which of the two versions will be actually called at the end. In case the constructor defined
in `/usr/include/boost/thread/pthread/condition_variable.hpp` was called and did not configure the internal
pthread condition variable for monotonic clock, each call to the backported do_wait_until() method with a
monotonic timestamp will return immediately and hence causes `CallbackQueue::callOne(timeout)` or
`CallbackQueue::callAvailable(timeout)` to return immediately.

This patch changes the namespace of the backported condition_variable implementation to boost_161. This
removes the ambiguity with the original definition if both are used in the same process.

* roscpp: use boost::condition_variable::wait_for() instead of deprecated timed_wait()

This fixes ROS timers in combination with 2c18b9f. The timer
callbacks were not called because the TimerManager's thread function blocked indefinitely on
boost::condition_variable::timed_wait().

Relative timed_wait() uses the system clock (boost::get_system_time()) unconditionally to
calculate the absolute timestamp for do_wait_until(). If the condition variable has been
initialized with BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC, it compares this timestamp
with the monotonic clock and therefore blocks.

This issue has been reported in https://svn.boost.org/trac10/ticket/12728 and will not be
fixed. The timed_wait interface is apparently deprecated.

* roscpp: do not use boost_161_condition_variable.h on Windows (untested)

* roscpp: remove specialized implementation of TimerManager<T,D,E>::threadFunc() in steady_timer.cpp

The updated generic definition in timer_manager.h should do the same with a minor update.
In all cases we can call boost::condition_variable::wait_until() with an absolute time_point of the respective clock.
The conversion from system_clock to steady_clock for Time and WallTime is done internally in
boost::condition_variable::wait_until(lock_type& lock, const chrono::time_point<Clock, Duration>& t).

* fix namespaces

* add more explicit namespaces

* add missing ns

* roscpp: fixed Boost version check in CMakeLists.txt

find_package(Boost) has to come before checking the Boost version.
Otherwise BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC was not defined which
triggered the assertion in timer_manager.h:240.

Since Boost 1.67 BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC became the default
if the platform supports it and the macro is not defined anymore. Instead, check
for BOOST_THREAD_INTERNAL_CLOCK_IS_MONO.

* roscpp: replace ROSCPP_BOOST_CONDITION_VARIABLE and ROSCPP_BOOST_CONDITION_VARIABLE_HEADER macros by a typedef in internal_condition_variable.h

* Remove copy of boost::condition_variable implementation from Boost 1.61 in namespace boost_161

* Revert some changes in include directives and in CMakeLists.txt to minimize the diff to melodic-devel

Addresses #1878 (review).

* use wait_for(), remove TimerManagerTraits

* Revert "use wait_for(), remove TimerManagerTraits"

This reverts commit 2a67cf6.

Co-authored-by: Antoine Hoarau <[email protected]>
Co-authored-by: Dirk Thomas <[email protected]>

* Use an internal implementation of boost::condition_variable with monotonic clock (#1932)

* Use an internal implementation of condition_variable with monotonic clock to avoid ODR violation

* Fix build of ros/internal/condition_variable.h for Boost <1.65

* Fix "static_assert with no message is a C++17 extension" warning in ros/internal/condition_variable.h

* roscpp: fix ros/internal/condition_variable.h for pre-C++11 compilers

Co-authored-by: Antoine Hoarau <[email protected]>
Co-authored-by: Dirk Thomas <[email protected]>
@wolfv
Copy link

wolfv commented Aug 4, 2020

I think this does not compile on OS X unfortunately: http://robostack.net:8000/RoboStack/vinca/94/2/2

@wolfv
Copy link

wolfv commented Aug 4, 2020

In file included from $SRC_DIR/ros-melodic-roscpp/src/work/include/ros/rosout_appender.h:41:
19518 | $SRC_DIR/ros-melodic-roscpp/src/work/include/ros/internal/condition_variable.h:71:7: error: use of undeclared identifier 'pthread_condattr_setclock'; did you mean 'pthread_condattr_setpshared'?
19519 | pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
19520 | ^~~~~~~~~~~~~~~~~~~~~~~~~
19521 | pthread_condattr_setpshared
19522 | /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk/usr/include/pthread.h:328:5: note: 'pthread_condattr_setpshared' declared here
19523 | int pthread_condattr_setpshared(pthread_condattr_t *, int);
19524 | ^
19525 | In file included from $SRC_DIR/ros-melodic-roscpp/src/work/src/libros/subscriber.cpp:30:
19526 | In file included from $SRC_DIR/ros-melodic-roscpp/src/work/include/ros/topic_manager.h:34:
19527 | In file included from $SRC_DIR/ros-melodic-roscpp/src/work/include/ros/rosout_appender.h:41:
19528 | $SRC_DIR/ros-melodic-roscpp/src/work/include/ros/internal/condition_variable.h:142:34: error: no member named 'to_timespec' in namespace 'boost::detail'
19529 | timespec ts = boost::detail::to_timespec(d);
19530 |  
19531 | ~~~~~~~~~~~~~~~^
19532 | $SRC_DIR/ros-melodic-roscpp/src/work/include/ros/internal/condition_variable.h:160:30: error: no member named 'check' in 'boost::detail::interruption_checker'
19533 | check_for_interruption.check();
19534 |  
19535 | ~~~~~~~~~~~~~~~~~~~~~~ ^
19536 | $SRC_DIR/ros-melodic-roscpp/src/work/include/ros/internal/condition_variable.h:190:30: error: no member named 'check' in 'boost::detail::interruption_checker'
19537 | check_for_interruption.check();
19538 | ~~~~~~~~~~~~~~~~~~~~~~ ^
19539 | 4 errors generated.
19540 | [32/77] Building CXX object CMakeFiles/roscpp.dir/src/libros/master.cpp.o
19541 | [33/77] Building CXX object CMakeFiles/roscpp.dir/src/libros/service_publication.cpp.o
19542 | [34/77] Building CXX object CMakeFiles/roscpp.dir/src/libros/wall_timer.cpp.o
19543 | FAILED: CMakeFiles/roscpp.dir/src/libros/wall_timer.cpp.o
19544 |  
19545 | $BUILD_PREFIX/bin/x86_64-apple-darwin13.4.0-clang++ -DBOOST_ALL_NO_LIB -DBOOST_CHRONO_DYN_LINK -DBOOST_FILESYSTEM_DYN_LINK -DBOOST_SYSTEM_DYN_LINK -DROSCONSOLE_BACKEND_LOG4CXX -DROS_BUILD_SHARED_LIBS=1 -DROS_PACKAGE_NAME=\"roscpp\" -Droscpp_EXPORTS -Idevel/include -I$SRC_DIR/ros-melodic-roscpp/src/work/include -Idevel/include/ros -I$PREFIX/share/xmlrpcpp/cmake/../../../include/xmlrpcpp -I. -march=core2 -mtune=haswell -mssse3 -ftree-vectorize -fPIC -fPIE -fstack-protector-strong -O2 -pipe -stdlib=libc++ -fvisibility-inlines-hidden -std=c++14 -fmessage-length=0 -isystem $PREFIX/include -fdebug-prefix-map=$SRC_DIR=/usr/local/src/conda/ros-melodic-roscpp-1.14.7 -fdebug-prefix-map=$PREFIX=/usr/local/src/conda-prefix -DBOOST_ERROR_CODE_HEADER_ONLY -O3 -DNDEBUG -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk -mmacosx-version-min=10.9 -fPIC -Wall -Wextra -MD -MT CMakeFiles/roscpp.dir/src/libros/wall_timer.cpp.o -MF CMakeFiles/roscpp.dir/src/libros/wall_timer.cpp.o.d -o CMakeFiles/roscpp.dir/src/libros/wall_timer.cpp.o -c $SRC_DIR/ros-melodic-roscpp/src/work/src/libros/wall_timer.cpp
19546 | In file included from $SRC_DIR/ros-melodic-roscpp/src/work/src/libros/wall_timer.cpp:29:
19547 | In file included from $SRC_DIR/ros-melodic-roscpp/src/work/include/ros/timer_manager.h:41:
19548 |  
19549 | $SRC_DIR/ros-melodic-roscpp/src/work/include/ros/internal/condition_variable.h:71:7: error: use of undeclared identifier 'pthread_condattr_setclock'; did you mean 'pthread_condattr_setpshared'?
19550 | pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
19551 | ^~~~~~~~~~~~~~~~~~~~~~~~~
19552 | pthread_condattr_setpshared
19553 |  
19554 | /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk/usr/include/pthread.h:328:5: note: 'pthread_condattr_setpshared' declared here
19555 | int pthread_condattr_setpshared(pthread_condattr_t *, int);
19556 | ^
19557 | In file included from $SRC_DIR/ros-melodic-roscpp/src/work/src/libros/wall_timer.cpp:29:
19558 | In file included from $SRC_DIR/ros-melodic-roscpp/src/work/include/ros/timer_manager.h:41:
19559 | $SRC_DIR/ros-melodic-roscpp/src/work/include/ros/internal/condition_variable.h:142:34: error: no member named 'to_timespec' in namespace 'boost::detail'
19560 | timespec ts = boost::detail::to_timespec(d);
19561 | ~~~~~~~~~~~~~~~^
19562 | $SRC_DIR/ros-melodic-roscpp/src/work/include/ros/internal/condition_variable.h:160:30: error: no member named 'check' in 'boost::detail::interruption_checker'
19563 | check_for_interruption.check();
19564 | ~~~~~~~~~~~~~~~~~~~~~~ ^
19565 | $SRC_DIR/ros-melodic-roscpp/src/work/include/ros/internal/condition_variable.h:190:30: error: no member named 'check' in 'boost::detail::interruption_checker'
19566 | check_for_interruption.check();
19567 | ~~~~~~~~~~~~~~~~~~~~~~ ^
19568

@dirk-thomas
Copy link
Member

I think this does not compile on OS X unfortunately

@meyerj Can you take a look at this and work on a pull request to address the regression?

@dirk-thomas
Copy link
Member

I created #2019 to not forget about these comments on a closed ticket.

boost::unique_lock<boost::mutex> &lock,
const boost::chrono::time_point<Clock, Duration> &t)
{
using namespace boost::chrono;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these instances of using namespace boost::chrono in these header functions are causing build failures in my downstream packages, because it introduces potential name collisions with the same types in std::chrono which are included indirectly as well by the packages

Though this is probably an indicator of poor using-hygiene on the part of one of my dependencies, I'd still say it's a regression in Melodic. Is there any reason this code must use using, or could it call boost::chrono::steady_clock etc in these functions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason this code must use using, or could it call boost::chrono::steady_clock etc in these functions?

I don't think there is technical reason why using was used. Please consider to create a pull request to change it to the fully qualified types and get rid of using namespace in the header.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good - I've opened #2020 to follow up

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We ran into the same problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants