Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compatibility with ROS2 version of spdlog #14220

Closed
hidmic opened this issue Oct 16, 2020 · 23 comments · Fixed by #14427
Closed

Compatibility with ROS2 version of spdlog #14220

hidmic opened this issue Oct 16, 2020 · 23 comments · Fixed by #14427
Assignees

Comments

@hidmic
Copy link
Contributor

hidmic commented Oct 16, 2020

Likely related to #7451. Not much of an issue until one tries to use drake::log() on a binary that depends on another library that uses a different spdlog version.

Context

Using ROS 2 Eloquent and latest Drake binaries on Ubuntu Bionic 18.04, (strong) spdlog symbols brought by libdrake_spdlog.so interact with (weak) spdlog symbols brought by librcl_logging_spdlog.so. Eventually, it causes a segfault within rcl_logging_spdlog.
Arguably, neither library attempts to mitigate this risk.

How to reproduce

See spdlog_pollution_repro: branch | commit.

@hidmic
Copy link
Contributor Author

hidmic commented Oct 16, 2020

CC @IanTheEngineer @EricCousineau-TRI

@jwnimmer-tri
Copy link
Collaborator

Right, it's not supported nor possible to mix and match binary compilations of two different versions spdlog in the same program. You need to compile everything using the same source version. The pre-compiled Drake images are probably not suitable for your use case.

@jwnimmer-tri jwnimmer-tri added component: build system Bazel, CMake, dependencies, memory checkers, linters configuration: linux type: feature request labels Oct 16, 2020
@jwnimmer-tri jwnimmer-tri self-assigned this Oct 16, 2020
@jwnimmer-tri
Copy link
Collaborator

jwnimmer-tri commented Oct 16, 2020

If I recall correctly, another difference you'll find is that the ros2 bundled spdlog uses spdlog's internal vendored fmt, which is a different version that Drake uses, leading to the same kinds of problems. The patch to Drake to disable spdlog entirely would be small (already present in the Bazel build, we'd just need to thread it through to the CMake build), but we can't disable fmt, so there's still an incompatibility lurking.

Ultimately, the build system needs to use the same libraries and headers for any public library dependencies when linked into the same program.

For things like libsdformat, Drake only uses it internally, so we compile it as a static archive with hidden symbols, so it will not cause a conflict.

For the things listed in #7451, the entire program needs to agree on one single version. Especially for widely-used things like fmt and Eigen, I've been thinking we could offer in the Drake CMake build the option to find_package() for those public dependencies, so users could opt-in to libraries provided elsewhere.

@jwnimmer-tri jwnimmer-tri changed the title //common/text_logging pollutes global namespace with spdlog symbols Compatibility with ROS2 version of spdlog Oct 16, 2020
@jwnimmer-tri
Copy link
Collaborator

From #7451, there are three public libraries that we compile from source within Drake: fmt, lcm, and spdlog. Allowing a CMake user to rebuild Drake while supplying their own compiled versions of those libraries would resolve this request, I think.

@hidmic
Copy link
Contributor Author

hidmic commented Oct 19, 2020

@jwnimmer-tri would this also be possible when Drake is an external dependency of a Bazel workspace? For a scenario where a binary ROS 2 distribution is pulled into that Bazel workspace as an external dependency.

@jwnimmer-tri
Copy link
Collaborator

jwnimmer-tri commented Oct 19, 2020

In Bazel, the user already has full control of externals:

When calling Drake's add_default_repositories users can opt-out of any dependencies that they have already provided in their own WORKSPACE, by using, e.g., excludes = ["spdlog"]:

def add_default_repositories(excludes = [], mirrors = DEFAULT_MIRRORS):
"""Declares workspace repositories for all externals needed by drake (other
than those built into Bazel, of course). This is intended to be loaded and
called from a WORKSPACE file.
Args:
excludes: list of string names of repositories to exclude; this can
be useful if a WORKSPACE file has already supplied its own external
of a given name.

Alternatively, if the user wants to opt-in to Drake dependencies one by one instead of ingesting Drake's entire WORKSPACE, they should not call add_default_repositories but instead call, e.g., fcl_repository(name = "fcl", mirrors = mirrors) one by one to accept Drake's defaults for specific repositories (or else provide their own, when desired).

if "fcl" not in excludes:
fcl_repository(name = "fcl", mirrors = mirrors)

@EricCousineau-TRI
Copy link
Contributor

To clarify, this use case is actually Anzu consuming ROS2. Would that change the considerations here?

(Jeremy, I can page you in on more details if you want.)

@jwnimmer-tri
Copy link
Collaborator

Yes.

To obtain compatibility of Anzu with the ROS2 ...

Option 1 is to consume ROS2 as source and compile it ourselves. This could be a lot of work, but could also be the best way to go. In general, building from source will always be the most robust option when dealing with C++ code we intend to link into the same binary as our own source code. We've shirked this with our ROS1 integration because of its minimal use, but if we are planning on non-trivial runtime integration, the investment to rebuild from sources could pay off nicely.

Option 2 is to stick with the ROS2 precompiled binaries. For that we'd need to use the ROS2 precompiled spdlog. That would currently be a problem because ROS2 appears to be configured to use header-only spdlog, and that severely bloats our binaries, which is a regression I want to avoid. If we can get ROS2 to upgrade to a modern spdlog (it seems to have that on master, just not the eloquent branch) and then also enable the shared library, then we could point Anzu to use that pre-compiled spdlog.

@EricCousineau-TRI
Copy link
Contributor

Aye. Mich and @sloretz have an Option 3 as well, where we leverage the rcl_logging C interface, intercept it at flag-parsing time, and can tell to it to use Drake's version of spdlog (or just disable logging entirely).

I think we'll try out Option 3 for prototyping, but will keep in mind the above alternatives you mentioned.

@jamiesnape
Copy link
Contributor

Some of the discussion in #13698 partially relates this, but the conclusion there was, I think I am correct in saying, that we knew this was going to bite us at some point.

@jwnimmer-tri
Copy link
Collaborator

I don't disagree with any of the above.

Given how release schedules work though, I would like to advocate for investing in getting Ubuntu and ROS cleaned up as much as possible before their next release(s) start the process freezing things and locking in versions.

If ROS could use the Ubuntu versions of spdlog and fmt, that would go a long way towards everyone agreeing on what version (ABI) to use for those libraries. Drake could move to doing the same.

We'd like to have the newest version of them made available in the releases, and also change spdlog config to use the shared library -- the header-only is vastly worse on build times and footprint. If we can get all of that work into the pipeline for 21.04, that would be a great starting point.

@EricCousineau-TRI
Copy link
Contributor

If we can get all of that work into the pipeline for 21.04, [...]

Not sure if I understand this part - why even consider a non-LTS release? Do we have plans of having Drake support non-LTS?
Or are you saying for the ROS ecosystem in isolation?

@jamiesnape
Copy link
Contributor

jamiesnape commented Nov 5, 2020

The sooner you get it into an Ubuntu release, the better. You have work with upstream, Debian, and Ubuntu developers. If you wait until 22.04 there is a good chance you will miss it (you really need fixes to be in by 21.10).

@EricCousineau-TRI
Copy link
Contributor

Gotcha! Makes sense in terms of focusing non-LTS work on ROS, nothing directly on Drake (for non-LTS).

@jamiesnape
Copy link
Contributor

Note as It happens, one of the issues, the vendored libfmt has finally been picked up from Debian and will be in groovy.

Gotcha! Makes sense in terms of focusing non-LTS work on ROS, nothing directly on Drake (for non-LTS).

Sooner is always better, even for Drake, it just depends on the scale of what is needed. If something new needs packaging, it can sometimes take a year to do so. Fixes are a little less time sensitive, but maintainers are volunteers, and sometimes the same version just rolls over because the maintainer was pressed for time.

@hidmic
Copy link
Contributor Author

hidmic commented Nov 20, 2020

@jamiesnape @jwnimmer-tri to recap,

Mich and @sloretz have an Option 3 as well, where we leverage the rcl_logging C interface, intercept it at flag-parsing time, and can tell to it to use Drake's version of spdlog (or just disable logging entirely).

Our current workaround is somewhat different. ROS 2 builds strip RPATHs by default, so we basically inject an librcl_logging_spdlog.so substitute that uses drake::log() under the hood.

If ROS could use the Ubuntu versions of spdlog and fmt, that would go a long way towards everyone agreeing on what version (ABI) to use for those libraries. Drake could move to doing the same.

This is already the case for ROS 2 Foxy packages running on Ubuntu Focal 20.04 (and MacOS Mojave 10.14). You can see here that spdlog is vendored iff it's not present in the system. It also uses spdlog in its shared library form.

We'd like to have the newest version of them made available in the releases, and also change spdlog config to use the shared library -- the header-only is vastly worse on build times and footprint. If we can get all of that work into the pipeline for 21.04, that would be a great starting point.

The sooner you get it into an Ubuntu release, the better. You have work with upstream, Debian, and Ubuntu developers. If you wait until 22.04 there is a good chance you will miss it (you really need fixes to be in by 21.10).

Got it. Specifically, ensure spdlog 1.8.1 and fmt 7.1.0 get into Ubuntu 22.04 (by releasing them ASAP, preferably for 21.04). I'm not familiar with Drake release policies though. How will you ensure compatibility moving forward? I see you bump external dependencies' versions quite often.

Also, it's worth mentioning that ROS distributions do not target non-LTS releases either. Foxy targets 20.04 and so will Galactic (to be released in May 2021).

@jwnimmer-tri
Copy link
Collaborator

(Aside: @jwnimmer-tri not @jeremyma-tri)

Thanks for the info.

Specifically, ensure spdlog 1.8.1 and fmt 7.1.0 get into Ubuntu 22.04 (by releasing them ASAP, preferably for 21.04).

Exactly. The precise version doesn't matter that much to us, it's more about just getting as many new features and bugfixes into 22.04 as possible, for the best chance of that version being something we can live with for four years. If we develop the relationships and workflow where we can regularly shepherd new versions of these two into Debian and/or Ubuntu, that pace will help ensure that 22.04 is the best it can be for our needs.

How will you ensure compatibility moving forward? I see you bump external dependencies' versions quite often.

For dependencies unavailable in Ubuntu, Drake generally uses a relatively recent tagged release from upstream and bumps it monthly, especially for private dependencies.

For public dependencies that are available in Ubuntu, we try to the Ubuntu build when possible, to improve binary compatibility with other software in the ecosystem (such as ROS). The question is whether there are any big bugs or missing features in the Ubuntu version. (For example, for Eigen, as of 18.04 we finally use the host version; but in 16.04 and prior it was too buggy for us, so we had to fetch a newer version from source, which was a compatibility headache.)

We haven't spent much time polishing Drake's 20.04 integration yet. There's a chance that we could swap to using the host spdlog on 20.04, even though it's a downgrade. Dealing with the API changes is probably not the hard part. The challenge is likely to be the vendored fmt getting in the way.

I'm sure that we could patch Drake to build against the older spdlog and (vendored) fmt. The question is what features and performance we lose, and whether we're willing to accept that trade-off either for the prebuilt binaries we deliver for many users, or for the Anzu builds within TRI.

I'll take some time to go look and refresh my memory about exactly how fmt and spdlog end up coupled in the source and ABI, and whether we can use inline namespaces or similar to get source and binary compatibility without having to downgrade fmt.

@jwnimmer-tri
Copy link
Collaborator

I'll take some time to go look and refresh my memory about exactly how fmt and spdlog end up coupled in the source and ABI, and whether we can use inline namespaces or similar to get source and binary compatibility without having to downgrade fmt.

My refreshed look is the same as it was a few years ago -- spdlog and fmt are intertwined enough that we must re-use whatever fmt sources that spdlog was compiled against.

If we use the focal spdlog 1.5.0, that implies using its bundled fmt 6.1.2.

@jamiesnape
Copy link
Contributor

jamiesnape commented Nov 20, 2020

Homebrew is up-to-date with regard both spdlog and fmt, so versions 1.8.1 and 7.1.2 respectively. Hopefully that Is not too wide a gap to bridge to the versions in Focal.

(I guess #14379 will tell us.)

@jwnimmer-tri
Copy link
Collaborator

My proposal for the next change to attempt here is as follows:

On 20.04 Focal, no longer compile fmt nor spdlog from source, nor install our builds of them. Instead, switch to using the host spdlog and its vendored fmt. This will effectively downgrade the version of both.

On 18.04 Bionic, continue compiling fmt and spdlog from source, and continue to install our builds of them. (The host version is way too old.) However, downgrade the pinned versions to match Focal -- it seems wrong to have Bionic be newer that Focal. An upgrade from Bionic to Focal should be strictly better, and if there are problems with the spdlog 1.5.0, I'd like to find them out early, and our day-to-day use is currently on Bionic.

On macOS, no longer compile fmt nor spdlog from source, nor install our builds of them. Instead, switch to using the homebrew formulae. Compared to building our own, this means that our binaries will break out from under us when homebrew upgrades those libraries, but we already have that problem with other libraries, and so we'll just have to live with it. Our macOS releases are not very durable anyway -- generally only the most recent one will function correctly, if that. The mitigation for that will be #12782, once it lands.

This leaves us with macOS using newer libraries than Ubuntu, which has some hazard of CI results shear, but we already have that for many other libraries, and given the small size of #14379 it will hopefully not be a substantial problem.

@hidmic
Copy link
Contributor Author

hidmic commented Dec 9, 2020

FYI spdlog 1.8.1 got into Debian Sid today: https://tracker.debian.org/news/1200036/accepted-spdlog-1181ds-1-source-into-unstable/ (thanks for the hint @j-rivero!).

@hidmic
Copy link
Contributor Author

hidmic commented Jan 11, 2021

FYI fmt 7.1.3 made into Debian Sid shortly after New Year's Eve: https://tracker.debian.org/news/1209212/accepted-fmtlib-713ds1-5-source-into-unstable/ 😃

@jwnimmer-tri
Copy link
Collaborator

Resolution from #14427:

On Ubuntu 18.04, we downgrade to spdlog 5.1.0 and fmt 6.1.2, but we still compile them from github source releases.

On Ubuntu 20.04, fmt and spdlog are obtained from the host now. (The host copy of fmt is also installed by Drake, because Ubuntu's packaging is deficient.) The versions are spdlog 5.1.0 with bundled fmt 6.1.2.

On macOS, fmt and spdlog are obtained from homebrew now. The versions are the latest available, per the usual homebrew policy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants