Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

//examples/kuka_iiwa_arm:kuka_simulation violates ODR due to :iiwa_common and :iiwa_lcm #8908

Closed
EricCousineau-TRI opened this issue May 31, 2018 · 3 comments

Comments

@EricCousineau-TRI
Copy link
Contributor

EricCousineau-TRI commented May 31, 2018

At present, iiwa_common and iiwa_lcm depend on static libraries within Drake (e.g. //multibody:rigid_body_tree), but kuka_simulation also links against libdrake.so via drake_example_cc_binary.

In drafting #8906, I realized this when digging into an issue captured by the following comments:
https://github.com/EricCousineau-TRI/drake/blob/59d9d37d8d6a5fb427b5872a8d85ed03698e756e/tools/skylark/drake_cc.bzl#L611-L617

With @sammy-tri's help, we found that when the example binary has srcs = ["//tools/install/libdrake:libdrake.so", ...], Bazel or the linker somehow magically realize that there are some duplications in the object code, and culls duplicate code from the resultant binary kuka_simulation.

To see the symbols:

objdump --all-headers -T -R bazel-bin/examples/kuka_iiwa_arm/kuka_simulation_test | c++filt | less

However, when moving libdrake.so to be in drake_shared_library's srcs, it seems that Bazel or the linker do not catch this case, and ends up duplicating globals such as std::set<int> RigidBodyTreeConstants::default_model_instance_id_set (which is a GSG violation, as it is not trivially destructible).
This manifests itself when the binary is running its exit handlers, and it ends up double-freeing memory within default_model_instance_id_set.

Potential solutions:

  • Avoid ODR violations, and somehow ensure that iiwa_common and iiwa_lcm do not double-declare their dependencies.
  • Change default_model_instance_id_set to never_destroyed<> as a stop-gap. This should be done regardless, but won't solve the issue since there will be other violations.

Ultimately, this seems to be pointing towards the pains of trying to offer both static and dynamically linked libraries from Bazel without a concrete plan (that I know of?) for offering both types of targets at the same level of granularity and with compatibility. (Bazel has plans for doing this hopefully well, but it seems to be stalled at the moment.)

@fbudin69500 This was the issue that we had discussed earlier, which we weren't sure if it was due to FindResource logic or not. This tends to indicate that it is not.

@jwnimmer-tri @jamiesnape Can I ask if y'all know what linker or Bazel magic is presently preventing this from being a problem on the current master?

EDIT: Reproduction commit:
EricCousineau-TRI@cf55503

@jamiesnape
Copy link
Contributor

With @sammy-tri's help, we found that when the example binary has srcs = ["//tools/install/libdrake:libdrake.so", ...], Bazel or the linker somehow magically realize that there are some duplications in the object code, and culls duplicate code from the resultant binary kuka_simulation.

Is this the case with both Clang and GCC? The behaviors are usually slightly different with respect to linking duplicate or unnecessary dependencies.

@jwnimmer-tri
Copy link
Collaborator

I don't know what Bazel is doing for the magic you cite.

Maybe we should revisit the choice to use libdrake.so dlopen path to anchor the find_resource calls for installed Drake. Having a stub shared library was another idea that we discussed at the time (see #7439 for libdrake_marker.so), and I still like that proposal better anyway. Then drake_example_cc_binary dies and everything works again.

@EricCousineau-TRI
Copy link
Contributor Author

EricCousineau-TRI commented May 31, 2018

Maybe we should revisit the choice to use libdrake.so dlopen [...]

Yup, that would solve the problem, and I assume should be relatively simple to implement. Will briefly tinker with that.

Is this the case with both Clang and GCC?

Confirmed that the same thing happens with GCC:
EricCousineau-TRI@4d687ce

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants