Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ignition: Should remove all static storage duration (unless trivially destructible) #15601

Closed
2 tasks
EricCousineau-TRI opened this issue Aug 11, 2021 · 9 comments
Closed
2 tasks
Assignees
Labels
component: build system Bazel, CMake, dependencies, memory checkers, linters priority: low unused team: manipulation

Comments

@EricCousineau-TRI
Copy link
Contributor

EricCousineau-TRI commented Aug 11, 2021

@SeanCurtis-TRI ran into a nasty segfault in pydrake that is most likely due to static storage duration for objects w/ non-trivial destructors:
https://gist.github.com/SeanCurtis-TRI/8122cb2ad2b78f99f91ab35d19ff931c
Slack thread: https://drakedevelopers.slack.com/archives/C3YB3EV5W/p1628696745011100

To observe the phenomenon, the following branch will manifest the problem:
https://github.com/SeanCurtis-TRI/drake/tree/TEMP_showcase_angry_ignition
Simply execute bazel test //bindings/pydrake:py/math_test to see the failure.

Suggested steps:

  • Patch how Drake consumes ignition to not use any non-trivially destructible (or constructible) objects with static storage duration - which acutely affect us
  • Upstream those patches (hopefully only within the ignition ecosystem)
  • Submit upstream PRs to remove any other violations of this GSG style guide rule

\cc @jwnimmer-tri

@jwnimmer-tri
Copy link
Collaborator

Suggested steps:

  • Patch how Drake consumes ignition to not use any non-trivially destructible (or constructible) objects with static storage duration
  • Upstream those patches (hopefully only within the ignition ecosystem)

I'd advocate for a slightly different approach. For anything that is actively hurting us (with segfaults, or etc.), then we can have a Drake-local patch file that is upstreamed later. But for all of the other global variable coding-style violations that are not hurting us, I don't see why we'd make Drake-specific patches for those? It seems like PRing those directly to upstream and waiting for a release should be sufficient.

I see that ignition & sdformat have adopted the Google Style Guide, which means that statics like this are forbidden by rule. So the other thing to work on is to help remind the code reviewers for those projects of that rule, so that future PRs don't introduce regressions on this front.

@EricCousineau-TRI
Copy link
Contributor Author

Sounds great! Updated to that effect.

@azeey
Copy link
Contributor

azeey commented Aug 12, 2021

When I ran the command, I actually get a stack trace that shows the issue is in libyaml-cpp, but I went ahead removed all non-trivially destructible objects from ignition-math anyway. However, that didn't solve the problem. I found that libyaml-cpp also uses a global std::string array, which can easily be converted to an array of const char *, so I did that, but that didn't work either. Digging a little deeper, I found out that before the segfault happens, python fails to import a pydrake module here:

from pydrake.common.deprecation import DrakeDeprecationWarning

This fails because of a missing symbol drake::geometry::SceneGraph<drake::symbolic::Expression>::model_inspector() const. My guess is the segfault also occurs because of the missing symbol when loading the test module via SourceFileLoader.load_module():

module = SourceFileLoader(test_name, runfiles_test_filename).load_module(

I believe the issue is that the code in @SeanCurtis-TRI's branch is instantiating drake::multibody::ContactResultsToLcmSystem with the DRAKE_DEFINE_CLASS_TEMPLATE_INSTANTIATIONS_ON_DEFAULT_SCALARS macro which instantiates drake::multibody::ContactResultsToLcmSystem<drake::symbolic::Expression>, which then tries to instantiate drake::geometry::SceneGraph<drake::symbolic::Expression> and fails. This diff fixes the segfault for me:

@@ -293,8 +293,8 @@ systems::lcm::LcmPublisherSystem* ConnectContactResultsToDrakeVisualizer(
       builder, multibody_plant, nullptr, contact_results_port, lcm);
 }
 
+template class ContactResultsToLcmSystem<double>;
+template class ContactResultsToLcmSystem<AutoDiffXd>;
+
 }  // namespace multibody
 }  // namespace drake
-
-DRAKE_DEFINE_CLASS_TEMPLATE_INSTANTIATIONS_ON_DEFAULT_SCALARS(
-    class drake::multibody::ContactResultsToLcmSystem)

@jwnimmer-tri
Copy link
Collaborator

Good find! That sounds very plausible.

@SeanCurtis-TRI
Copy link
Contributor

That's not just plausible, that's incredibly compelling. Thanks for the sleuthing, @azeey! It makes perfect sense. I'll have to consider the proper Drake-side resolution.

@EricCousineau-TRI
Copy link
Contributor Author

EricCousineau-TRI commented Aug 12, 2021

Nice indeed! Given discussion, lowering priority, and updating checkboxes to focus solely on upstream.
That being said, it would still be nice to remove non-trivially destructible globals, at least from ignition.

At that point, the symptoms (via stack trace) might have been slightly less confusing.

@jwnimmer-tri jwnimmer-tri changed the title ignition: Should remove all static storage duration (unless trivially destructible) - and in upstream ignition: Should remove all static storage duration (unless trivially destructible) Aug 18, 2021
@jwnimmer-tri jwnimmer-tri added component: build system Bazel, CMake, dependencies, memory checkers, linters and removed component: pydrake Python API and its supporting Starlark macros labels Nov 12, 2021
@jwnimmer-tri
Copy link
Collaborator

I would like to close this issue, in lieu of an upstream ticket in ignition.

@azeey have you already fixed all of the instances of static initialization and destruction in ignition? If yes, let's close this Drake issue. If not, could I ask you open open an ignition issue to that effect, backlink it here, and then close this Drake issue?

@azeey
Copy link
Contributor

azeey commented Nov 12, 2021

No, I haven't yet. I'll create an issue in ign-math. There's already one for sdformat: gazebosim/sdformat#552

@jwnimmer-tri
Copy link
Collaborator

If I'm reading it correctly, an issue already exists -- gazebosim/gz-math#269.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component: build system Bazel, CMake, dependencies, memory checkers, linters priority: low unused team: manipulation
Projects
None yet
Development

No branches or pull requests

4 participants