Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MdnsAvahi destructor segfault; require separate shutdown #8001

Closed
msandstedt opened this issue Jun 29, 2021 · 6 comments
Closed

MdnsAvahi destructor segfault; require separate shutdown #8001

msandstedt opened this issue Jun 29, 2021 · 6 comments

Comments

@msandstedt
Copy link
Contributor

Problem

MdnsAvahi is asymmetric in its Constructor / Init / Destructor behavior.

The constructor does very little. Only in Init do we call avahi_client_new. That's largely consistent with the pattern in the rest of the stack. However, the destructor calls avahi_client_free.

We need this to happen in a separate Shutdown method. With the code as it is currently, we can see the following:

When avahi_client_free() is invoked, it will subsequently invoke the AvahiPoller->watch_free method which in this case would trigger the following code (Link), which then executes the following line of code:

(void) watch.mSocket.ReleaseFD();

This then sets off another chain of functions as follows:

-> void WatchableSocket::OnClose()
-> mSharedState->WakeSelect(); or WatchableEventManager::WakeSelect()
-> mSystemLayer->WakeIOThread();

But mSystemLayer is nullptr after Layer::shutdown, which happens prior to ~MdnsAvahi, so we have a segfault.

Another concern is the singleton pattern. Why is MdnsAvahi a singleton? This shouldn't be necessary and doesn't appear to provide any value. It also appears to force Resolve and Publish into the same class, when logically they don't need to be.

Proposed Solution

  • Refactor with shutdown method at least for MdnsAvahi, and possibly all impls
  • Remove unnecessary singleton pattern from MdnsAvahi
  • Consider separating MdnsAvahi into distinct resolver and publisher classes
@kpschoedel
Copy link
Contributor

I've just hit this while trying to address a separate problem in Linux/MdnsImpl.

IMO we need a coherent startup/shutdown plan. I recently fixed some use-after-destruction cases in unit tests, but of course those have no impact on production use.

For this case I assume we need a ChipMdnsShutdown to match ChipMdnsInit, but I don't know where that fits into overall flow.

@kpschoedel
Copy link
Contributor

Will take this since it blocks #7715

kpschoedel added a commit to kpschoedel/connectedhomeip that referenced this issue Aug 18, 2021
#### Problem

Issue project-chip#8001 MdnsAvahi destructor segfault; require separate shutdown

MdnsAvahi can cause a crash on exit due to a static instance running
code in its destructor after dependent components have shut down.

#### Change overview

- Add `ChipMdnsShutdown()`. This is a narrow fix, which does not attempt
  to address other potential problems with singleton mDNS state.
- Fix TestMdns to shut down cleanly.

#### Testing

- TestMdns and CI.
kpschoedel added a commit to kpschoedel/connectedhomeip that referenced this issue Aug 18, 2021
#### Problem

Issue project-chip#8001 MdnsAvahi destructor segfault; require separate shutdown

MdnsAvahi can cause a crash on exit due to a static instance running
code in its destructor after dependent components have shut down.

#### Change overview

- Add `ChipMdnsShutdown()`. This is a narrow fix, which does not attempt
  to address other potential problems with singleton mDNS state.
- Fix TestMdns to shut down cleanly.

#### Testing

- TestMdns and CI.
kpschoedel added a commit to kpschoedel/connectedhomeip that referenced this issue Aug 19, 2021
#### Problem

Issue project-chip#8001 MdnsAvahi destructor segfault; require separate shutdown

MdnsAvahi can cause a crash on exit due to a static instance running
code in its destructor after dependent components have shut down.

#### Change overview

- Add `ChipMdnsShutdown()`. This is a narrow fix, which does not attempt
  to address other potential problems with singleton mDNS state.
- Fix TestMdns to shut down cleanly.

#### Testing

- TestMdns and CI.
kpschoedel added a commit to kpschoedel/connectedhomeip that referenced this issue Aug 19, 2021
#### Problem

Issue project-chip#8001 MdnsAvahi destructor segfault; require separate shutdown

MdnsAvahi can cause a crash on exit due to a static instance running
code in its destructor after dependent components have shut down.

#### Change overview

- Add `ChipMdnsShutdown()`. This is a narrow fix, which does not attempt
  to address other potential problems with singleton mDNS state.
- Fix TestMdns to shut down cleanly.

#### Testing

- TestMdns and CI.
kpschoedel added a commit to kpschoedel/connectedhomeip that referenced this issue Aug 19, 2021
#### Problem

Issue project-chip#8001 MdnsAvahi destructor segfault; require separate shutdown

MdnsAvahi can cause a crash on exit due to a static instance running
code in its destructor after dependent components have shut down.

#### Change overview

- Add `ChipMdnsShutdown()`. This is a narrow fix, which does not attempt
  to address other potential problems with singleton mDNS state.
- Fix TestMdns to shut down cleanly.

#### Testing

- TestMdns and CI.
woody-apple pushed a commit that referenced this issue Aug 19, 2021
* Add ChipMdnsShutdown()

#### Problem

Issue #8001 MdnsAvahi destructor segfault; require separate shutdown

MdnsAvahi can cause a crash on exit due to a static instance running
code in its destructor after dependent components have shut down.

#### Change overview

- Add `ChipMdnsShutdown()`. This is a narrow fix, which does not attempt
  to address other potential problems with singleton mDNS state.
- Fix TestMdns to shut down cleanly.

#### Testing

- TestMdns and CI.

* restyle
@kpschoedel kpschoedel removed their assignment Aug 20, 2021
@kpschoedel
Copy link
Contributor

The segfault is fixed by #9108; the second part isn't.

@msandstedt
Copy link
Contributor Author

The segfault is fixed by #9108; the second part isn't.

I assume you're referring to my call out of the singleton pattern?

I would be interested to know your thoughts. I happen to be using avahi for the most part. Avahi seems to allow pretty unlimited, parallel resolve calls. So I'm not sure it's a show stopper. I believe there is shared state though for browse, which is frustrating.

Do we have justification for the singleton pattern here? Is this something we must do e.g. to have an implementation that's exactly equivalent to minimal mDNS?

I would be willing to accept it. But I'd really like to know why we decided this should be a singleton.

@kpschoedel
Copy link
Contributor

MdnsAvahi isn't unique in being a singleton; so are DiscoveryImplPlatform and ServiceAdvertiser at least, and the ChipMdns… set of functions are global with global state.

Pragmatically, the headline segfault became an obstacle to the ongoing System::Layer changes, so I fixed that. I didn't provide an implementation for ChipMdnsShutdown() on any other platforms, since I don't currently know how to do so, or whether it's even possible in every case. I also don't know the CHIP mDNS system well enough to propagate it up the stack correctly; I gather that DiscoveryImplPlatform and ServiceAdvertiser would get a Shutdown method, but not where that would be used.

@msandstedt
Copy link
Contributor Author

Closing with the fix for the primary bug here, which was inability to cleanly shutdown Avahi in the Linux platform.

It sounds like there are things we don't really like about the current design. However, if we want to tackle that, we can start another ticket.

Thanks @kpschoedel for the fix.

nikita-s-wrk pushed a commit to nikita-s-wrk/connectedhomeip that referenced this issue Sep 23, 2021
* Add ChipMdnsShutdown()

#### Problem

Issue project-chip#8001 MdnsAvahi destructor segfault; require separate shutdown

MdnsAvahi can cause a crash on exit due to a static instance running
code in its destructor after dependent components have shut down.

#### Change overview

- Add `ChipMdnsShutdown()`. This is a narrow fix, which does not attempt
  to address other potential problems with singleton mDNS state.
- Fix TestMdns to shut down cleanly.

#### Testing

- TestMdns and CI.

* restyle
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants