Darwin DNS-SD implementation stops browsing services upon finding a wrong service. #19194

Closed
kpark-apple opened this issue Jun 3, 2022 · 2 comments · Fixed by #22823

@kpark-apple
Contributor

Problem

The Darwin implementation of chip::Dnssd::Browse() stops browsing as soon as OnBrowse() is called without the kDNSServiceFlagsMoreComing flag.
Per https://developer.apple.com/documentation/dnssd/1823436-anonymous/kdnsserviceflagsmorecoming?language=objc, the absence of that flag only means that there are no more "queued" results at that moment, not that the browse is complete.
As a result, if the desired service is discovered later than an undesired one (a service from another device, for example), the desired service is never discovered within the timeout, and the operation fails.
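
To make the failure mode concrete, here is a minimal simulation of the behavior described above. It does not use the real dns_sd API: BrowseEvent, BrowseStoppingOnFlag and kMoreComing are hypothetical stand-ins, with kMoreComing playing the role of kDNSServiceFlagsMoreComing.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for kDNSServiceFlagsMoreComing from <dns_sd.h>.
constexpr uint32_t kMoreComing = 0x1;

// One simulated invocation of the browse callback.
struct BrowseEvent
{
    std::string instanceName;
    uint32_t flags;
};

// Mimics the reported behavior: the browse is finalized at the first callback
// that lacks the "more coming" flag, so any later event is never seen.
std::vector<std::string> BrowseStoppingOnFlag(const std::vector<BrowseEvent> & events)
{
    std::vector<std::string> seen;
    for (const auto & event : events)
    {
        seen.push_back(event.instanceName);
        if ((event.flags & kMoreComing) == 0)
        {
            break; // finalize: nothing is delivered after this point
        }
    }
    return seen;
}
```

Feeding it an unrelated device followed by the wanted one, each without the flag, yields only the unrelated device, which is the situation this issue reports.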

Proposed Solution

Remove the sdCtx->Finalize() call that is guarded by the flag check, and make sure the context is finalized when the desired service is found or when a timeout occurs.
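
A rough sketch of the proposed stopping rule, again as a simulation rather than the actual Darwin code. TimedEvent, BrowseUntilMatchOrTimeout and the isDesired predicate are hypothetical names, and the events are assumed to be sorted by arrival time.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <string>
#include <vector>

// One simulated browse result and when it arrived (relative to browse start).
struct TimedEvent
{
    std::string instanceName;
    std::chrono::milliseconds arrival;
};

// Keeps browsing regardless of callback flags and finalizes only when the
// desired service is found or the timeout passes. Returns nullptr on timeout.
// Assumes `events` is sorted by arrival time.
const TimedEvent * BrowseUntilMatchOrTimeout(const std::vector<TimedEvent> & events,
                                             const std::function<bool(const TimedEvent &)> & isDesired,
                                             std::chrono::milliseconds timeout)
{
    assert(timeout.count() >= 0); // a negative timeout would be a caller bug
    for (const auto & event : events)
    {
        if (event.arrival > timeout)
        {
            return nullptr; // timed out: finalize without a match
        }
        if (isDesired(event))
        {
            return &event; // desired service found: finalize
        }
    }
    return nullptr; // no more simulated events before the timeout
}
```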

@bzbarsky-apple
Contributor

There is no timeout on browse right now, fwiw...

I guess since we switched to browsing for the _CM subtype there's a problem if multiple commissionable things are all advertising because the discriminator filter is applied too late? Is that why this is being a problem? How often does this behavior manifest as an issue?

The proposed solution on its own is not enough. The caller of Browse can only handle a single callback, so we would need to either change that or always keep going until the timeout and then deliver all the results we found, which would lead to pretty undesirable behavior. And the caller is the generic platform mdns code, so changing that would involve changing all the other platform implementations too...
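
To illustrate the second option (keep going, then deliver everything in one callback), a minimal sketch follows. SingleShotBrowseCollector is a hypothetical helper, not part of the platform mdns code; it only shows why the caller would not see any result until Finish() runs, for example at the timeout.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Accumulates browse results and invokes the caller's callback exactly once.
class SingleShotBrowseCollector
{
public:
    using Callback = std::function<void(const std::vector<std::string> &)>;

    explicit SingleShotBrowseCollector(Callback callback) : mCallback(std::move(callback))
    {
        assert(mCallback != nullptr);
    }

    // Called for every result while the browse is running.
    void OnResult(std::string instanceName)
    {
        if (!mDelivered)
        {
            mResults.push_back(std::move(instanceName));
        }
    }

    // Called when the browse ends (e.g. on timeout); later calls are ignored.
    void Finish()
    {
        if (mDelivered)
        {
            return;
        }
        mDelivered = true;
        mCallback(mResults);
    }

private:
    Callback mCallback;
    std::vector<std::string> mResults;
    bool mDelivered = false;
};
```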

@kpark-apple
Contributor Author

kpark-apple commented Jun 4, 2022

> I guess since we switched to browsing for the _CM subtype there's a problem if multiple commissionable things are all advertising because the discriminator filter is applied too late? Is that why this is being a problem? How often does this behavior manifest as an issue?

Yes, that is the problem, and it happened very frequently with the test team. It may be unique to test setups, where some other commissionable node might be running. However, personally I think it could happen frequently in the real world too, if the commissioning window is as long as a minute or so.

bzbarsky-apple added a commit to bzbarsky-apple/connectedhomeip that referenced this issue Jun 7, 2022
This reverts commit a689c84.

Now that we always register a new instance name when opening a new
commissioning window the problem PR project-chip#17356 was trying to work around
no longer applies.  On the other hand, the new setup introduced a new
problem: if there are multiple things advertising the _CM subtype
(i.e. multiple things in commissioning mode at once), then we might
find the first several (however much fits in a DNS packet) and then
platform mdns will stop delivering results, per
project-chip#19194 (which is
about Darwin, but other platforms have similar issues).

If we browse by discriminator instead, the chance of multiple results
is much lower, and hence the chance of finding the thing we care about
is much higher.
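
As a sketch of the difference between the two browse targets: the helpers below build the registration type strings, assuming the Matter commissionable service type _matterc._udp, the _L<discriminator> long-discriminator subtype, and the comma-separated "type,subtype" form that dns_sd accepts. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Subtype shared by every node that is currently in commissioning mode.
std::string CommissioningModeSubtype()
{
    return "_CM";
}

// Subtype specific to one long discriminator (assumed to be 12 bits wide).
std::string LongDiscriminatorSubtype(uint16_t discriminator)
{
    assert(discriminator <= 0x0FFF);
    return "_L" + std::to_string(discriminator);
}

// Builds the "type,subtype" string to browse for.
std::string BrowseRegType(const std::string & subtype)
{
    return "_matterc._udp," + subtype;
}
```

Browsing for BrowseRegType(CommissioningModeSubtype()) matches every commissionable node on the network, while BrowseRegType(LongDiscriminatorSubtype(d)) should only match nodes advertising that discriminator.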
@woody-apple woody-apple added V1.X and removed V1.0 labels Jun 14, 2022
bzbarsky-apple added a commit to bzbarsky-apple/connectedhomeip that referenced this issue Sep 22, 2022
Most backends don't implement this yet. Darwin does, and no longer
stops Browse operations itself.

Fixes project-chip#19194

May provide a way toward fixing
project-chip#13275
andy31415 pushed a commit that referenced this issue Oct 4, 2022
* Add an API to stop a DNS-SD browse operation.

Most backends don't implement this yet. Darwin does, and no longer
stops Browse operations itself.

Fixes #19194

May provide a way toward fixing
#13275

* Address review comments.

* Address more review comments.
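
For readers who want a picture of what a "stop browse" API implies, here is a minimal sketch. BrowseRegistrySketch and its methods are hypothetical and are not the signatures added by the commit above; Deliver() stands in for the platform invoking the browse callback.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Tracks running browse operations by identifier so the caller, not the
// backend, decides when a browse ends.
class BrowseRegistrySketch
{
public:
    using BrowseId       = std::intptr_t;
    using ResultCallback = std::function<void(const std::string &)>;

    // Starts a browse and returns an identifier the caller can stop later.
    BrowseId StartBrowse(ResultCallback onResult)
    {
        assert(onResult != nullptr);
        const BrowseId id = mNextId++;
        mActive.emplace(id, std::move(onResult));
        return id;
    }

    // Stops a browse; returns false if the identifier is unknown or already stopped.
    bool StopBrowse(BrowseId id) { return mActive.erase(id) == 1; }

    // Simulates the platform reporting a result for a browse.
    void Deliver(BrowseId id, const std::string & instanceName)
    {
        auto it = mActive.find(id);
        if (it != mActive.end())
        {
            it->second(instanceName);
        }
    }

private:
    BrowseId mNextId = 1;
    std::map<BrowseId, ResultCallback> mActive;
};
```

With this shape the backend keeps delivering results for as long as the browse is registered, and results that arrive after StopBrowse() are dropped.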
rawadhilal88 pushed a commit to sharadb-amazon/connectedhomeip that referenced this issue Feb 22, 2023
…at fixes DNS-SD browsing

Add an API to stop a DNS-SD browse operation. (project-chip#22823)

* Add an API to stop a DNS-SD browse operation.

Most backends don't implement this yet. Darwin does, and no longer
stops Browse operations itself.

Fixes project-chip#19194

May provide a way toward fixing
project-chip#13275

* Address review comments.

* Address more review comments.

[darwin] Use DNSServiceReconfirmRecord for A and AAAA records to miti… (project-chip#23067)

* [Dnssd] Add ReconfirmRecord method to verify address that appears to be out of date

* [SetUpCodePairer] Ask Dnssd to reconfirm discovered addresses if connecting to them ends with a CHIP_ERROR_TIMEOUT

Fix Logging When Trying to Log Nullptr To Strings (project-chip#23604)

This PR attempts to identify all cases where %s specifiers in the logging APIs
(ChipLogError(), ChipLogProgress(), ChipLogDetail()) don't have a guaranteed
non-null string parameter.

In all identified cases the issue is fixed using StringOrNullMarker() helper
method to guarantee it doesn't happen.
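
A minimal sketch of the idea behind such a helper. OrNullMarker and DescribeHost are hypothetical names and the "(null)" marker text is an assumption; the point is only that a %s argument is never a null pointer.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Returns the string itself, or a visible marker when the pointer is null.
inline const char * OrNullMarker(const char * str)
{
    return str != nullptr ? str : "(null)";
}

// Example of a log-style format call that is safe for a null hostname.
std::string DescribeHost(const char * hostname)
{
    char buffer[64];
    std::snprintf(buffer, sizeof(buffer), "host=%s", OrNullMarker(hostname));
    return std::string(buffer);
}
```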

Use the "right" byte-swapping function for port in Darwin DnssdImpl. (project-chip#23894)

The incoming port is in host byte order and we are converting to network byte
order, so should use htons (which happens to do the same thing as ntohs, so no
behavior change).

Co-authored-by: Andrei Litvin <[email protected]>
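
The byte-order note in the commit above can be checked with a small sketch. The helper names are hypothetical; htons and ntohs are the standard POSIX functions from <arpa/inet.h>.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstdint>

// Host order to network order: the direction the commit message says is intended.
uint16_t PortToNetworkOrder(uint16_t hostPort)
{
    return htons(hostPort);
}

// Both functions either swap the two bytes or do nothing, depending on the
// host's endianness, so they agree for every 16-bit value.
bool SwapFunctionsAgree(uint16_t value)
{
    return htons(value) == ntohs(value);
}
```

Using htons documents the direction of the conversion even though the resulting bytes are the same.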

Add a way for Resolver consumers to cancel operational resolve attempts. (project-chip#24010)

* Add a way for Resolver consumers to cancel operational resolve attempts.

Adds a way for consumers to notify Resolver when they no longer care
about an operational resolve, so a Resolver implementation can keep
track of how many consumers are interested and stop work as desired if
no one is interested.

Fixes project-chip#23881

* Address review comments.

* Address review comments.
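
One way to picture the consumer tracking described in the commit above is a per-node interest count. ResolveInterestTracker is a hypothetical sketch, not the Resolver interface from that commit.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Counts how many consumers still care about an operational resolve per node.
class ResolveInterestTracker
{
public:
    // Returns true when this is the first interested consumer, i.e. the
    // resolver should start work for this node.
    bool AddInterest(uint64_t nodeId) { return ++mCounts[nodeId] == 1; }

    // Returns true when the last interested consumer has gone away, i.e. the
    // resolver may stop work for this node.
    bool RemoveInterest(uint64_t nodeId)
    {
        auto it = mCounts.find(nodeId);
        if (it == mCounts.end())
        {
            return false; // nobody was interested in this node
        }
        if (--it->second > 0)
        {
            return false; // other consumers are still waiting
        }
        mCounts.erase(it);
        return true;
    }

private:
    std::map<uint64_t, unsigned> mCounts;
};
```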

Make sure we stop resolves triggered by a browse when the browse stops on Darwin. (project-chip#24733)

* Make sure we stop resolves triggered by a browse when the browse stops on Darwin.

Without this change, if there is a PTR record that matches whatever we are
browsing but no corresponding SRV record, we would end up leaking a resolve
forever.

Tested by modifying minimal mdns SrvResponder::AddAllResponses to no-op instead
of actually adding any responses, then trying to commission the device running
the modified minimal mdns.  Without this change, when the browse stops the
resolves it triggered keep going.  With this change, termination of the browse
also terminates the resolves.

Fixes project-chip#24074

* Also avoid leaking ResolveContext instances.

* Fix handling of multiple interfaces.

* Address review comment.
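
The ownership rule in the commit above (a browse owns the resolves it triggered) can be sketched as follows. BrowseContextSketch is hypothetical, and the activeResolves counter stands in for real resolve operations held by the system.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// A browse that remembers the resolves it started and stops them when it stops.
class BrowseContextSketch
{
public:
    explicit BrowseContextSketch(int & activeResolves) : mActiveResolves(activeResolves) {}
    ~BrowseContextSketch() { Stop(); }

    // A PTR record matched the browse: start resolving that instance.
    void OnPtrRecord(const std::string & instanceName)
    {
        mPendingResolves.push_back(instanceName);
        ++mActiveResolves;
    }

    // The resolve completed (an SRV record arrived): it is no longer pending.
    void OnResolved(const std::string & instanceName)
    {
        auto it = std::find(mPendingResolves.begin(), mPendingResolves.end(), instanceName);
        if (it == mPendingResolves.end())
        {
            return;
        }
        mPendingResolves.erase(it);
        --mActiveResolves;
    }

    // Stopping the browse also stops every resolve that never completed.
    void Stop()
    {
        mActiveResolves -= static_cast<int>(mPendingResolves.size());
        mPendingResolves.clear();
    }

private:
    int & mActiveResolves;
    std::vector<std::string> mPendingResolves;
};
```

An instance with a PTR record but no SRV record stays pending until Stop() runs, so without the cleanup in Stop() that resolve would stay active indefinitely.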

Improve discovery logging on Darwin. (project-chip#24846)

1) Use progress, not detail, logging, because detail logging is not actually
   persisted in system logs.
2) Add logging to a few functions that were missing it.

Remove the address type argument from ResolveNodeId. (project-chip#24006)

All consumers were passing kAny in practice, and some of the backends
(e.g. minimal mdns) had no capability to filter by type anyway.