fix: correctly apply addrFilters in the dht #872

Jorropo · 2023-08-23T22:44:11Z

This still does not do the fullrt client but it wasn't doing it before either.

Fixes smart dialler regression from #870.

dht.go

pb/protocol_messenger.go

This still does not do the fullrt client but it wasn't doing it before either.

guillaumemichel

LGTM!

Jorropo · 2023-08-24T15:34:19Z

cc @dennis-tra @guillaumemichel, @BigLep really wants us to have a test here, I don't have time to do it, could you please pick it up ?

guillaumemichel · 2023-08-24T15:53:58Z

I cannot take this up at the moment 👎🏻

BigLep · 2023-08-24T20:20:36Z

@dennis-tra : are you able to take? My understanding is that #870 hit a regression. Bummer, but that's life. I like to hold to the standard that we at least add regression tests when we trigger breakage/bugs. (Or is this code that will all be going away in the next month with refactoring?)

dennis-tra · 2023-08-24T22:49:36Z

@BigLep, @Jorropo and I have had discussions around #870 and we think differently about it. From my point of view it's certainly not a regression but instead puts a spotlight on the missing filters.

#870 finally sets and increases the address TTL as intended in the code and suggested by RFM 17.1 (here are details about when the bug was introduced). We're trading increased pressure on the smart dialler against DHT walks. Filtering the addresses (this PR) is complementary and the right way to decrease pressure in the general case (and not only specifically because of the changes in #870).

The increased pressure on the smart dialler is dwarfed by the speed-up of skipping a second DHT walk to find the addresses.

That being said, I'm afraid I'm not sure what a regression test should test if I don't see a regression here.

I could verify that the literal changes of the code in this PR do what they aim to do - filter addresses when storing/fetching/sending multiaddresses to/from the peerstore or other peers. However, that's only implicitly related to increased pressure on the smart dialler and I wouldn't call it a regression test. Terminology aside, I'm happy to add these tests!

However, this area of the code will and has changed in the v2 codebase. I've already ported the handlers to the v2-develop branch in #864 where I also already test the address filtering.

One genuine note to the end, I really appreciate that @Jorropo you sharpen my eye for the implications of seemingly minor changes, as in #870.

Jorropo · 2023-08-25T05:39:26Z

@dennis-tra I don't understand what you mean, this is not reverting your change (#870), but applying address filters, right now your change cache private and public ips on the wan dht of kubo, instead only public ips will be cached.

dennis-tra · 2023-08-25T07:01:59Z

@Jorropo I see you're not reverting #870 here and I also understand the significance of not caching private IPs.

Let's get this change in and if we get the chance to chat in sync about it, do it during one of the next co-los.

yiannisbot · 2023-08-25T10:44:15Z

Terminology aside, I'm happy to add these tests!

The impact of the original change here is expected to be significant, i.e., avoiding a second DHT lookup (that's our hope and we've got all the measurements in place to monitor the result). So, I'd say let's add these tests (I agree they're not regression tests as this PR here doesn't fix a regression, but it's an optimisation on top of #870. I'd call them tests on the optimisation proposed by @Jorropo) and make sure this goes into the next libp2p release.

BigLep · 2023-08-25T17:26:11Z

Thanks all. I'm not close to the details and not going to dive in more unless needed. I just want to make sure that we hold the line of making sure the any new code we write or touch has the proper test coverage. Basically, we aren't going to go on a campaign to fix the sins of the past, but I want to make sure our present and future selves does the right thing, and when we get tripped up by decisions of the past that we bring them up to our desired standard. As long as we're following that principle, I'm good. And if we're not following it, I want to make sure we're upfront and intentional about it (where this can mean something like "not adding a backfill test because this code will be replaced in one month and the proper assertions have been added to the new code in commit X").

to catch timeouts

dennis-tra · 2023-08-29T11:32:16Z

added two tests where testing infrastructure was already present.

I think the serving part is missing. When we hand out addresses we should also filter them. When the peer connects to us to store provider records we'll store its addresses (private + public) in the peerstore because of the identify exchange. This means in the first 30min (the ttl for recently connected peers) we serve both. The same applies when we're connected to the peer.

guillaumemichel · 2023-08-30T12:22:19Z

Thank you @dennis-tra for the tests!

All calls to add peers to the peerstore go through maybeAddAddrs

go-libp2p-kad-dht/dht.go

Lines 923 to 936 in 42d3453

    
           func (dht *IpfsDHT) maybeAddAddrs(p peer.ID, addrs []ma.Multiaddr, ttl time.Duration) { 
        
           	// Don't add addresses for self or our connected peers. We have better ones. 
        
           	if p == dht.self || dht.host.Network().Connectedness(p) == network.Connected { 
        
           		return 
        
           	} 
        
           	dht.peerstore.AddAddrs(p, dht.filterAddrs(addrs), ttl) 
        
           } 
        
           func (dht *IpfsDHT) filterAddrs(addrs []ma.Multiaddr) []ma.Multiaddr { 
        
           	if f := dht.addrFilter; f != nil { 
        
           		return f(addrs) 
        
           	} 
        
           	return addrs 
        
           }

The only call that isn't filtered is the following:

go-libp2p-kad-dht/providers/providers_manager.go

Lines 249 to 268 in 42d3453

    
           // AddProvider adds a provider 
        
           func (pm *ProviderManager) AddProvider(ctx context.Context, k []byte, provInfo peer.AddrInfo) error { 
        
           	ctx, span := internal.StartSpan(ctx, "ProviderManager.AddProvider") 
        
           	defer span.End() 
        
           	if provInfo.ID != pm.self { // don't add own addrs. 
        
           		pm.pstore.AddAddrs(provInfo.ID, provInfo.Addrs, ProviderAddrTTL) 
        
           	} 
        
           	prov := &addProv{ 
        
           		ctx: ctx, 
        
           		key: k, 
        
           		val: provInfo.ID, 
        
           	} 
        
           	select { 
        
           	case pm.newprovs <- prov: 
        
           		return nil 
        
           	case <-ctx.Done(): 
        
           		return ctx.Err() 
        
           	} 
        
           }

We cannot call maybeAddAddrs because the ProviderManager doesn't have a reference to the IpfsDHT. Hence we would need to add a filter attribute to the ProviderManager and make sure that the mentioned call to peerstore.AddAddrs provides filtered addresses.

Edit:

The only caller to ProviderStore.AddProvider filters the multiaddrs before the call so we should be good.

go-libp2p-kad-dht/handlers.go

Lines 362 to 365 in 42d3453

    
           // We run the addrs filter after checking for the length, 
        
           // this allows transient nodes with varying /p2p-circuit addresses to still have their anouncement go through. 
        
           addrs := dht.filterAddrs(pi.Addrs) 
        
           dht.providerStore.AddProvider(ctx, key, peer.AddrInfo{ID: pi.ID, Addrs: addrs})

The other callers (IpfsDHT.Provide and FullRT.Provide) add content that is being provided by the host, and do not provide any addrs (because we don't want to add self to the peerstore).

guillaumemichel · 2023-08-30T12:46:45Z

@Jorropo can you have a last look before we merge this?

Jorropo · 2023-09-04T13:10:29Z

Thx a lot for doing this, LGTM I havn't reviewed the tests in detail, just the non test code. ❤️ 🙏

added tests for addrfilter fix: don't add unresponsive DHT servers to the Routing Table (libp2p#820) * added check to avoid adding unresponsive dht peers to the dht routing table * removed lock in adding peers to the rt * made variable names more meaningful * fixed network loop and corrected tests * added UsefulPeer() references from current PR * go mod tidy * added delay in TestRefreshBelowMinRTThreshold * addressed review * go mod tidy * addressed Jorropo review * added comments * removed state of peers probed recently * fix conflicts merge * updated deps * added optimizations documentation * Update dht.go * updated md files --------- Co-authored-by: Jorropo <[email protected]> fix: don't add unresponsive DHT servers to the Routing Table (libp2p#820) * added check to avoid adding unresponsive dht peers to the dht routing table * removed lock in adding peers to the rt * made variable names more meaningful * fixed network loop and corrected tests * added UsefulPeer() references from current PR * go mod tidy * added delay in TestRefreshBelowMinRTThreshold * addressed review * go mod tidy * addressed Jorropo review * added comments * removed state of peers probed recently * fix conflicts merge * updated deps * added optimizations documentation * Update dht.go * updated md files --------- Co-authored-by: Jorropo <[email protected]> (cherry picked from commit 8c9fdff) chore: release v0.24.0 chore: Update .github/workflows/stale.yml [skip ci] fix: leaking go routines fix: bump kbucket for abba bug refactor: remove goprocess fix: decrease tests noise, update kbucket and fix fixRTIUfNeeded chore: release v0.24.1 chore: update go-libp2p to a non deadlocky version chore: release v0.24.2 chore: Update .github/workflows/stale.yml [skip ci] tracing: fix DHT keys as string attribute not being valid utf-8 chore: release v0.24.3 fix the server mode log in handleNewMessage by changing it to debug chore: delete templates [skip ci] (libp2p#861) ci: uci/copy-templates (libp2p#862) * chore: add or force update .github/workflows/go-test.yml * chore: add or force update .github/workflows/go-check.yml * chore: add or force update .github/workflows/releaser.yml * chore: add or force update .github/workflows/release-check.yml * chore: add or force update .github/workflows/tagpush.yml chore: bump go.mod to Go 1.20 and run go fix chore: bump go-libp2p and/or quic-go to latest version tracing: add protocol messages client tracing chore: release v0.25.0 add provider record addresses to peerstore fixes issue libp2p#868 fixing base32 import fix: apply addrFilters in the dht (libp2p#872) * fix: correctly apply addrFilters in the dht This still does not do the fullrt client but it wasn't doing it before either. * Add address filter tests * use channel instead of waitgroup to catch timeouts * filter multiaddresses when serving provider records --------- Co-authored-by: Dennis Trautwein <[email protected]> fix: properly iterate in tracing for protocol messenger MB Copy Paste Typo, This caused panics while tracing. chore: use go-libp2p-routing-helpers for tracing needs go-libp2p-routing-helpers has an optimized implementation that does nothing if we are not tracing, it also properly log all IO of the request. perf: don't buffer the output of FindProvidersAsync We always select it against ctx and we don't rely on the fact it will always be non blocking since when count is zero we can't realistically preallocate. Buffering use more memory makes more garbage and is less efficient than direct copies. chore: release v0.25.1 add ctx canceled err check return canceled err not retry reset stream and return err when ctx canceled release v0.25.2 chore: Update .github/workflows/stale.yml [skip ci] chore: bump go.mod to Go 1.21 and run go fix chore: run go mod tidy chore: add or force update .github/workflows/go-test.yml chore: add or force update .github/workflows/go-check.yml chore: add or force update .github/workflows/releaser.yml chore: add or force update .github/workflows/release-check.yml chore: add or force update .github/workflows/tagpush.yml chore: add or force update .github/workflows/go-test.yml chore: add or force update .github/workflows/go-check.yml chore: add or force update .github/workflows/releaser.yml chore: add or force update .github/workflows/release-check.yml chore: add or force update .github/workflows/tagpush.yml chore: add or force update .github/workflows/go-test.yml chore: add or force update .github/workflows/go-check.yml chore: add or force update .github/workflows/releaser.yml chore: add or force update .github/workflows/release-check.yml chore: add or force update .github/workflows/tagpush.yml findnode(self) now returns multiple peers don't lookup check if not enough peers in table enhanced tests moved rt size check after request Upgrade to go-log v2.5.1

Jorropo requested a review from dennis-tra August 23, 2023 22:44

Jorropo requested review from guillaumemichel and a team as code owners August 23, 2023 22:44

dennis-tra reviewed Aug 24, 2023

View reviewed changes

dht.go Show resolved Hide resolved

pb/protocol_messenger.go Outdated Show resolved Hide resolved

Jorropo requested a review from dennis-tra August 24, 2023 15:00

Jorropo force-pushed the stop-sending-addrs-in-add-provide branch from a3e39f6 to 16835eb Compare August 24, 2023 15:05

fix: correctly apply addrFilters in the dht

fe4a592

This still does not do the fullrt client but it wasn't doing it before either.

Jorropo force-pushed the stop-sending-addrs-in-add-provide branch from 16835eb to fe4a592 Compare August 24, 2023 15:12

guillaumemichel approved these changes Aug 24, 2023

View reviewed changes

Jorropo mentioned this pull request Aug 25, 2023

chore: update go-libp2p-kad-dht ipfs/kubo#10100

Closed

dennis-tra added 2 commits August 29, 2023 13:23

Add address filter tests

45e8fab

use channel instead of waitgroup

42d3453

to catch timeouts

guillaumemichel self-requested a review August 30, 2023 09:32

guillaumemichel approved these changes Aug 30, 2023

View reviewed changes

filter multiaddresses when serving provider records

2114147

guillaumemichel approved these changes Aug 30, 2023

View reviewed changes

guillaumemichel mentioned this pull request Aug 30, 2023

Filter addresses advertised by Kademlia #841

Closed

Jorropo merged commit 0c90569 into master Sep 4, 2023
9 checks passed

Jorropo deleted the stop-sending-addrs-in-add-provide branch September 4, 2023 13:12

Jorropo mentioned this pull request Sep 4, 2023

chore: release v0.25.1 #876

Merged

phillebaba mentioned this pull request Sep 26, 2023

Replace factory with adress filter to remove loopback addresses spegel-org/spegel#211

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: correctly apply addrFilters in the dht #872

fix: correctly apply addrFilters in the dht #872

Jorropo commented Aug 23, 2023 •

edited

Loading

guillaumemichel left a comment

Jorropo commented Aug 24, 2023

guillaumemichel commented Aug 24, 2023

BigLep commented Aug 24, 2023

dennis-tra commented Aug 24, 2023 •

edited

Loading

Jorropo commented Aug 25, 2023

dennis-tra commented Aug 25, 2023 •

edited

Loading

yiannisbot commented Aug 25, 2023

BigLep commented Aug 25, 2023

dennis-tra commented Aug 29, 2023 •

edited

Loading

guillaumemichel commented Aug 30, 2023 •

edited

Loading

guillaumemichel commented Aug 30, 2023

Jorropo commented Sep 4, 2023

fix: correctly apply addrFilters in the dht #872

fix: correctly apply addrFilters in the dht #872

Conversation

Jorropo commented Aug 23, 2023 • edited Loading

guillaumemichel left a comment

Choose a reason for hiding this comment

Jorropo commented Aug 24, 2023

guillaumemichel commented Aug 24, 2023

BigLep commented Aug 24, 2023

dennis-tra commented Aug 24, 2023 • edited Loading

Jorropo commented Aug 25, 2023

dennis-tra commented Aug 25, 2023 • edited Loading

yiannisbot commented Aug 25, 2023

BigLep commented Aug 25, 2023

dennis-tra commented Aug 29, 2023 • edited Loading

guillaumemichel commented Aug 30, 2023 • edited Loading

guillaumemichel commented Aug 30, 2023

Jorropo commented Sep 4, 2023

Jorropo commented Aug 23, 2023 •

edited

Loading

dennis-tra commented Aug 24, 2023 •

edited

Loading

dennis-tra commented Aug 25, 2023 •

edited

Loading

dennis-tra commented Aug 29, 2023 •

edited

Loading

guillaumemichel commented Aug 30, 2023 •

edited

Loading