-
Notifications
You must be signed in to change notification settings - Fork 228
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: correctly apply addrFilters in the dht #872
Conversation
a3e39f6
to
16835eb
Compare
This still does not do the fullrt client but it wasn't doing it before either.
16835eb
to
fe4a592
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
cc @dennis-tra @guillaumemichel, @BigLep really wants us to have a test here, I don't have time to do it, could you please pick it up ? |
I cannot take this up at the moment 👎🏻 |
@dennis-tra : are you able to take? My understanding is that #870 hit a regression. Bummer, but that's life. I like to hold to the standard that we at least add regression tests when we trigger breakage/bugs. (Or is this code that will all be going away in the next month with refactoring?) |
@BigLep, @Jorropo and I have had discussions around #870 and we think differently about it. From my point of view it's certainly not a regression but instead puts a spotlight on the missing filters. #870 finally sets and increases the address TTL as intended in the code and suggested by RFM 17.1 (here are details about when the bug was introduced). We're trading increased pressure on the smart dialler against DHT walks. Filtering the addresses (this PR) is complementary and the right way to decrease pressure in the general case (and not only specifically because of the changes in #870). The increased pressure on the smart dialler is dwarfed by the speed-up of skipping a second DHT walk to find the addresses. That being said, I'm afraid I'm not sure what a regression test should test if I don't see a regression here. I could verify that the literal changes of the code in this PR do what they aim to do - filter addresses when storing/fetching/sending multiaddresses to/from the peerstore or other peers. However, that's only implicitly related to increased pressure on the smart dialler and I wouldn't call it a regression test. Terminology aside, I'm happy to add these tests! However, this area of the code will and has changed in the v2 codebase. I've already ported the handlers to the One genuine note to the end, I really appreciate that @Jorropo you sharpen my eye for the implications of seemingly minor changes, as in #870. |
@dennis-tra I don't understand what you mean, this is not reverting your change (#870), but applying address filters, right now your change cache private and public ips on the wan dht of kubo, instead only public ips will be cached. |
The impact of the original change here is expected to be significant, i.e., avoiding a second DHT lookup (that's our hope and we've got all the measurements in place to monitor the result). So, I'd say let's add these tests (I agree they're not regression tests as this PR here doesn't fix a regression, but it's an optimisation on top of #870. I'd call them tests on the optimisation proposed by @Jorropo) and make sure this goes into the next libp2p release. |
Thanks all. I'm not close to the details and not going to dive in more unless needed. I just want to make sure that we hold the line of making sure the any new code we write or touch has the proper test coverage. Basically, we aren't going to go on a campaign to fix the sins of the past, but I want to make sure our present and future selves does the right thing, and when we get tripped up by decisions of the past that we bring them up to our desired standard. As long as we're following that principle, I'm good. And if we're not following it, I want to make sure we're upfront and intentional about it (where this can mean something like "not adding a backfill test because this code will be replaced in one month and the proper assertions have been added to the new code in commit X"). |
to catch timeouts
added two tests where testing infrastructure was already present. I think the serving part is missing. When we hand out addresses we should also filter them. When the peer connects to us to store provider records we'll store its addresses (private + public) in the peerstore because of the identify exchange. This means in the first 30min (the ttl for recently connected peers) we serve both. The same applies when we're connected to the peer. |
Thank you @dennis-tra for the tests! All calls to add peers to the peerstore go through Lines 923 to 936 in 42d3453
The only call that isn't filtered is the following: go-libp2p-kad-dht/providers/providers_manager.go Lines 249 to 268 in 42d3453
Edit: The only caller to Lines 362 to 365 in 42d3453
The other callers ( |
@Jorropo can you have a last look before we merge this? |
Thx a lot for doing this, LGTM I havn't reviewed the tests in detail, just the non test code. ❤️ 🙏 |
added tests for addrfilter fix: don't add unresponsive DHT servers to the Routing Table (libp2p#820) * added check to avoid adding unresponsive dht peers to the dht routing table * removed lock in adding peers to the rt * made variable names more meaningful * fixed network loop and corrected tests * added UsefulPeer() references from current PR * go mod tidy * added delay in TestRefreshBelowMinRTThreshold * addressed review * go mod tidy * addressed Jorropo review * added comments * removed state of peers probed recently * fix conflicts merge * updated deps * added optimizations documentation * Update dht.go * updated md files --------- Co-authored-by: Jorropo <[email protected]> fix: don't add unresponsive DHT servers to the Routing Table (libp2p#820) * added check to avoid adding unresponsive dht peers to the dht routing table * removed lock in adding peers to the rt * made variable names more meaningful * fixed network loop and corrected tests * added UsefulPeer() references from current PR * go mod tidy * added delay in TestRefreshBelowMinRTThreshold * addressed review * go mod tidy * addressed Jorropo review * added comments * removed state of peers probed recently * fix conflicts merge * updated deps * added optimizations documentation * Update dht.go * updated md files --------- Co-authored-by: Jorropo <[email protected]> (cherry picked from commit 8c9fdff) chore: release v0.24.0 chore: Update .github/workflows/stale.yml [skip ci] fix: leaking go routines fix: bump kbucket for abba bug refactor: remove goprocess fix: decrease tests noise, update kbucket and fix fixRTIUfNeeded chore: release v0.24.1 chore: update go-libp2p to a non deadlocky version chore: release v0.24.2 chore: Update .github/workflows/stale.yml [skip ci] tracing: fix DHT keys as string attribute not being valid utf-8 chore: release v0.24.3 fix the server mode log in handleNewMessage by changing it to debug chore: delete templates [skip ci] (libp2p#861) ci: uci/copy-templates (libp2p#862) * chore: add or force update .github/workflows/go-test.yml * chore: add or force update .github/workflows/go-check.yml * chore: add or force update .github/workflows/releaser.yml * chore: add or force update .github/workflows/release-check.yml * chore: add or force update .github/workflows/tagpush.yml chore: bump go.mod to Go 1.20 and run go fix chore: bump go-libp2p and/or quic-go to latest version tracing: add protocol messages client tracing chore: release v0.25.0 add provider record addresses to peerstore fixes issue libp2p#868 fixing base32 import fix: apply addrFilters in the dht (libp2p#872) * fix: correctly apply addrFilters in the dht This still does not do the fullrt client but it wasn't doing it before either. * Add address filter tests * use channel instead of waitgroup to catch timeouts * filter multiaddresses when serving provider records --------- Co-authored-by: Dennis Trautwein <[email protected]> fix: properly iterate in tracing for protocol messenger MB Copy Paste Typo, This caused panics while tracing. chore: use go-libp2p-routing-helpers for tracing needs go-libp2p-routing-helpers has an optimized implementation that does nothing if we are not tracing, it also properly log all IO of the request. perf: don't buffer the output of FindProvidersAsync We always select it against ctx and we don't rely on the fact it will always be non blocking since when count is zero we can't realistically preallocate. Buffering use more memory makes more garbage and is less efficient than direct copies. chore: release v0.25.1 add ctx canceled err check return canceled err not retry reset stream and return err when ctx canceled release v0.25.2 chore: Update .github/workflows/stale.yml [skip ci] chore: bump go.mod to Go 1.21 and run go fix chore: run go mod tidy chore: add or force update .github/workflows/go-test.yml chore: add or force update .github/workflows/go-check.yml chore: add or force update .github/workflows/releaser.yml chore: add or force update .github/workflows/release-check.yml chore: add or force update .github/workflows/tagpush.yml chore: add or force update .github/workflows/go-test.yml chore: add or force update .github/workflows/go-check.yml chore: add or force update .github/workflows/releaser.yml chore: add or force update .github/workflows/release-check.yml chore: add or force update .github/workflows/tagpush.yml chore: add or force update .github/workflows/go-test.yml chore: add or force update .github/workflows/go-check.yml chore: add or force update .github/workflows/releaser.yml chore: add or force update .github/workflows/release-check.yml chore: add or force update .github/workflows/tagpush.yml findnode(self) now returns multiple peers don't lookup check if not enough peers in table enhanced tests moved rt size check after request Upgrade to go-log v2.5.1
This still does not do the fullrt client but it wasn't doing it before either.
Fixes smart dialler regression from #870.