refactor(iroh-net): Optimise present nodes in ActiveRelay #2781
These are two cleanups in the relay client:

- The `relay::Client` hands out a connection object when asked to connect. This `Conn` was imported with rename to `RelayClient` which was a bit confusing as this was already the relay client. It is now renamed to `RelayConn` which makes a lot more sense. The related builder struct etc are renamed to match.
- The `relay::Client` had a counter for the number of connections made to the relay. That seems fun, but was entirely unused. If this is a useful thing to have it should probably be a counter metric instead but let's not add anything that no one is using. Removing this makes a lot of APIs a bit simpler and removes some state tracking.
Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/2781/docs/iroh/
Last updated: 2024-10-03T17:26:38Z
fd3545c to 8c594f5
I checked the original code and while the two sets were not exactly the same, I see no reason why this should work differently to the previous one. From my perspective this lgtm!
The ActiveRelay actor keeps track of which remote nodes are present on the relay connection so that we can optimise relay connections to remote nodes. This does two main optimisations:

- Two sets of these nodes were kept; these could easily be unified.
- The set is best stored in a BTreeSet since it only holds simple NodeIds.
- Bonus: rename peer to node to match our naming convention.
- Bonus: identify nodes by NodeId since this is a routing key here.
Co-authored-by: Divma <[email protected]>
8c594f5 to 275b399
## Description

When the connection to the relay server fails, the read channel returns a read error. At this point the ActiveRelay actor passively waits until it is asked to send something again before it re-establishes a connection. However, if the local node has no reason to send anything to the relay server, the connection is never re-established. This is problematic when the relay has remote nodes trying to send to this node. It is doubly problematic when the connection is to the home relay: the node just sits there thinking everything is healthy and quiet, but no traffic is reaching it.

In a node with active traffic this doesn't really show up, since a send is triggered quickly for an active connection and the connection with the relay server is re-established.

The start of the ActiveRelay run loop is the right place for this. A read error triggers the loop to go round, which already logs the read error and then re-establishes the connection.

This does not keep the relay connection open forever. The mechanism that cleans up unused connections to relay servers still functions correctly, since it only takes into account the time something was last sent to a relay server. As long as a connection with a remote node exists there will be a DISCO ping between the two nodes over the relay path, so the connection is correctly kept alive. The home relay is exempted from the relay connection cleanup, so it is also kept connected, leaving this node available to be contacted via the relay server, which is the entire point of this bugfix.

The relay_client.is_connected() call sends a message to the relay Client actor, and relay_client.connect() does that again. Taking the shortcut of only calling .connect() is not better, however, because the logging becomes messier. In the common case there is one round-trip message to the relay Client actor, and the shortcut would not improve on that. The case where a reconnect is needed, which costs two messages, does not occur commonly.
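The reconnect-at-loop-start behaviour described above can be illustrated with a minimal, synchronous sketch. It is not the actual iroh-net code: the real ActiveRelay is an async actor that talks to the relay Client actor via messages. The `RelayLink` trait, `run_loop` function and `ScriptedLink` test double are hypothetical names introduced only for this illustration; only `is_connected()` and `connect()` echo the calls named in the description.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for the relay client handle (not the iroh-net API).
trait RelayLink {
    fn is_connected(&self) -> bool;
    fn connect(&mut self) -> Result<(), String>;
    /// Ok(Some(frame)) for data, Ok(None) when the link is done, Err on a read error.
    fn recv(&mut self) -> Result<Option<Vec<u8>>, String>;
}

/// Runs a bounded loop and returns (reconnects, delivered frames).
fn run_loop<L: RelayLink>(link: &mut L, max_iterations: usize) -> (usize, usize) {
    let mut reconnects = 0;
    let mut delivered = 0;
    for _ in 0..max_iterations {
        // Start of the loop: reconnect eagerly instead of waiting for a send.
        if !link.is_connected() {
            match link.connect() {
                Ok(()) => reconnects += 1,
                Err(_) => continue, // try again on the next iteration
            }
        }
        match link.recv() {
            Ok(Some(_frame)) => delivered += 1,
            Ok(None) => break,
            // A read error only sends the loop round; the check above reconnects.
            Err(_read_error) => {}
        }
    }
    (reconnects, delivered)
}

/// Test double that replays a fixed script of read results.
struct ScriptedLink {
    connected: bool,
    script: VecDeque<Result<Option<Vec<u8>>, String>>,
}

impl RelayLink for ScriptedLink {
    fn is_connected(&self) -> bool {
        self.connected
    }
    fn connect(&mut self) -> Result<(), String> {
        self.connected = true;
        Ok(())
    }
    fn recv(&mut self) -> Result<Option<Vec<u8>>, String> {
        let next = self.script.pop_front().unwrap_or(Ok(None));
        if next.is_err() {
            // A read error means the connection is gone.
            self.connected = false;
        }
        next
    }
}
```

In this sketch no send is ever issued, yet a link that fails mid-stream is reconnected on the next pass of the loop, which is the property the description is after.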
## Breaking Changes

None

## Notes & open questions

Fixes fishfolk/bones#428

It is rather difficult to test though.

This targets #2781 as base.

## Change checklist

- [x] Self-review.
- ~~[ ] Documentation updates following the [style guide](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#appendix-a-full-conventions-text), if relevant.~~
- ~~[ ] Tests if relevant.~~
- ~~[ ] All breaking changes documented.~~
Description
The ActiveRelay actor keeps track of which remote nodes are present on the relay connection so that we can optimise relay connections to remote nodes. This does two main optimisations:
- Two sets of these nodes were kept; these could easily be unified.
- The set is best stored in a BTreeSet since it only holds simple NodeIds.
- Bonus: rename peer to node to match our naming convention.
- Bonus: identify nodes by NodeId since this is a routing key here.
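The unified bookkeeping described above could look roughly like the following sketch. It assumes a simplified `NodeId` (a plain 32-byte key) and a hypothetical `PresentNodes` type with made-up method names; the actual fields and methods on the ActiveRelay actor may differ.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for iroh's NodeId: a plain 32-byte key.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct NodeId([u8; 32]);

/// Hypothetical slice of the actor state: a single set instead of two.
#[derive(Debug, Default)]
struct PresentNodes {
    nodes: BTreeSet<NodeId>,
}

impl PresentNodes {
    /// Records that a node was seen on this relay connection.
    /// Returns true if the node was not already known.
    fn note_present(&mut self, node: NodeId) -> bool {
        self.nodes.insert(node)
    }

    /// Records that a node left the relay. Returns true if it was known.
    fn note_gone(&mut self, node: &NodeId) -> bool {
        self.nodes.remove(node)
    }

    /// Whether the node is currently believed to be reachable via this relay.
    fn is_present(&self, node: &NodeId) -> bool {
        self.nodes.contains(node)
    }
}
```

Keying the set on the node identifier directly means membership checks and removals need no lookup through a second structure.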
Breaking Changes
Still none if all is well.
Notes & open questions
This targets #2779 as base.
Change checklist
- [ ] Tests if relevant.
- [ ] All breaking changes documented.