Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(swarm)!: allow NetworkBehaviours to manage connections #3254

Merged
merged 199 commits into from
Feb 23, 2023

Conversation

thomaseizinger
Copy link
Contributor

@thomaseizinger thomaseizinger commented Dec 16, 2022

Description

Previously, a ConnectionHandler was immediately requested from the NetworkBehaviour as soon as a new dial was initiated or a new incoming connection accepted.

With this patch, we delay the creation of the handler until the connection is actually established and fully upgraded, i.e authenticated and multiplexed.

As a consequence, NetworkBehaviour::new_handler is now deprecated in favor of a new set of callbacks:

  • NetworkBehaviour::handle_pending_inbound_connection
  • NetworkBehaviour::handle_pending_outbound_connection
  • NetworkBehaviour::handle_established_inbound_connection
  • NetworkBehaviour::handle_established_outbound_connection

All callbacks are fallible, allowing the NetworkBehaviour to abort the connection either immediately or after it is fully established. All callbacks also receive a ConnectionId parameter which uniquely identifies the connection. For example, in case a NetworkBehaviour issues a dial via NetworkBehaviourAction::Dial, it can unambiguously detect this dial in these lifecycle callbacks via the ConnectionId.

Finally, NetworkBehaviour::handle_pending_outbound_connection also replaces NetworkBehaviour::addresses_of_peer by allowing the behaviour to return more addresses to be used for the dial.

Resolves #2824.

Notes

This is a variant of #3099 with less breaking changes and more deprecations.

Links to any relevant issues

Open Questions

  • The ConnectionId is currently generated through a random number generator. This is not ideal and could in theory lead to collisions:
    • Should we use a UUID instead? I think those are less likely to produce a collision.
    • Should we manage the connection IDs centrally, i.e. in the Swarm? This would require us to provide a facility to the NetworkBehaviour in how it can generate a new one to ensure it can track state for its own dial. We could use PollParameters for that. One downside of this is a slight ergonomic hit because the user can only fully construct a NetworkBehaviourAction::Dial within NetworkBehaviour::poll and not within the other callbacks. Some of our protocols already do that via an ActionBuilder pattern.

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • A changelog entry has been made in the appropriate crates

@thomaseizinger
Copy link
Contributor Author

* The `ConnectionId` is currently generated through a random number generator. This is not ideal and could in theory lead to collisions:
  
  * Should we use a UUID instead? I think those are less likely to produce a collision.
  * Should we manage the connection IDs centrally, i.e. in the `Swarm`? This would require us to provide a facility to the `NetworkBehaviour` in how it can generate a new one to ensure it can track state for its own dial. We could use `PollParameters` for that. One downside of this is a slight ergonomic hit because the user can only fully construct a `NetworkBehaviourAction::Dial` within `NetworkBehaviour::poll` and not within the other callbacks. Some of our protocols already do that via an `ActionBuilder` pattern. The interface could look something like:
pub trait PollParameters {
    fn new_dial(&mut self, opts: DialOpts) -> NetworkBehaviourAction<X, Y>;
}

Use a static Atomic counter instead. That gives us the best of both worlds!

@thomaseizinger
Copy link
Contributor Author

Before we merge this, are we happy with the handle_ naming of the new callbacks? It is going to be a pain to rename these later so I want to make sure we are all happy with the naming.

@mxinden
Copy link
Member

mxinden commented Feb 22, 2023

Before we merge this, are we happy with the handle_ naming of the new callbacks? It is going to be a pain to rename these later so I want to make sure we are all happy with the naming.

I am in favor of handle_. It matches with the return of a ConnectionHandler. By not using on_ it is contrasted to the pure event handlers that in contrast don't return anything.

@thomaseizinger thomaseizinger added this to the v0.51.0 milestone Feb 22, 2023
Copy link
Member

@mxinden mxinden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ready for merge from my end. Please add the send-it label once you want this merged @thomaseizinger. Including it in v0.51.0 sounds good to me.

@mergify
Copy link
Contributor

mergify bot commented Feb 23, 2023

This pull request has merge conflicts. Could you please resolve them @thomaseizinger? 🙏

@mergify mergify bot merged commit 19a5549 into master Feb 23, 2023
@mergify mergify bot deleted the 2824-deprecate-into-connection-handler branch February 23, 2023 23:43
@jxs
Copy link
Member

jxs commented Feb 25, 2023

Sorry I missed this.

Before we merge this, are we happy with the handle_ naming of the new callbacks? It is going to be a pain to rename these later so I want to make sure we are all happy with the naming.

Why not keep the on_* naming and have something like on_upgrade_event with Enum variants as the other on_* methods?

@thomaseizinger
Copy link
Contributor Author

Before we merge this, are we happy with the handle_ naming of the new callbacks? It is going to be a pain to rename these later so I want to make sure we are all happy with the naming.

Why not keep the on_* naming and have something like on_upgrade_event with Enum variants as the other on_* methods?

In general, events typically don't have return values, they just state what happened. Event handlers are thus often named with on_. I'd find it weird to have an event handler that actually returns value.

@melekes
Copy link
Contributor

melekes commented Apr 4, 2023

Finally, NetworkBehaviour::handle_pending_outbound_connection also replaces NetworkBehaviour::addresses_of_peer by allowing the behaviour to return more addresses to be used for the dial.

Q: what's the preferred way of getting a list of known addresses for a peer? Before 0.51 we're using addresses_of_peer in our RPC to print a list of addresses (and other info) https://github.com/paritytech/substrate/blob/c69c0a61215724334a1b8a2ec4dfbf82bb4bb6e4/client/network/src/service.rs#L519. Using handle_pending_outbound_connection seems weird in that case since we're not going to open any outbound connections, we just need a list of addresses https://github.com/paritytech/substrate/blob/c69c0a61215724334a1b8a2ec4dfbf82bb4bb6e4/client/network/src/service.rs#L529.

@thomaseizinger
Copy link
Contributor Author

Finally, NetworkBehaviour::handle_pending_outbound_connection also replaces NetworkBehaviour::addresses_of_peer by allowing the behaviour to return more addresses to be used for the dial.

Q: what's the preferred way of getting a list of known addresses for a peer? Before 0.51 we're using addresses_of_peer in our RPC to print a list of addresses (and other info) paritytech/substrate@c69c0a6/client/network/src/service.rs#L519. Using handle_pending_outbound_connection seems weird in that case since we're not going to open any outbound connections, we just need a list of addresses paritytech/substrate@c69c0a6/client/network/src/service.rs#L529.

See also #3633.

I consider all functions of NetworkBehaviour to be "internal" in the sense that they should not be called by anyone other than composing NetworkBehaviour implementations or Swarm.

At the moment, there is no replacement for your usecase. Which behaviours do you need to gather addresses from? I am open to add public functions to e.g. mdns or other "discovery" behaviours that expose this information. I don't think we should add a direct replacement function to NetworkBehaviour. Once the above functions are exposed, you should be able to create your own abstraction if you desire one.

@melekes
Copy link
Contributor

melekes commented Apr 6, 2023

I consider all functions of NetworkBehaviour to be "internal" in the sense that they should not be called by anyone other than composing NetworkBehaviour implementations or Swarm.

Makes sense.

Which behaviours do you need to gather addresses from? I am open to add public functions to e.g. mdns or other "discovery" behaviours that expose this information.

mdns, kad.

Once the above functions are exposed, you should be able to create your own abstraction if you desire one.

Sounds great 👍

@thomaseizinger
Copy link
Contributor Author

Which behaviours do you need to gather addresses from? I am open to add public functions to e.g. mdns or other "discovery" behaviours that expose this information.

mdns, kad.

Are you open to send a PR for mdns? I think kad already exposes the necessary functionality, see linked ticket above.

@melekes
Copy link
Contributor

melekes commented Apr 6, 2023

Are you open to send a PR for mdns?

sure

umgefahren pushed a commit to umgefahren/rust-libp2p that referenced this pull request Mar 8, 2024
libp2p#3328)

We create the `ConnectionId` for the new connection as part of `DialOpts`. This allows `NetworkBehaviour`s to accurately track state regarding their own dial attempts.

This patch is the main enabler of libp2p#3254. Removing the `handler` field will allow us to deprecate the `NetworkBehaviour::new_handler` function in favor of four new ones that give more control over the connection lifecycle.
umgefahren pushed a commit to umgefahren/rust-libp2p that referenced this pull request Mar 8, 2024
Previously, a `ConnectionHandler` was immediately requested from the `NetworkBehaviour` as soon as a new dial was initiated or a new incoming connection accepted.

With this patch, we delay the creation of the handler until the connection is actually established and fully upgraded, i.e authenticated and multiplexed.

As a consequence, `NetworkBehaviour::new_handler` is now deprecated in favor of a new set of callbacks:

- `NetworkBehaviour::handle_pending_inbound_connection`
- `NetworkBehaviour::handle_pending_outbound_connection`
- `NetworkBehaviour::handle_established_inbound_connection`
- `NetworkBehaviour::handle_established_outbound_connection`

All callbacks are fallible, allowing the `NetworkBehaviour` to abort the connection either immediately or after it is fully established. All callbacks also receive a `ConnectionId` parameter which uniquely identifies the connection. For example, in case a `NetworkBehaviour` issues a dial via `NetworkBehaviourAction::Dial`, it can unambiguously detect this dial in these lifecycle callbacks via the `ConnectionId`.

Finally, `NetworkBehaviour::handle_pending_outbound_connection` also replaces `NetworkBehaviour::addresses_of_peer` by allowing the behaviour to return more addresses to be used for the dial.

Resolves libp2p#2824.

Pull-Request: libp2p#3254.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

swarm/: Support generic connection management
6 participants