
swarm: Add FromFn ConnectionHandler #2852

Closed
wants to merge 37 commits

Conversation

@thomaseizinger (Contributor) commented Aug 28, 2022

Description

The from_fn handler is a building block for describing protocols that have little or no connection-specific state. It encourages you to implement your protocol as a function of data without suspending the IO stream. This allows you to use regular async functions to express whatever message handshake you want.

To allow sharing of data between connections, from_fn uses an abstraction called Shared. It implements the observer pattern and pushes new versions of this data to all connections every time it gets updated. Each new stream (inbound or outbound) gets a copy of this state when created.

The stream handlers can either complete with a single event or emit several events as they progress. Thus, this abstraction should also help with the much-requested "streaming response" protocol.

With from_fn, you still need to implement your own NetworkBehaviour. This is considered to be relatively easy though. Not having to implement a ConnectionHandler is a major step towards better usability.
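The `Shared` mechanism described above can be illustrated with a minimal std-only sketch: each connection subscribes to the shared state, receives a copy on creation, and is pushed a clone on every update. The names and channel-based plumbing here are illustrative, not the actual implementation in this PR.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

// Illustrative stand-in for the `Shared` observer described above.
struct Shared<S: Clone> {
    state: S,
    subscribers: Vec<Sender<S>>,
}

impl<S: Clone> Shared<S> {
    fn new(state: S) -> Self {
        Shared { state, subscribers: Vec::new() }
    }

    // A new stream gets a copy of the current state plus a receiver
    // for future updates.
    fn subscribe(&mut self) -> (S, Receiver<S>) {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        (self.state.clone(), rx)
    }

    // Updating the state pushes a clone to every subscribed connection;
    // disconnected subscribers are dropped.
    fn update(&mut self, state: S) {
        self.state = state;
        let snapshot = self.state.clone();
        self.subscribers
            .retain(|tx| tx.send(snapshot.clone()).is_ok());
    }
}

fn main() {
    let mut shared = Shared::new(0u32);
    let (initial, rx) = shared.subscribe();
    assert_eq!(initial, 0);
    shared.update(1);
    assert_eq!(rx.recv().unwrap(), 1);
}
```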

Links to any relevant issues

Open Questions

  • I think we need a better name. Any suggestions?
    • async_fn_handler?
  • Should AutoNat also be converted?
  • Should identify also be converted?
  • Should Kademlia be converted? As long as we can fit the DHT into memory, this would be possible.
  • What do we do with libp2p-request-response?

Open tasks

  • Propose ReadyUpgrade as separate PR: core/upgrade/: Add ReadyUpgrade #2855
  • Add a utility for notifying all handlers
  • Implement limit for max incoming substreams (const generic vs regular number?)
  • Implement limit for max pending dials?
  • Fix TODOs in code
  • Experiment with this handler in other protocols (ping? rendezvous? request-response?)

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • A changelog entry has been made in the appropriate crates

@thomaseizinger (Contributor Author)

One problem we currently have is that with NotifyHandler on NetworkBehaviourAction, we cannot notify all connections, which makes it cumbersome to use the InEvent::UpdateState variant.

@mxinden @elenaf9 What do you think of adding NotifyHandler::All?
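For illustration, such a variant could look like the following sketch; `ConnectionId` is stubbed and this is not the actual libp2p-swarm definition.

```rust
// Hypothetical stand-in for libp2p's connection identifier.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ConnectionId(usize);

// Sketch of a `NotifyHandler` with the proposed `All` variant.
enum NotifyHandler {
    /// Notify a particular connection handler.
    One(ConnectionId),
    /// Notify an arbitrary connection handler.
    Any,
    /// Notify the handlers of all connections to the peer.
    All,
}

// Resolve which of the open connections should receive the event.
fn target_connections(n: NotifyHandler, open: &[ConnectionId]) -> Vec<ConnectionId> {
    match n {
        NotifyHandler::One(id) => open.iter().copied().filter(|c| *c == id).collect(),
        NotifyHandler::Any => open.first().copied().into_iter().collect(),
        NotifyHandler::All => open.to_vec(),
    }
}

fn main() {
    let open = [ConnectionId(1), ConnectionId(2)];
    assert_eq!(target_connections(NotifyHandler::All, &open).len(), 2);
    assert_eq!(target_connections(NotifyHandler::Any, &open).len(), 1);
}
```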

@thomaseizinger (Contributor Author) commented Aug 30, 2022

I looked at using this implementation for the ping protocol and I think it would be doable apart from one thing: overriding keep_alive. The current ping ConnectionHandler allows the user to configure whether the connection should be kept alive. We could drop this functionality in favor of the DummyBehaviour (should we rename that one to StaticKeepAliveBehaviour to make it easier to discover?). I think overriding keep_alive is orthogonal to the ping behaviour.

Thoughts?

@mxinden (Member) commented Sep 2, 2022

I think overriding keep_alive is orthogonal to the ping behaviour.

Agreed. See #2778.

@mxinden (Member) left a comment

I have to give this abstraction more thought. I find this abstraction to be very similar to libp2p-request-response. I have to think about whether it makes sense to have this next to libp2p-request-response, have this replace libp2p-request-response or have libp2p-request-response build on top of this.

swarm/src/handler/from_fn.rs (outdated, resolved)
swarm/src/handler/from_fn.rs (outdated, resolved)
@thomaseizinger (Contributor Author)

I have to give this abstraction more thought. I find this abstraction to be very similar to libp2p-request-response. I have to think about whether it makes sense to have this next to libp2p-request-response, have this replace libp2p-request-response or have libp2p-request-response build on top of this.

My preference would be to first see if we can successfully use this abstraction within our monorepo, e.g. have libp2p-request-response build on top of this.

In addition, I'd like to somehow survey people to see how they use it / whether they know about this abstraction. Perhaps more examples can also help with that.

@elenaf9 (Contributor) left a comment

👍 I like the general idea.

I tried using this locally for rendezvous and we would have to use this mechanism to push the registration data from the behaviour down to the connections.

Did you manage to implement the rendezvous or any other protocol completely with this handler?

Is the name any good?

No strong opinion. Maybe FromUpgrade would describe it a bit better?

@mxinden @elenaf9 What do you think of adding NotifyHandler::All?

I don't see a reason not to add it, and updating the state of all handlers sounds like a valid use-case, so I'm in favor of it.

swarm/src/handler/from_fn.rs (resolved)
swarm/src/handler/from_fn.rs (outdated, resolved)
swarm/src/handler/from_fn.rs (outdated, resolved)
@thomaseizinger (Contributor Author)

@mxinden @elenaf9 What do you think of adding NotifyHandler::All?

I don't see a reason not to add it, and updating the state of all handlers sounds like a valid use-case, so I'm in favor of it.

Cool, I'll open a separate PR for it then :)

@thomaseizinger (Contributor Author)

I tried using this locally for rendezvous and we would have to use this mechanism to push the registration data from the behaviour down to the connections.

Did you manage to implement the rendezvous or any other protocol completely with this handler?

I don't have a working example yet:

  • Rendezvous is tricky because we kind of want NotifyHandler::All to continuously push the state down to each handler.
  • Ping is tricky because it currently exposes a "keep connection alive" mechanism that doesn't really have anything to do with ping. I've opened swarm: Split off "keep alive" functionality from DummyConnectionHandler #2859 as a first step of resolving this.
  • I haven't tried using it for libp2p-request-response but that one should be fairly easy I think.

@thomaseizinger changed the title from "Add FromFn ConnectionHandler" to "swarm: Add FromFn ConnectionHandler" on Sep 4, 2022
@mxinden (Member) commented Sep 5, 2022

@mxinden @elenaf9 What do you think of adding NotifyHandler::All?

I don't see a reason not to add it, and updating the state of all handlers sounds like a valid use-case, so I'm in favor of it.

Cool, I'll open a separate PR for it then :)

Note that this would require ConnectionHandler::InEvent to be Clone. See #1880.

@thomaseizinger (Contributor Author)

@mxinden @elenaf9 What do you think of adding NotifyHandler::All?

I don't see a reason not to add it, and updating the state of all handlers sounds like a valid use-case, so I'm in favor of it.

Cool, I'll open a separate PR for it then :)

Note that this would require ConnectionHandler::InEvent to be Clone. See #1880.

Thanks for linking this. I guess we've come full circle. I was thinking that it would require Clone, but I hadn't yet explored what the implications of such a bound are. I was hoping to constrain it somehow to just that one variant, but I think that is probably only possible with unsafe code.

To make this handler really useful, we'd need some kind of helper that sends an action to all connections.
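The reason a `Clone` bound comes up can be shown with a small std-only sketch: delivering one event to every connection means duplicating it per connection. The per-connection mailboxes here are a stand-in for the real swarm machinery.

```rust
// Send `event` to every connection's mailbox. The `Clone` bound is
// unavoidable: each connection needs its own copy of the event.
fn notify_all<E: Clone>(event: E, mailboxes: &mut [Vec<E>]) {
    // Clone for all but the last mailbox; move the original into the
    // last one, so exactly `mailboxes.len()` copies exist in total.
    if let Some((last, rest)) = mailboxes.split_last_mut() {
        for mailbox in rest {
            mailbox.push(event.clone());
        }
        last.push(event);
    }
}

fn main() {
    let mut mailboxes = vec![Vec::new(), Vec::new(), Vec::new()];
    notify_all("update-state".to_string(), &mut mailboxes);
    assert!(mailboxes.iter().all(|m| m.len() == 1));
}
```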

@thomaseizinger (Contributor Author)

Another update on this:

I've introduced a builder pattern to build the initial handler. This makes it much more convenient to configure a handler that, for example, doesn't accept inbound streams or will never open outbound streams. A lot of this is now also enforced in the type system.

I am going to try to fully convert libp2p-rendezvous at some point and see how it works. For any protocol where the substream finishes eventually, I believe this is a great fit.

At the moment, we can't convert libp2p-ping because it reuses the substream for future pings. I might look into using Stream instead of Future as an abstraction to support protocols that keep the substream open for longer.
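The "enforced in the type system" part can be sketched with a type-state builder: setters only become available once the required pieces are in place. `from_fn` and `WantState` appear in this PR; the rest of the names here are made up for illustration.

```rust
use std::marker::PhantomData;

// Type-state markers: the builder starts out wanting state.
struct WantState;
struct HasState;

struct Builder<Stage> {
    protocol: &'static str,
    _stage: PhantomData<Stage>,
}

// Entry point, mirroring the PR's `from_fn(protocol) -> Builder<WantState>`.
fn from_fn(protocol: &'static str) -> Builder<WantState> {
    Builder { protocol, _stage: PhantomData }
}

impl Builder<WantState> {
    // Only after providing shared state does the builder advance to the
    // next stage; calling stage-two methods earlier is a compile error.
    fn with_state(self) -> Builder<HasState> {
        Builder { protocol: self.protocol, _stage: PhantomData }
    }
}

impl Builder<HasState> {
    fn protocol(&self) -> &'static str {
        self.protocol
    }
}

fn main() {
    let builder = from_fn("/my-protocol/1.0.0").with_state();
    assert_eq!(builder.protocol(), "/my-protocol/1.0.0");
}
```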

@thomaseizinger (Contributor Author)

The latest patches introduce the notion of a Stream handler which allows us to implement protocols such as libp2p-ping that reuse their stream and emit events as they go along.

I think this is getting close to fulfilling the requirements of #2657.

Comment on lines 274 to 280
pub fn register_connection(&mut self, peer_id: PeerId, id: ConnectionId) {
self.connections.insert((peer_id, id));
}

pub fn unregister_connection(&mut self, peer_id: PeerId, id: ConnectionId) {
self.connections.remove(&(peer_id, id));
}
thomaseizinger (Contributor Author):

Once #2832 is fixed, we can make this more ergonomic by just requiring a reference to the FromSwarm event and picking out the data we need.

@thomaseizinger (Contributor Author)

Rendezvous is now also fully migrated and functional.

Comment on lines 194 to 247
from_fn::OutEvent::InboundEmitted(Ok(Some(InboundOutEvent::NewRegistration(
new_registration,
)))) => {
let (registration, expiry) = self.registrations.add(new_registration);
self.next_expiry.push(expiry);

vec![NetworkBehaviourAction::GenerateEvent(
Event::PeerRegistered { peer, registration },
)]
}
from_fn::OutEvent::InboundEmitted(Ok(Some(InboundOutEvent::RegistrationFailed {
error,
namespace,
}))) => {
vec![NetworkBehaviourAction::GenerateEvent(
Event::PeerNotRegistered {
peer,
error,
namespace,
},
)]
}
from_fn::OutEvent::InboundEmitted(Ok(Some(InboundOutEvent::Discovered {
cookie,
registrations,
previous_registrations,
}))) => {
self.registrations
.cookies
.insert(cookie, previous_registrations);

vec![NetworkBehaviourAction::GenerateEvent(
Event::DiscoverServed {
enquirer: peer,
registrations,
},
)]
}
from_fn::OutEvent::InboundEmitted(Ok(Some(InboundOutEvent::DiscoverFailed {
error,
}))) => {
vec![NetworkBehaviourAction::GenerateEvent(
Event::DiscoverNotServed {
enquirer: peer,
error,
},
)]
}
from_fn::OutEvent::InboundEmitted(Ok(Some(InboundOutEvent::Unregister(namespace)))) => {
self.registrations.remove(namespace.clone(), peer);

vec![NetworkBehaviourAction::GenerateEvent(
Event::PeerUnregistered { peer, namespace },
)]
thomaseizinger (Contributor Author):

A lot of code here is only present to make things observable from the outside. We don't really need to emit all these events in order for the protocol to be functional.

@thomaseizinger thomaseizinger marked this pull request as ready for review November 16, 2022 13:43
@thomaseizinger (Contributor Author) commented Nov 16, 2022

I think this is ready for a proper review @mxinden @jxs @elenaf9!

I've updated the PR description.

@rkuhn (Contributor) left a comment

Yes, this does indeed look quite promising!

Contributor:

yes, this looks pretty neat!

pub fn with_inbound_handler<TInbound, TInboundStream>(
self,
max_inbound: usize,
handler: impl Fn(NegotiatedSubstream, PeerId, ConnectedPoint, Arc<TState>) -> TInboundStream
Contributor:

Please include the protocol name (also for outbound).

@thomaseizinger (Contributor Author) commented Nov 17, 2022:

At the moment, there is only one protocol possible, so it will always be that one if that function is called.

Contributor:

I know, and I’d like to change that. All prefabricated upgrades today ignore the negotiated protocol name, which to my mind leaves the second most important value provided by libp2p on the table (the first one being the ability to dial a PeerId).

max_inbound: usize,
inbound_handler: Option<
Box<
dyn Fn(
Contributor:

AFAICS this restricts the part of the handler that constructs the stream processor to shared access to its closure — is there a reason for this? If not I’d prefer FnMut.

thomaseizinger (Contributor Author):

No particular reason, I guess this could be useful if you want to track connection-local state across all inbound/outbound streams?

That would only be shared across one closure though, meaning you wouldn't be able to share state between the inbound and outbound factory function.

I think we might be able to change this interface to allow for a single factory function that gets called for both inbound and outbound streams which would allow you to share state. Is that what you would be after?

Contributor:

Sharing state is not my concern because doing that always proceeds via shared references (of which Arc is a special case). Fn does not allow captured values to be modified (more precisely: mutably borrowed), even if they are owned.

@thomaseizinger (Contributor Author) commented Nov 18, 2022:

Fn does not allow captured values to be modified (more precisely: mutably borrowed), even if they are owned.

If you want to mutate a captured value, you are effectively sharing the state within a connection across all stream handler factory invocations.

I think we are saying the same thing and I agree that this capability is useful :)

Contributor:

Ah, you mean “sharing along the timeline”, not “sharing in different places”. Yes, now we both know that we are in agreement :-)
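The `Fn` vs `FnMut` distinction discussed above can be demonstrated in a few lines: an `Fn` closure cannot mutate captured state, while `FnMut` can carry state "along the timeline", i.e. across successive factory invocations. The names here are illustrative.

```rust
// A factory that is called once per new stream; taking `FnMut` lets it
// mutate captured, connection-local state between invocations.
fn open_streams(mut factory: impl FnMut() -> u32, n: usize) -> Vec<u32> {
    (0..n).map(|_| factory()).collect()
}

fn main() {
    let mut opened = 0u32;
    // This closure mutably borrows `opened`, so it implements FnMut but
    // not Fn; with an `impl Fn() -> u32` bound it would not compile.
    let ids = open_streams(
        || {
            opened += 1;
            opened
        },
        3,
    );
    assert_eq!(ids, vec![1, 2, 3]);
    assert_eq!(opened, 3);
}
```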

/// state changes.
///
/// [`NetworkBehaviour`]: crate::NetworkBehaviour
pub fn from_fn(protocol: &'static str) -> Builder<WantState> {
Contributor:

please allow multiple protocols, like

pub fn from_fn<I>(protocols: I) -> Builder<I, WantState>
where
    I: IntoIterator + Clone + Send + 'static,
    I::Item: ProtocolName,
{ ... }

thomaseizinger (Contributor Author):

Yeah that is doable!

{
if self.keep_alive.is_yes() {
// TODO: Make timeout configurable
self.keep_alive = KeepAlive::Until(Instant::now() + Duration::from_secs(10))
Contributor:

It feels weird that this grace period is implemented on the level of ConnectionHandler — shouldn’t it be a NetworkBehaviour setting that applies whenever KeepAlive toggles from Yes to No?

thomaseizinger (Contributor Author):

I am not happy with this either but I couldn't think of a better way to handle this. keep_alive is per connection. My initial implementation would have been a with_keep_alive_timeout configuration parameter on the builder that defaults to 30s or something.
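The grace-period logic under discussion can be sketched std-only: when keep-alive would toggle from `Yes` to `No`, the handler instead parks the connection until a deadline. The 10s value mirrors the snippet above; a hypothetical `with_keep_alive_timeout` builder parameter would feed in the same duration.

```rust
use std::time::{Duration, Instant};

// Simplified stand-in for libp2p-swarm's KeepAlive.
#[derive(Debug, PartialEq)]
enum KeepAlive {
    Yes,
    Until(Instant),
    No,
}

// Called when the handler becomes idle: instead of dropping straight to
// `No`, grant a grace period before the connection may be closed.
fn on_idle(keep_alive: &mut KeepAlive, grace: Duration) {
    if matches!(keep_alive, KeepAlive::Yes) {
        *keep_alive = KeepAlive::Until(Instant::now() + grace);
    }
}

fn main() {
    let mut keep_alive = KeepAlive::Yes;
    on_idle(&mut keep_alive, Duration::from_secs(10));
    assert!(matches!(keep_alive, KeepAlive::Until(_)));
    // An already-closed handler is left untouched.
    let mut closed = KeepAlive::No;
    on_idle(&mut closed, Duration::from_secs(10));
    assert_eq!(closed, KeepAlive::No);
}
```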

PeerId,
ConnectedPoint,
Arc<TState>,
) -> BoxStream<'static, TInbound>
Contributor:

I’d prefer type aliases for these function types

thomaseizinger (Contributor Author):

What do you mean exactly? As a user, you shouldn't need to come into contact with this BoxStream per se. Where would you like to see a type alias?

Contributor:

I only mean the readability of this source module by itself.
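The type-alias suggestion amounts to naming the boxed factory type once so the struct definition stays readable. In this sketch `PeerId` and the stream type are stubbed; the real code uses `BoxStream` and more parameters.

```rust
use std::sync::Arc;

// Stub for libp2p's PeerId.
struct PeerId([u8; 4]);

// Naming the boxed function type once keeps `Handler` readable.
type InboundHandler<TState, TStream> =
    Box<dyn Fn(PeerId, Arc<TState>) -> TStream + Send + 'static>;

struct Handler<TState, TStream> {
    inbound_handler: Option<InboundHandler<TState, TStream>>,
}

fn main() {
    let handler: Handler<u32, u64> = Handler {
        inbound_handler: Some(Box::new(|_peer, state: Arc<u32>| u64::from(*state))),
    };
    let out = (handler.inbound_handler.unwrap())(PeerId([0; 4]), Arc::new(7));
    assert_eq!(out, 7);
}
```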

.connections
.iter()
.map(|(peer_id, conn_id)| (*peer_id, *conn_id, self.shared.clone()))
.collect();
Contributor:

This erases previously undelivered wake-ups — which is fine unless it happens in a tight loop.

thomaseizinger (Contributor Author):

The old events are useless though because they contain stale state. I am not sure if there is a better way?

Contributor:

Right, old events are useless. Perhaps I am overthinking this, my worry is that there may be a situation where only the first (few) connections are ever notified because updates keep coming in — a fairness concern. This would be solved by randomising the notification order, or by always finishing one batch of notifications (with values updated during the process!) before starting the next. The second is deterministic but more work to implement.

Getting these things wrong usually has low chance of severe impact, but when the impact happens it is really difficult for downstream code to find the bug.

thomaseizinger (Contributor Author):

I think we currently drain events from the NetworkBehaviour before we inject new ones into it, so I don't think this would happen, but I'd have to double-check that.
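The "finish one batch before starting the next" option from the fairness discussion above can be sketched std-only: a new update replaces the state that will be delivered, but does not reset the queue of connections still waiting, so every connection is eventually notified. Names and types here are illustrative.

```rust
use std::collections::VecDeque;

// Sketch of batched, fair notification delivery.
struct Notifier {
    state: u32,
    pending: VecDeque<usize>, // connection ids still owed a notification
}

impl Notifier {
    fn update(&mut self, state: u32, all_connections: &[usize]) {
        self.state = state;
        // Only start a new batch once the previous one has drained;
        // an in-flight batch keeps its order but will deliver the
        // newest state.
        if self.pending.is_empty() {
            self.pending.extend(all_connections.iter().copied());
        }
    }

    // Deliver one notification per poll; returns (connection, state).
    fn poll_notify(&mut self) -> Option<(usize, u32)> {
        self.pending.pop_front().map(|conn| (conn, self.state))
    }
}

fn main() {
    let conns = [1, 2, 3];
    let mut n = Notifier { state: 0, pending: VecDeque::new() };
    n.update(1, &conns);
    assert_eq!(n.poll_notify(), Some((1, 1)));
    // A new update arrives mid-batch: the remaining connections are
    // still notified (no starvation), with the newest state.
    n.update(2, &conns);
    assert_eq!(n.poll_notify(), Some((2, 2)));
    assert_eq!(n.poll_notify(), Some((3, 2)));
    assert_eq!(n.poll_notify(), None);
}
```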

mergify bot commented Nov 17, 2022

This pull request has merge conflicts. Could you please resolve them @thomaseizinger? 🙏

mergify bot commented Dec 14, 2022

This pull request has merge conflicts. Could you please resolve them @thomaseizinger? 🙏

@thomaseizinger (Contributor Author)

I think we can close this as won't-merge. It was an interesting exploration though!

@thomaseizinger thomaseizinger deleted the from-fn-connection-handler branch February 24, 2023 14:46