The building blocks of `rust-libp2p` #3603

thomaseizinger · 2023-03-13T09:16:53Z

thomaseizinger
Mar 13, 2023
Collaborator

In this post, I want to summarize and outline a vision I have been forming around what the internal design of rust-libp2p should be. It is based on many experience reports and on working with the library for the past couple of years.

Initially, it was motivated by coming up with a design for what I call "custom protocols" but I soon realized that what I actually want to document are the layers and components that I see eventually composing into the various features we have today.

To begin with, we need to look at the constraints of what we are working with:

- A Swarm has a set of behaviours. Those can be seen as vertical slices of functionality. Vertical because they don't "see" each other but can operate across the entire stack, i.e. take input from the user and manage individual connections.
- Each NetworkBehaviour operates as a single instance. Each connection is given a ConnectionHandler. Those again are vertically composed. Each NetworkBehaviour produces one ConnectionHandler for each connection. A multiplexing component allows those ConnectionHandlers to operate across the same physical connection.
- Each of those connections runs on a separate task within a runtime.
- NetworkBehaviours and ConnectionHandlers communicate with each other via message passing.
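
To make these constraints concrete, here is a minimal sketch of the shape they describe. It uses only the standard library: threads stand in for runtime tasks and `mpsc` channels for the message passing. The names `Behaviour`, `ToHandler` and `FromHandler` are made up for illustration and are not the rust-libp2p API.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Hypothetical command passed down from the behaviour to one handler.
enum ToHandler {
    Send(String),
    Close,
}

/// Hypothetical event passed up from a handler to the behaviour.
#[derive(Debug, PartialEq)]
struct FromHandler {
    connection: u32,
    payload: String,
}

/// Long-lived component: a single instance that tracks every connection.
struct Behaviour {
    handlers: HashMap<u32, Sender<ToHandler>>,
    events_tx: Sender<FromHandler>,
    events_rx: Receiver<FromHandler>,
}

impl Behaviour {
    fn new() -> Self {
        let (events_tx, events_rx) = channel();
        Behaviour { handlers: HashMap::new(), events_tx, events_rx }
    }

    /// One handler per connection, each running on its own task (a thread here).
    fn on_connection_established(&mut self, connection: u32) {
        let (tx, rx) = channel::<ToHandler>();
        let up = self.events_tx.clone();
        thread::spawn(move || {
            // The handler only ever sees commands for its own connection.
            for command in rx {
                match command {
                    ToHandler::Send(msg) => {
                        // Stand-in for real IO: echo the message back up.
                        let event = FromHandler { connection, payload: format!("echo: {}", msg) };
                        if up.send(event).is_err() {
                            break;
                        }
                    }
                    ToHandler::Close => break,
                }
            }
        });
        self.handlers.insert(connection, tx);
    }

    /// Returns `false` if the connection is unknown or its handler is gone.
    fn send(&self, connection: u32, msg: &str) -> bool {
        match self.handlers.get(&connection) {
            Some(tx) => tx.send(ToHandler::Send(msg.to_owned())).is_ok(),
            None => false,
        }
    }

    fn close(&mut self, connection: u32) {
        if let Some(tx) = self.handlers.remove(&connection) {
            let _ = tx.send(ToHandler::Close);
        }
    }

    /// Blocks until some handler reports an event.
    fn next_event(&self) -> Option<FromHandler> {
        self.events_rx.recv().ok()
    }
}
```

Even in this toy version, a simple "send and get a reply" crosses two channels and a task boundary, which is the boilerplate discussed further down.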

These abstractions are guided by the physical underpinnings of our domain: networking. In my opinion, it would be a mistake to try and hide some of these details. For example, spawning a new ConnectionHandler per connection makes it easy for users to incorporate the lifecycle of a connection into their protocol. There are some simplifications to be made to these interfaces, for example:

However, none of these question the base abstraction of splitting protocols into a long-lived component (NetworkBehaviour) and one per physical connection (ConnectionHandler). The following vision is based on the idea that these abstractions are correct.

Not all protocols need everything provided by these abstractions. That is where our users perceive a lot of boilerplate. If all I need is to send a request and receive a response, I don't really care which connection it is sent over. Having to pass a message down across two layers and the response back up is unnecessary. We have libp2p-request-response but that implements a very specific use case. If what you are doing doesn't fit this use case, you are left with nothing again.

What we need is a set of components that gradually compose to the functionality we want, minimising boilerplate at each step of the way. Thus, if whatever component we provide does not fit the user's use case, they can go and replace it with something else.

For the benefit of these use cases and also to reduce code duplication within our own protocols, I see the following stack of components:

- Use asynchronous-codec to go from streams to messages: A ConnectionHandler presents us with a stream that implements AsyncRead and AsyncWrite. For most protocols, we will want to decode these into messages. We already have the asynchronous-codec crate which helps us with this. It is however purely declarative. In other words, using the crate doesn't actually give you messages, it gives you a Stream and a Sink that will yield messages when polled.
- Use message patterns to turn a "framed" stream into a protocol: A message pattern is a specific handshake, like sending a message and waiting for a response or sending a single message and closing the stream afterwards. A message pattern takes the form of a state machine that needs to be "driven". I have a proposal for extending asynchronous-codec with message patterns in "Prototype message patterns: RecvSend, SendRecv and Send" (mxinden/asynchronous-codec#5).
- Use TimedStreams to bound IO: Operations need to have timeouts, otherwise bad things will happen, like hostile peers deliberately hogging our resources. I envision TimedStreams to be a wrapper around SelectAll that applies a timeout to each item added to the list. In case the stream (which will be a "message pattern" in the majority of cases) does not finish within the specified timeout, it will automatically be aborted (by dropping it).
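
As a rough illustration of the first two layers, the sketch below hand-rolls a tiny codec and a "send, then receive" pattern as a state machine that does nothing until it is driven. It is synchronous and std-only so that it stays self-contained. The `Codec` trait is only loosely modelled on the encoder/decoder idea, and the name `SendRecv` is borrowed from the proposal title; none of this is the actual asynchronous-codec API.

```rust
/// Hypothetical codec trait: turns items into bytes and back.
trait Codec {
    type Item;
    fn encode(&self, item: &Self::Item, dst: &mut Vec<u8>);
    /// Returns `None` until `src` holds a complete frame, then consumes that frame.
    fn decode(&self, src: &mut Vec<u8>) -> Option<Self::Item>;
}

/// Toy codec: one length byte followed by a UTF-8 payload (max. 255 bytes).
struct LengthPrefixedString;

impl Codec for LengthPrefixedString {
    type Item = String;

    fn encode(&self, item: &String, dst: &mut Vec<u8>) {
        let bytes = item.as_bytes();
        assert!(bytes.len() <= u8::MAX as usize, "payload too large for toy codec");
        dst.push(bytes.len() as u8);
        dst.extend_from_slice(bytes);
    }

    fn decode(&self, src: &mut Vec<u8>) -> Option<String> {
        let len = *src.first()? as usize;
        if src.len() < 1 + len {
            return None; // frame not complete yet
        }
        let payload: Vec<u8> = src.drain(..1 + len).skip(1).collect();
        // A real codec would report an error here instead of swallowing it.
        String::from_utf8(payload).ok()
    }
}

/// Message pattern: send one request, then wait for one response.
/// Generic over the codec, so the handshake is written only once.
enum SendRecv<C: Codec> {
    Sending { codec: C, request: C::Item },
    Receiving { codec: C },
    Done,
}

impl<C: Codec> SendRecv<C> {
    fn new(codec: C, request: C::Item) -> Self {
        SendRecv::Sending { codec, request }
    }

    /// Drives the state machine by one step. `outbound` and `inbound` stand in
    /// for the write and read halves of a stream.
    fn poll(&mut self, outbound: &mut Vec<u8>, inbound: &mut Vec<u8>) -> Option<C::Item> {
        match std::mem::replace(self, SendRecv::Done) {
            SendRecv::Sending { codec, request } => {
                codec.encode(&request, outbound);
                *self = SendRecv::Receiving { codec };
                None
            }
            SendRecv::Receiving { codec } => match codec.decode(inbound) {
                Some(response) => Some(response), // state stays `Done`
                None => {
                    *self = SendRecv::Receiving { codec };
                    None
                }
            },
            SendRecv::Done => None,
        }
    }
}
```

Swapping `LengthPrefixedString` for another `Codec` implementation changes the serialization without touching the handshake.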

Notice how:

- Message patterns are generic over their codec, meaning users can plug in whatever serialization they want without duplicating the handshake itself
- TimedStreams is an entirely optional composition step and not strictly mandatory
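
The timeout idea can be sketched without any async machinery. Below, `TimedSet` is a made-up, std-only stand-in for the envisioned TimedStreams: each item gets a deadline when it is added, and an item that has not finished by then is removed and dropped. The `Drive` trait and the `Countdown` toy item are assumptions for the example; a real version would presumably wrap `SelectAll` and poll futures or streams instead.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for anything that has to be driven, e.g. a message pattern.
trait Drive {
    type Output;
    /// Returns `Some` once the item has finished.
    fn drive(&mut self) -> Option<Self::Output>;
}

#[derive(Debug, PartialEq)]
enum Outcome<O> {
    Finished(O),
    TimedOut,
}

/// Applies the same timeout to every item added to the set.
struct TimedSet<T> {
    timeout: Duration,
    items: Vec<(Instant, T)>,
}

impl<T: Drive> TimedSet<T> {
    fn new(timeout: Duration) -> Self {
        TimedSet { timeout, items: Vec::new() }
    }

    /// `now` is passed in explicitly to keep the sketch deterministic.
    fn push(&mut self, item: T, now: Instant) {
        self.items.push((now + self.timeout, item));
    }

    /// Drives every item once and reports those that finished or expired.
    fn poll(&mut self, now: Instant) -> Vec<Outcome<T::Output>> {
        let mut outcomes = Vec::new();
        let mut i = 0;
        while i < self.items.len() {
            let (deadline, item) = &mut self.items[i];
            let expired = now >= *deadline;
            match item.drive() {
                Some(output) => {
                    outcomes.push(Outcome::Finished(output));
                    self.items.swap_remove(i);
                }
                None if expired => {
                    outcomes.push(Outcome::TimedOut);
                    // Removing the item drops it, which is what aborts it.
                    self.items.swap_remove(i);
                }
                None => i += 1,
            }
        }
        outcomes
    }
}

/// Toy item: finishes once `remaining` has been counted down to zero.
struct Countdown {
    remaining: u32,
    label: &'static str,
}

impl Drive for Countdown {
    type Output = &'static str;

    fn drive(&mut self) -> Option<&'static str> {
        if self.remaining == 0 {
            Some(self.label)
        } else {
            self.remaining -= 1;
            None
        }
    }
}
```

Because the timeout lives in the container rather than in the item, the same pattern can be used with or without it.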

Now we can go from streams to sending messages back and forth but so far, we haven't dealt with ConnectionHandler and NetworkBehaviour yet. The above components will reduce some boilerplate (no more manual state machines) but to actually have a working protocol, users will still need to implement a NetworkBehaviour and a ConnectionHandler, including the message passing between them, which isn't exactly fun.

Here is where we have several choices. Somehow, we need to give users access to streams and "cut" through the layers provided by NetworkBehaviour and ConnectionHandler. Several proposals have been made in the past:

This is where I am still undecided. Essentially, it boils down to: Do we want "streams" to escape a ConnectionHandler (and thus the task they are running on) or not?

If we are okay with this, then something like the proposed libp2p-bare-stream is the way to go. It gives users access to streams and they can do with them what they want, either compose them with any of the abstractions above or do something entirely different.

However, I think there is also a point to be made about containing networking logic within a ConnectionHandler. In that case, something like a FromFn ConnectionHandler is much more appropriate. It is easy to extend this to a NetworkBehaviour if we want to where users provide the callback directly and don't have to implement any trait themselves.
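
For illustration, a FromFn-style handler could look roughly like the sketch below. Everything in it is hypothetical: `PeerId` and `Stream` are simplified stand-ins, and `FromFnHandler` with its `on_inbound_stream` method is an assumed shape, not an existing rust-libp2p type. The point it tries to show is that the user only supplies a callback, the stream is consumed inside the handler, and only the callback's result leaves it.

```rust
/// Hypothetical stand-ins for the real libp2p types.
type PeerId = u64;

struct Stream {
    protocol: &'static str,
    data: Vec<u8>,
}

/// Handler built from a callback instead of a hand-written trait implementation.
struct FromFnHandler<F> {
    protocol: &'static str,
    on_inbound: F,
}

impl<F, E> FromFnHandler<F>
where
    F: FnMut(PeerId, Stream) -> E,
{
    fn new(protocol: &'static str, on_inbound: F) -> Self {
        FromFnHandler { protocol, on_inbound }
    }

    /// Would be called by the connection machinery for each inbound stream.
    /// The stream is handed to the callback; only the resulting event escapes.
    fn on_inbound_stream(&mut self, peer: PeerId, stream: Stream) -> Option<E> {
        if stream.protocol != self.protocol {
            return None; // not our protocol; the stream is dropped here
        }
        Some((self.on_inbound)(peer, stream))
    }
}
```

A callback that returns the `Stream` itself would turn this into the "bare stream" variant, so the two options differ mainly in what the default nudges users towards.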

Should we do both and let users decide? That does feel a bit lazy too. It seems like we should use our knowledge of the library to provide a "here is a good way of doing things" approach and not give users multiple choices, essentially forcing them again to get into the details to understand the trade-offs.

rkuhn · 2023-03-13T10:07:34Z

rkuhn
Mar 13, 2023

Thanks for the clear presentation! I fully agree with your assumption that the split into long-lived and per-connection pieces is correct.

One way to think about what to do with streams on a connection is that there are two domains: the network is streaming octets (one after the other) while the host is synchronous and structural (we have everything present at the same time, to inspect it at our leisure). With this definition, the network boundary is where the stream is consumed (which may mean turning it into a structure or sending it elsewhere again). Now the basic classification of use-cases obviously permits two options:

- the network boundary is inside the Swarm; in this case I assume our recommendation is to place it in the ConnectionHandler
- the network boundary is outside of the Swarm; in this case we need to ferry buckets of bytes to some other code over an async boundary, which obviously needs to implement back-pressure

In the former case, a received message may be sent to an interested actor inside the libp2p-based application, or it may be handled by the behaviour (either in NetworkBehaviour or ConnectionHandler) to send a response. Triggering messages to be sent is done via behaviour APIs (which may be methods or channels).

In the latter case, the consumer of an incoming stream or the producer of an outgoing stream lives outside the Swarm, so the behaviour will have APIs that expose channels for sending and receiving the buckets of bytes.

So to me it seems that we might want to expose the following APIs for behaviour authors:

1. FromFn connection handler for when the main focus is a per-Swarm behaviour
2. a NetworkBehaviour factory taking a connection handler and offering to the end-user an API for sending commands in via the Swarm and receiving messages from the network (or handler) via a channel (no need to poll these through the NetworkBehaviour)
3. like above but with a channel for sending further inputs into the connection handler to allow for streaming data to a peer

Options 2 and 3 can obviously be combined: people may simply drop the stream sending channel. The experience for end user code would then be: “I want to initiate a session of protocol X with PeerId P and here are the parameters, let me know when it’s done.” More complex cases could easily have multiple steps where intermediate results are provided to the caller who is then expected to send back some decision on how to continue.

The streaming data use-case is also covered by simply making the messages exchanged with code outside the Swarm be buckets of bytes (plus metadata as applicable). This would be a totally awesome API and dev experience, I think.

6 replies

rkuhn Mar 14, 2023

Assume I have a generic request/streaming response connection handler and want to add a protocol where one peer can ask another for updates on their CPU temperature.

// assume (de)serialisable
struct TemperatureRequest { ... }
struct TemperatureUpdate { ... }

let (conn_factory, rx_requests) = FramedStreamHandler::new::<TemperatureRequest, TemperatureUpdate>(
    "/temp/v1", // protocol name
    100, // capacity of the incoming requests channel (will drop stream if congested)
);
// spawn task receiving from rx_requests and adding response channels so that they get updates
let temperatures_behaviour = RequestStreamBehaviour::new(conn_factory);

#[derive(NetworkBehaviour)]
struct NB {
    ping: ping::Behaviour,
    temperatures: RequestStreamBehaviour<TemperatureRequest, TemperatureUpdate>,
    ...
}

// then somewhere else
let req: TemperatureRequest = ...
let mut rx_updates = swarm.behaviour_mut().temperatures.request(peer_id, req);
while let Some(update) = rx_updates.next().await {
    // `update` is a TemperatureUpdate; do something with this temperature update from the peer
}
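
The "capacity, drop if congested" behaviour mentioned in the comment above can be sketched with a bounded channel from the standard library. `RequestGate` and `InboundRequest` are invented names for this example, and a real handler would use an async channel rather than `std::sync::mpsc`; the sketch only shows the policy of refusing work instead of queueing without limit.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Hypothetical decoded request arriving from a peer.
struct InboundRequest {
    peer: u64,
    body: String,
}

/// Handler-side end of a bounded channel towards application code.
struct RequestGate {
    tx: SyncSender<InboundRequest>,
    dropped: usize,
}

impl RequestGate {
    /// `capacity` should be at least 1; with 0 the channel has no buffer at all.
    fn new(capacity: usize) -> (Self, Receiver<InboundRequest>) {
        let (tx, rx) = sync_channel(capacity);
        (RequestGate { tx, dropped: 0 }, rx)
    }

    /// Never blocks: if the consumer is too slow (or gone), the request is dropped.
    /// Returns `true` if the request was queued.
    fn offer(&mut self, request: InboundRequest) -> bool {
        match self.tx.try_send(request) {
            Ok(()) => true,
            Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => {
                self.dropped += 1;
                false
            }
        }
    }
}
```

Dropping on congestion keeps the connection task from stalling behind a slow consumer, at the cost of the peer seeing its stream closed.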

thomaseizinger Mar 14, 2023
Collaborator Author

Okay, if I understand correctly then what you are suggesting is to not use the message passing infrastructure we have but return channels upon construction.

I.e. we don't even use the outevents to pass up the channels.

Is there something this handler "does" for you?

What I'd like to move away from is having multiple, opinionated building blocks that all work a bit differently. Instead, the above two approaches seem to be what it distills down to:

- Hand out just the streams
- Accept a stream handler closure

rkuhn Mar 15, 2023

In this discussion I presume nothing, I start from the requirements and desires of the stakeholders: protocol and swarm authors as well as swarm users. A swarm user may use the swarm for

- inquiring about information held in the swarm (e.g. about peers)
- instructing the swarm to do something

In the second case, the interaction might be synchronous, fire and forget, or it might produce an ACK/NACK later. But the second case also includes the wish of starting a protocol session with a peer, by which I mean a non-trivial high-level application protocol; I consider anything with more messages than request–response as non-trivial. A contemporary example would be dialling a peer that hosts ChatGPT and chatting with it until the session is ended by either party.

Polling the swarm to get all sorts of events apart from the session messages we desire sounds like a really terrible interface for the swarm user. Establishing bidirectional communication with our ongoing session (which may need Swarm knowledge or be located at the ConnectionHandler layer) sounds much more enticing to me.

An alternative approach would be composing the poll loop for the swarm in parallel to composing the NetworkBehaviour, but that feels just silly, since the only reason would be the dogma of funnelling everything through the poll interface. When moving to stateless #[derive(NetworkBehaviour)] a few versions ago I had to create a lot of boilerplate code to do exactly that.

I don’t fully understand your question whether “this handler does something for me”. The behaviour I want to interact with can be arbitrarily complex, both on the Swarm and per-connection levels. There need not be any easily discernible correlation between application requests/responses and network messages. Handing out the raw streams is just a very narrow special case of this, and framing/parsing is a slightly more elaborated one. More complex might be that I interact with an IPLD service at the Swarm level that internally uses DHT/bitswap/gossip/whatever to find the bytes I need as I follow links in my session. And capturing this session has value because it may use caching/prefetching on the particular content I’m interested in right now.

thomaseizinger Mar 15, 2023
Collaborator Author

Meta-note: Thank you for working through this with me, your input is extremely valuable!

> Polling the swarm to get all sorts of events apart from the session messages we desire sounds like a really terrible interface for the swarm user.

I think you are hitting the nail on the head here. There is no inherent reason for why we should treat events about the entire network the same as events about a generic application protocol.

In my view the NetworkBehaviour abstraction works really well if your networking logic can easily be contained in a module. ping and identify are nice examples of this, so is rendezvous.

I think there is value in trying to design protocols such that they are contained. Short-lived streams/network communication is good in my opinion. The network is inherently unreliable, so designing your application state with this in mind makes for a good user experience. For example, I'd say that capturing user input first, saving it locally and transmitting it in the background is better than doing it all in one go and losing the user's input in case the transfer fails. It may be a bit of a stretch but one thing that I am worried about is that by handing out streams, it is quite easy to do the latter because a stream is an object you can just pass around like a value despite it being directly tied to fallible IO.

For me, the above reasoning tips the scales slightly towards offering a FromFn style behaviour. A user can still emit the stream if they desire but there is a slight push into the direction of containing the protocol within the closure, thus encouraging short-lived network communication.

Handing out streams also has a back-pressure problem whereas containing them within the ConnectionHandler allows us to limit how many we want to process. Of course, if the user escapes this by emitting the stream as a value from the closure all is lost but in that case, it would at least be an explicit action and not "our fault" as library authors.
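
As a small sketch of what "limit how many we want to process" could mean inside a handler: the hypothetical `StreamLimiter` below simply counts in-flight streams and refuses new ones above a maximum. The name and methods are made up for this example; how a refused stream is actually reset would be up to the real handler.

```rust
/// Hypothetical bookkeeping inside a handler: at most `max` streams in flight.
struct StreamLimiter {
    max: usize,
    active: usize,
    refused: usize,
}

impl StreamLimiter {
    fn new(max: usize) -> Self {
        StreamLimiter { max, active: 0, refused: 0 }
    }

    /// Called for each new inbound stream. `false` means: drop the stream.
    fn try_accept(&mut self) -> bool {
        if self.active < self.max {
            self.active += 1;
            true
        } else {
            self.refused += 1;
            false
        }
    }

    /// Called when the work for one accepted stream has completed.
    fn finish(&mut self) {
        self.active = self.active.saturating_sub(1);
    }
}
```

Once a stream has been handed out of the handler, the handler can no longer tell when it is finished, so a counter like this stops being meaningful.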

> An alternative approach would be composing the poll loop for the swarm in parallel to composing the NetworkBehaviour, but that feels just silly, since the only reason would be the dogma of funnelling everything through the poll interface. When moving to stateless #[derive(NetworkBehaviour)] a few versions ago I had to create a lot of boilerplate code to do exactly that.

I think there is value in consistency but it is a means to an end. Your proposal above has ergonomic appeal but I am a bit worried that

a) it is a completely different style again
b) it is opinionated and might again be sub-optimal for usecases we haven't discovered yet

I think we are in agreement that interacting with a Swarm should not be necessary for every step of a protocol interaction. If we were to move forward with a FromFn style behaviour, building what you outlined above wouldn't be too difficult. I imagine the FromFn behaviour to be initialized with two closures (one for inbound, one for outbound streams), so you could basically build the following:

let (tx_inbound_streams, rx_inbound_streams) = channel();

let behaviour = FromFn::new(
    "/temp/v1",
    |peer, stream| async move { tx_inbound_streams.send((peer, stream)).await }, // exact syntax might differ depending on how we design the closure
    |peer, stream, oneshot| async move { oneshot.send(stream) }
);

// ...

#[derive(NetworkBehaviour)]
struct NB {
    ping: ping::Behaviour,
    temperatures: FromFn,
    ...
}

// ...

let (tx, rx) = oneshot();
swarm.behaviour_mut().temperatures.open_outbound(peer, tx);

let stream = rx.await;

// send messages on stream

I am not opposed to including something like this in our workspace but I'd like to see it in action first somewhere.

rkuhn Mar 15, 2023

Maybe I’m missing something: why does it matter where the stream is used? What kind of back-pressure is affected?

Your FromFn conflates the swarm and connection layers; would it not be more modular, reusable, and intuitive to keep them separate (as in my proposal)?

“it is opinionated and might again be sub-optimal for usecases we haven't discovered yet” — I think this objection is invalid, it doesn’t make sense to not provide an API because we don’t yet know all the things users may want to do at some later time. In my experience, an API must be extremely opinionated in order to be properly understood, and then we’ll see whether programmers use it with success or fail. And then we may add another API to provide a solution for those failures.

lu-zero · 2023-04-07T16:49:50Z

lu-zero
Apr 7, 2023

If you try to port the chat example to use stream::Merge instead of select!, you quickly see the Stream for Swarm isn't exactly great since pretty much everything else is sync and you do not have a straightforward way to concurrently send a message while receiving events.

1 reply

thomaseizinger Jun 27, 2023
Collaborator Author

That is because a Swarm is not just a Stream. A Stream you typically only read from but you'd also "write" to a Swarm.
