-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
protocols/stream: Add a streaming-response protocol to rust-libp2p #2657
Comments
Thanks for kicking this off @b5. I'm also willing to help with planning, and potentially contribute some code and testing. |
We (Actyx) have used our own implementation based on OneShot requests in our control plane (e.g. the admin accessing the node for settings and debugging). This turned out to be a performance bottleneck for high rate data transfers (e.g. dump/restore of a node’s whole data), which led me to try my hand at a protocol based on persistent substreams. Results for our use-case are very promising, but it is not yet battle-hardened (other features of higher priority intervened). You can find the current code in this gist. The API is as simple as I could make it:
What do you think? |
That sounds reasonable to me. Something consider: The API on |
The shape of this mechanism will probably depend on the underlying implementation: currently, there is only a fire-and-forget So the question is: where and when can the client side determine that backpressure needs to be exerted for a given request? As a caller, I’d expect some queueing between myself and the NIC, possibly with synchronous error while buffers are full — like for a bounded queue. Implementing this based on the current libp2p interfaces is extremely difficult, though. One common theme is that messages are shovelled from one queue into another in many places without a notion of finite capacity. Should this error be asynchronous, then? This would make it harder to use the API correctly, and it would still allow an unbounded number of requests to be dumped into the behaviour with the expectation of getting some kind of reaction later. Which would point towards there being synchronous as well as asynchronous error cases — not nice. With such a scheme the caller will also need a means to register for a notification once capacity becomes available. I agree that it is desirable to solve these issues, but I also have some experience with how difficult it is to do so, cf. Reactive Streams. Such a change is pervasive in a programming language ecosystem since it needs to be supported by several libraries in order to become useful. It also requires code to be written differently, so that e.g. requests can be pulled out of a source at an adequate rate — this doesn’t mesh well with imperative procedural style of the kind “lemme find X in the DHT, then publish messages A, B, C and then send request R”. Perhaps I’m overthinking this and the answer is “just return |
I had an idea that could perhaps solve this issue. I am calling it Here is the tl;dr:
impl BareStreamBehaviour {
pub fn new_outbound_stream(&mut self) -> BareStreamId;
}
enum BareStreamEvent {
NewOutboundStream {
stream: NegotiatedSubstream,
id: BareStreamId
},
NewInboundStream {
stream: NegotiatedSubstream
}
} This would replace Expanding on this: The current abstractions within
Implementing a request-response protocol is trivial once you have something that implements My proposal would be to introduce the In summary: For usecases where the protocol at hand can be implemented in a contained way, the recommendation to people would be to implement Thoughts? |
Agreed with this mindset. I am in favor of offering "the ultimate escape hatch".
I don't think this is sophisticated enough.
Something along the lines of:
Where |
Yes, this simplification is much better, great idea @thomaseizinger! @mxinden Returning The following would be nice because it pushes the state storage onto the caller, but unfortunately it violates the aliasing rules: pub async fn new_outbound_stream(&mut self, remote_peer: PeerId) -> BareStreamId (The issue is that the no other method can be called on the behaviour while awaiting the result.) So perhaps the best we can do is to keep the knowledge of queue sizes available in the behaviour so that we can return an immediate result: pub fn new_outbound_stream(&mut self, remote_peer: PeerId) -> Result<BareStreamId, Backpressure> Then the caller will need to wait for a |
Oh, one more thing: it would be really great if the Not this sounds like a really cool improvement over TCP sockets to me! |
I've had a similar thought yes but left it out from the initial proposal so we can focus on the core idea :) Ideally, we make these protocols type-safe as well (an enum or something) instead of strings. |
I am tempted to say no. The method could fail immediately if there is no active connection to the peer for example. In my experience, the policy on which peers a node should be connected to is extremely application-specific. We are talking about usecases here which don't fit into the typical p2p setting (hence the bare-stream escape hatch), hence the chances of an atypical connection-policy is high. I'd put out a recommendation to users to implement their own NetworkBehaviour which encapsulates their connection policy, for example, always maintaining a connection to certain nodes. |
I am not sure I follow. If
Agreed. I don't think an
Note that I am proposing to enforce backpressure on negotiating (to be established) substreams. Once negotiated the life-cycle management of a substream is up to the user. I think the user should learn when to poll again through the standard Also in regards to the return type, is this not the same to
The more I think about it, the more I like the framing. TCP with multiplexing, security, hole punching, discovery, ...
Fine not including this at all, or at least not in an initial design. |
How would this go together in practise? If I see a function that returns If I use |
This violates the |
We could implement the Behaviour to simply wait internally for a "slot" to establish a new substream. i.e. return an ID immediately but the event with the substream would only be emitted with respect to backpressure limits. We would also need a timeout then I think! |
The |
I wrote up my thoughts on a peer-to-peer internet, which would be my motivation for wanting the currently discussed feature — and I obviously wouldn’t consider it a “ultimate escape hatch” but instead the true purpose of libp2p :-) But let’s take one step at a time. Considering unreliable datagrams: I’d still want the PeerId/ProtocolName addressing and negotiation, but instead of AsyncRead/Write the result would need to support sending and receiving datagrams. The latter should not be swarm events (since live video streaming is one of the intended use-cases), so a NegotiatedDatagramChannel might be needed. This is very much future work since such a thing only becomes useful once we have a datagram transport we can put this onto, which poses further questions regarding sharing a cryptographic context with some (signaling) streams. Regarding the waker: this tool feels more and more like a really blunt instrument to me, since wake-ups cannot be traced back to their cause for efficient processing. Future/Poll/Waker is a nice package for a task that makes linear progress, but not a nice tool for running a hierarchical NetworkBehaviour. Hence I would prefer an emitted event for notifying the calling code of the end of a congestion condition. |
Sorry for OT but this pun ("framing") is too good to be left unaddressed 😁 |
to throw a use case out there, I wanted to implement a simple p2p proxy, but couldn't find an obvious way to do it using libp2p. Note that this goes beyond "single request+streaming response" and instead needs a bidirectional stream interface. |
I think the idea from above would solve this? |
Any progress on this? I have to download video files from the closest nodes and streams are much needed. |
You already have streams as of today, you just need to implement your own I've somewhat changed my mind on this idea since I wrote it. The problem is that once we had out bare streams from I think in the long-run, we will be better off if we make it easier for users to implement
My suggestion would be: Bite the bullet and work through the Feel free to open a new issues about specific problems that you have whilst doing so. There is a fair bit of documentation already but I also think that is can always be improved :) Footnotes
|
@thomaseizinger thanks for the reply. I don't know In one form or another, streams should IMO be implemented like general HTTP2 or QUIC streams where major "concerns" are resolved by flow-control. A stream would probably be a transparent "channel" on top of a lower-level protocol. Since QUIC is coming to Rust I guess relatively soon #2883 (comment), #2801 (comment) I wonder, how will Finally, writting your own Anyhow, I guess I'll have to dig deeper into it and see what's possible. |
Thank you for your thoughts! A transport like QUIC is indeed somewhat similar to what We do several things on top of it though:
The split into
From an application PoV, nothing will change and all
We are working on making it easier :) Note that, the individual streams that are given to a
I totally understand that these abstractions can feel overwhelming. I've been there myself and was utterly confused initially :) |
I've been building something in #2852. There might still be a few rough edges and the documentation needs more work but I'd welcome feedback on the abstraction introduced there. It introduces an abstraction for declaring I already implemented |
While exploring some ideas and getting to know libp2p internals better, I ended up implementing a |
Thank you for building this! I would like to note that whilst such a design looks appealing (after all, I was an advocate myself), it also comes with downsides. Most importantly, any protocol built on top of it will likely not benefit from any work we are doing in I see this as a workaround for us not yet providing a convenient enough interface to implement a If I can ask, what do the protocols look like that you are building @Frando? |
One thing I'm looking into at the moment is a simple tar stream between two peers. Usecase is account transfer for Delta Chat - there's a proof of concept using a full Iroh node plus bitswap, but this is quite heavy for just sending a single (streaming) tar file between two peers, thus I was looking into how a very simple "just a tar stream" solution could look like. For this, I looked into how to access the muxer streams directly, and in the process wrote the PoC crate linked above.
Interesting. I was under the impression that backpressure would be handled on the |
Right. You could use
It is a bit more subtle than that (see #3078 for details). For example, in the future we don't just want backpressure on a stream level but also on the number of streams that a remote can open. If you are just handing every stream "out" of the behaviour, you'll have to rebuild whatever mechanism we come up with :) That is just one example. Don't get me wrong, I totally understand why the escape-hatch is needed currently. We are just not working on it because I think there is more value in making our abstractions better rather than not using them! |
I think looking at it holistically, we have the following choices when it comes to making people's life easier for writing their own protocols:
The introduction of The work around decoupling the inbound/outbound upgrade falls into (2). Things like (2) and (3) are useful but are not going to cut it long-term. Having to implement your own On the other hand, What if instead of an entire
Implementing a @mxinden A request-response handler could also be a good way of creating a convention among protocols how they should interact. |
I also had similar requirements to build a port forwarding application over libp2p rust. Build a proxy server that tunnels/relay all internet traffic from one peer to another one. Able to tweak the Gossipsub topics to multiplex multiple socket connection data over a topic & demultiplex it at other side. Here is the link to working code: |
This is likely a bad idea because you are layering flow-control algorithms on top of each other. In regards to this particular feature, there have been many proposals and none of them quite made it into master simply because there are so many different needs to consider. @mxinden and I discussed this recently and we agreed on a compromise: Have an example that showcases how users can build a simple "escape-hatch" This would fill two needs:
Contributions welcome! |
I am going to close this in favor of #4457 now. Thank you all for the discussion and input! |
Description
Add support for streaming multiple responses, instead of a 1-1 request-response model in rust-libp2p.
Motivation
Requirements
Open questions
Would this constitute a change to the libp2p spec? Interop story across implementations is an open question. Our use case only requires rust support this feature.
Are you planning to do it yourself in a pull request?
Maybe. We've taken a stab at it, and may be able to help in a joint effort starting sometime after July 20th. I'd be happy to help with planning & coordination in the meantime.
The text was updated successfully, but these errors were encountered: