Refactor Connection to a synchronous state machine #142

Merged: 41 commits into libp2p:master on Nov 25, 2022

Conversation

@thomaseizinger (Contributor) commented Oct 8, 2022

This PR refactors Connection to a synchronous state machine.

It retains the Control API but layers it completely on top of the existing Connection. This allows us to do this refactoring without touching any of the tests. In a future step, we can port all the tests to the new poll-based API and potentially remove the Control API.
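To make "synchronous state machine" concrete, here is a minimal sketch of the pattern. All names are illustrative only, not the exact API introduced by this PR:

```rust
use std::task::{Context, Poll};

struct Stream; // stand-in for a yamux stream

/// Illustrative only: the caller drives the connection by polling it;
/// the connection does not spawn any background tasks of its own.
struct Connection<T> {
    socket: T,
    // ... open streams, pending frames, etc.
}

impl<T> Connection<T> {
    /// Make progress on the connection; yields the next inbound stream.
    fn poll_next_inbound(&mut self, _cx: &mut Context<'_>) -> Poll<Option<Stream>> {
        // 1. write/flush pending frames to `socket`
        // 2. read new frames from `socket` and dispatch them to their streams
        // 3. surface newly opened inbound streams to the caller
        Poll::Pending // actual logic elided
    }
}
```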

All commits:

  • Are properly formatted
  • Have no clippy warnings
  • Pass the test suite

It should be possible to review this PR patch-by-patch. You may, however, find it easier to first look at the end result by checking out the code and navigating around to see how things work now.

It would be great if we could merge #145 first. That would allow us to share a few bits between the two test suites.

@thomaseizinger changed the title from "Refactor towards poll-style code" to "Refactor Connection to a synchronous state machine" on Oct 12, 2022
@mxinden (Member) commented Oct 17, 2022

I am sorry for the delay here. I won't get to reviewing this until after libp2p day.

  • Remove Control API?

In my eyes, that would be a great simplification.

@thomaseizinger (Contributor, Author)

> I am sorry for the delay here. I won't get to reviewing this until after libp2p day.

No worries.

> • Remove Control API?
>
> In my eyes, that would be a great simplification.

Cool!

I am thinking of making a separate PR that adds more tests so I can refactor with a bit more confidence.

Once that is merged, we could perhaps also remove the Control API in this PR? It would make some of the internals vastly simpler. Plus, anyone can build a Control-style API on top of one with multiple poll functions at any point (see the sketch below).
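To illustrate that last point, a rough sketch of how a Control-style API can be layered on top of poll functions: one task owns and drives the connection, and a cloneable handle sends it commands over a channel. All names here are hypothetical, not this crate's API:

```rust
use futures::channel::{mpsc, oneshot};

struct Stream; // stand-in for a yamux stream

enum Command {
    OpenStream(oneshot::Sender<Stream>),
}

/// Cloneable async/await front-end. A separate task owns the connection,
/// drives its poll functions and answers these commands.
#[derive(Clone)]
struct Control {
    commands: mpsc::Sender<Command>,
}

impl Control {
    async fn open_stream(&mut self) -> Result<Stream, oneshot::Canceled> {
        let (reply, response) = oneshot::channel();
        let _ = self.commands.try_send(Command::OpenStream(reply));
        response.await
    }
}
```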

@thomaseizinger (Contributor, Author)

On top of being (hopefully) easier to understand, the new implementation proves to be a lot more performant too!

For slow networks there is no change, but as we get to higher-speed networks, we are seeing improvements of up to 38%!

[Benchmark chart: old vs. new implementation across network speeds]

@thomaseizinger (Contributor, Author)

I've rebased on top of master to allow for an easier patch-by-patch review!

@thomaseizinger (Contributor, Author)

Let me know if you disagree with the workspace structure.

@mxinden (Member) left a comment

Direction looks good to me. Couple of comments, nothing big.

In my eyes, this pull request combines many unrelated changes, so ideally it would be split into several pull requests. That said, I think it is fine to proceed here, especially as this repository is not very active and conflicts are therefore unlikely.

Comment on lines +423 to +426
```rust
// Flush the socket. Progress is not required here, so both
// outcomes are deliberately ignored.
match self.socket.poll_flush_unpin(cx)? {
    Poll::Ready(()) => {}
    Poll::Pending => {}
}
```
@mxinden (Member):

I am wondering whether we should flush this early in the poll method, or whether it shouldn't rather be one of the last actions. Rationale being that frequent flushing hurts performance, especially in cases where one could batch up more data instead.

Just a thought. Needs more thought and potentially data to back it up.
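To illustrate the concern, a hypothetical poll loop (not this crate's actual code) that queues as many frames as possible first and flushes only once at the end:

```rust
use std::collections::VecDeque;
use std::task::{Context, Poll};

use futures::{Sink, SinkExt};

struct Frame; // stand-in for a yamux frame

struct Io<S> {
    socket: S,
    pending_frames: VecDeque<Frame>,
}

impl<S: Sink<Frame, Error = std::io::Error> + Unpin> Io<S> {
    fn poll(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), std::io::Error>> {
        // Queue as many pending frames as the socket will accept ...
        while let Some(frame) = self.pending_frames.pop_front() {
            if self.socket.poll_ready_unpin(cx)?.is_pending() {
                self.pending_frames.push_front(frame);
                break;
            }
            self.socket.start_send_unpin(frame)?;
        }
        // ... and flush once at the end, so they go out as a single batch.
        let _ = self.socket.poll_flush_unpin(cx)?;
        Poll::Pending // remainder of the poll loop elided
    }
}
```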

@thomaseizinger (Contributor, Author):

I couldn't find any consistent performance improvement in my benchmarks when moving this block of code up or down.

However, this got me thinking: Why do we communicate via channels between the connection and the stream for writing, but use shared buffers when reading? We could just as easily have a buffer of frames in Shared and wake the Connection whenever we write any frames to that. This would allow us to drain the buffers of all streams in one go, without having to receive individual frames over a channel.
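Roughly, with hypothetical types (not code from this PR), the idea would look like this:

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::task::Waker;

struct Frame; // stand-in for an outbound yamux frame

#[derive(Default)]
struct Shared {
    pending_frames: VecDeque<Frame>,
    connection_waker: Option<Waker>,
}

/// Called by a stream: buffer the frame and wake the connection,
/// which can then drain the buffers of all streams in one go.
fn stream_write(shared: &Arc<Mutex<Shared>>, frame: Frame) {
    let mut shared = shared.lock().unwrap();
    shared.pending_frames.push_back(frame);
    if let Some(waker) = shared.connection_waker.take() {
        waker.wake();
    }
}
```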

@mxinden (Member):

> I couldn't find any consistent performance improvement in my benchmarks when moving this block of code up or down.

Thanks for testing. Let's keep it as is.

> However, this got me thinking: Why do we communicate via channels between the connection and the stream for writing, but use shared buffers when reading? We could just as easily have a buffer of frames in Shared and wake the Connection whenever we write any frames to that. This would allow us to drain the buffers of all streams in one go, without having to receive individual frames over a channel.

I am undecided whether the connection should communicate with the stream via a channel or via a plain Mutex and Waker. Whatever change we want to make, I think it should not happen within this pull request.

Comment on lines 13 to 21
```rust
/// A [`Future`] that gracefully closes the yamux connection.
#[must_use]
pub struct Closing<T> {
    state: State,                                     // current step of the close sequence
    control_receiver: mpsc::Receiver<ControlCommand>, // control requests still to be drained
    stream_receiver: mpsc::Receiver<StreamCommand>,   // stream commands still to be drained
    pending_frames: VecDeque<Frame<()>>,              // frames not yet written to the socket
    socket: Fuse<frame::Io<T>>,                       // the underlying connection I/O
}
```
@mxinden (Member):

Question, not a suggestion: why deliberately implement this as a state machine instead of with procedural async/await?

@thomaseizinger (Contributor, Author):

So that it can be named without boxing it up.

@mxinden (Member):

What would be the drawback of boxing it?

@thomaseizinger (Contributor, Author):

> What would be the drawback of boxing it?

  • Performance
  • We have to decide whether we add the Send bound. In the current design, we get to delete the YamuxLocal stuff in rust-libp2p because the Send bound is inferred.

I don't feel strongly about either, but it felt like a nice improvement as I went along. Once we get "impl Trait in type aliases" in Rust, at least the boxing would go away.
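For context, a sketch of the trade-off with illustrative types only: a hand-rolled state machine has a nameable type that callers can embed in their own types without allocation, and its `Send`-ness is inferred from `T`, whereas the anonymous future of an `async fn` must be boxed to be stored, forcing an up-front decision on the `Send` bound:

```rust
use std::future::Future;
use std::pin::Pin;

/// Hand-rolled future: nameable, embeddable, `Send` if and only if `T` is.
struct Closing<T> {
    socket: T,
    // ... close-sequence state
}

/// A caller can store it directly; no allocation, no `dyn`.
struct Shutdown<T> {
    closing: Closing<T>,
}

/// Boxed alternative: also storable, but must commit to `Send` (or not) here.
type BoxedClosing = Pin<Box<dyn Future<Output = ()> + Send>>;
```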

@mxinden (Member):

> What would be the drawback of boxing it?
>
> • Performance

Is there any benchmark proving this? Is performance relevant when closing a connection?

> We have to decide whether we add the Send bound. In the current design, we get to delete the YamuxLocal stuff in rust-libp2p because the Send bound is inferred.

The infectious-send-when-boxing problem is reason enough to not box in my eyes 👍

@thomaseizinger (Contributor, Author)

> In my eyes, this pull request combines many unrelated changes, so ideally it would be split into several pull requests. That said, I think it is fine to proceed here, especially as this repository is not very active and conflicts are therefore unlikely.

I agree that the size is not ideal. However, I found it quite difficult to split this refactoring into pieces that could be merged independently without leaving master in a weird state from a design perspective.

The best I could do was to make small commits, but I don't think any particular subset of those is worth merging independently :)

@mxinden (Member) left a comment

> It retains the Control API but layers it completely on top of the existing Connection. This allows us to do this refactoring without touching any of the tests. In a future step, we can port all the tests to the new poll-based API and potentially remove the Control API.

I suggest we deprecate or remove the Control API in a follow-up pull request. What do you think @thomaseizinger?


This is a large change that could result in subtle behavior changes breaking upper layers. What is the best strategy for testing this patch in the wild? I suggest we ask community members to run it in their production environments. Should we cut an alpha release for it, or rather have them test based on a commit hash?

@thomaseizinger (Contributor, Author)

> It retains the Control API but layers it completely on top of the existing Connection. This allows us to do this refactoring without touching any of the tests. In a future step, we can port all the tests to the new poll-based API and potentially remove the Control API.
>
> I suggest we deprecate or remove the Control API in a follow-up pull request. What do you think @thomaseizinger?

Yes, the tests need refactoring before we can remove it.

> This is a large change that could result in subtle behavior changes breaking upper layers. What is the best strategy for testing this patch in the wild? I suggest we ask community members to run it in their production environments. Should we cut an alpha release for it, or rather have them test based on a commit hash?

I'd suggest:

  1. Merge this
  2. Release yamux
  3. Update libp2p-yamux on a branch based on the 0.49 release
  4. Cut an alpha for libp2p-yamux
  5. Have people drop that alpha into their dependency tree. As long as they are on 0.49, that should make things compatible, and there are no behaviour changes other than this patch.

@thomaseizinger (Contributor, Author)

I've bumped the version and changelog. I intend to merge this in the coming days.

@mxinden (Member) commented Nov 24, 2022

> I'd suggest:
>
> 1. Merge this
> 2. Release `yamux`
> 3. Update `libp2p-yamux` on a branch based on the 0.49 release
> 4. Cut an alpha for `libp2p-yamux`
> 5. Have people drop that alpha into their dependency tree. As long as they are on 0.49, that should make things compatible, and there are no behaviour changes other than this patch.

Sounds good to me.

> I've bumped the version and changelog. I intend to merge this in the coming days.

👍 I can cut a release right after.

@thomaseizinger thomaseizinger merged commit e59d8a5 into libp2p:master Nov 25, 2022