feat(perf): add (provision, build, run) tooling #163

Merged
merged 102 commits into from
May 26, 2023

Conversation

mxinden
Member

@mxinden mxinden commented Apr 4, 2023

Continued in #184.

@mxinden mxinden requested review from MarcoPolo and removed request for MarcoPolo April 4, 2023 14:24
@mxinden
Member Author

mxinden commented Apr 6, 2023

Test run between server in us-west-1 and client in us-east-1:

Note that this is still using t2.micro machines, in other words these numbers are not representative.

➜  perf git:(perf-terraform) ✗ ssh ec2-user@$(terraform output -raw client_public_ip) sudo docker run --tty --rm --entrypoint perf-client mxinden/libp2p-perf --server-address /ip4/$(terraform output -raw server_public_ip)/tcp/4001

[2023-04-06T18:34:17Z INFO  perf_client] Start benchmark: round-trip-time latency
[2023-04-06T18:34:48Z INFO  perf_client] Finished: 56 pings in 30.3030s
[2023-04-06T18:34:48Z INFO  perf_client] - 0.5399 s median
[2023-04-06T18:34:48Z INFO  perf_client] - 0.5499 s 95th percentile
    
[2023-04-06T18:34:48Z INFO  perf_client] Start benchmark: single connection single channel throughput
[2023-04-06T18:35:15Z INFO  perf_client] Finished: sent 10.00 MiB in 13.56 s and received 10.00 MiB in 13.66 s
[2023-04-06T18:35:15Z INFO  perf_client] - 5.90 MiBit/s up
[2023-04-06T18:35:15Z INFO  perf_client] - 5.86 MiBit/s down
    
[2023-04-06T18:35:15Z INFO  perf_client] Start benchmark: single connection parallel requests per second
[2023-04-06T18:35:16Z INFO  perf_client] Finished: sent 1000 1 bytes requests with 1 bytes response each within 0.40 s
[2023-04-06T18:35:16Z INFO  perf_client] - 2493.58 req/s
    
[2023-04-06T18:35:16Z INFO  perf_client] Start benchmark: sequential connections with single request per second
[2023-04-06T18:35:47Z INFO  perf_client] Finished: established 39 connections with one 1 bytes request and one 1 bytes response within 30.81 s
[2023-04-06T18:35:47Z INFO  perf_client] - 0.2488 s 95th percentile connection establishment
[2023-04-06T18:35:47Z INFO  perf_client] - 0.8114 s 95th percentile connection establishment + one request

{"benchmarks":[{"comparisons":[],"name":"Single Connection throughput – Upload","results":[{"implementation":"rust-libp2p","result":773013.3502842287,"transportStack":"TODO","version":"TODO"}],"unit":"bits/s"},{"comparisons":[],"name":"Single Connection 1 byte round trip latency 95th percentile","results":[{"implementation":"rust-libp2p","result":0.811350517,"transportStack":"TODO","version":"TODO"}],"unit":"s"}]}

@mxinden
Member Author

mxinden commented Apr 8, 2023

Recent commits add a runner implementation, running a Rust Perf docker image on the AWS client once with TCP, once with QUIC:

➜  runner git:(perf-terraform) npm run start -- --client-public-ip $(terraform output -raw -state ../terraform/terraform.tfstate client_public_ip) --server-public-ip $(terraform output -raw -state ../terraform/terraform.tfstate server_public_ip)

> [email protected] start
> node ./dist/index.js --client-public-ip xxx --server-public-ip xxx

[2023-04-08T18:42:55Z INFO  perf_client] Start benchmark: custom
[2023-04-08T18:43:02Z INFO  perf_client] Finished: Established 10 connections uploading 1 and download 1 bytes each
[2023-04-08T18:43:07Z INFO  perf_client] Start benchmark: custom
[2023-04-08T18:43:09Z INFO  perf_client] Finished: Established 10 connections uploading 1 and download 1 bytes each
{
  "benchmarks": [
    {
      "name": "Single Connection 1 byte round trip latency",
      "unit": "s",
      "results": [
        {
          "result": [
            0.541189669,
            0.543760197,
            0.554063168,
            0.544573805,
            0.535035136,
            0.538494264,
            0.539836617,
            0.536488084,
            0.538661487,
            0.542860167
          ],
          "implementation": "",
          "version": "rust-master",
          "transportStack": "tcp"
        },
        {
          "result": [
            0.122755473,
            0.120762102,
            0.122967167,
            0.120803698,
            0.124939367,
            0.123426268,
            0.121066374,
            0.120640572,
            0.121048847,
            0.123058823
          ],
          "implementation": "",
          "version": "rust-master",
          "transportStack": "quic-v1"
        }
      ],
      "comparisons": []
    }
  ]
}

@mxinden mxinden changed the title feat: add basic provision script for client and server perf feat: add provision script and runner Apr 8, 2023
@mxinden mxinden changed the title feat: add provision script and runner feat(perf): add provision script and runner Apr 8, 2023
@mxinden
Member Author

mxinden commented Apr 11, 2023

Status Update:

  • Terraform provision script
    • is feature complete
    • needs to move away from t2.micro to larger machine types, ideally ones with a dedicated network card.
  • Runner
    • Has base functionality
    • Requires binaries to support the following flags:
        --server-address <SERVER_ADDRESS>
        --upload-bytes <UPLOAD_BYTES>
        --download-bytes <DOWNLOAD_BYTES>
        --n-times <N_TIMES>
    • Does not yet support the --parallel-connections flag. I suggest we deprioritize this in favor of first adding the other languages.
    • Needs minor touches to support JS and Go
    • Runs download, upload and connection establishment latency benchmarks
    • Prints the agreed schema, with one minor caveat: it includes all data points instead of percentiles
  • Binaries

@MarcoPolo
Contributor

Very cool!

@MarcoPolo
Contributor

Where is the rust implementation?

@mxinden
Member Author

mxinden commented Apr 11, 2023

Where is the rust implementation?

Rust implementation is here:

libp2p/rust-libp2p#3646

@MarcoPolo
Contributor

Pushed a Go implementation, could you try it out?

I'm curious if we'll see this message: "2023/04/12 04:27:38 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/quic-go/quic-go/wiki/UDP-Receive-Buffer-Size for details." That would be a problem.

- Makes emitted values self-describing.
- Ensures accounting of connection establishment.
- Allows differentiation in subsequent data analysis.
@BigLep
Contributor

BigLep commented May 19, 2023

@mxinden:

Move to TLS only (for now). Easier to compare with HTTPs and no additional security protocol negotiation.

Agreed we should have something very comparable to HTTPs, but can we also add other security protocol negotiation so we can see how much that impacts performance?

Use TLS only instead of both TLS and Noise. Removes the additional
multistream-select security protocol negotiation. Thus makes it easier to
compare with TCP+TLS+HTTP/2
Makes the JSON easier to read.
Member Author

@mxinden mxinden left a comment

@MarcoPolo @marten-seemann as discussed before I added patches to the go-libp2p perf implementation to not wait for the libp2p identify exchange when (1) establishing the connection and (2) opening the perf stream.

There is likely a more go-libp2p idiomatic way of doing this. Feedback welcome.

Comment on lines +85 to +93
start := time.Now()
h.Peerstore().AddAddrs(serverInfo.ID, serverInfo.Addrs, peerstore.TempAddrTTL)
// Use h.Network().DialPeer() instead of h.Connect to skip waiting for
// identify protocol to finish.
_, err = h.Network().DialPeer(context.Background(), serverInfo.ID)
if err != nil {
	panic(err)
}
connectionEstablished := time.Since(start)
Member Author

Skipping the identify wait in Connect.

Member Author

@marten-seemann @MarcoPolo @sukunrt any idea why the connection establishment with the code above takes 3 network round-trips? I would expect it to take 2 RTT.

My assumption:

  • 1 RTT for TCP.
  • I only enable TLS, thus no security protocol negotiation, more specifically optimistic 0RTT multistream-select.
  • 1 RTT for TLS. (Assuming TLS 1.3 is the default.)
  • Using default muxers, which should be only Yamux, thus no muxer negotiation, more specifically optimistic 0RTT multistream-select.
  • Not waiting for the additional identify exchange.

Member

@sukunrt sukunrt May 26, 2023

go-libp2p will spend 1RTT negotiating the security protocol.
go-libp2p doesn't do 0RTT select for the security protocol. It does 0RTT select for other protocols when it knows the peer supports the given protocol; otherwise it still does the 1RTT select.

Member Author

go-libp2p will spend 1RTT negotiating the security protocol.

Given that the client only supports one security protocol, i.e. TLS, shouldn't it be using the lazy mechanism, in other words optimistically use TLS @sukunrt?

https://github.com/multiformats/go-multistream/blob/master/lazyClient.go

Member

I think we can change this. But currently this is not what's happening.
https://github.com/libp2p/go-libp2p/blob/master/p2p/net/upgrader/upgrader.go#L319-L330
In security negotiation we don't use lazyClient.
It uses this function which does wait for the reply. https://github.com/multiformats/go-multistream/blob/master/client.go#L117

Member Author

Thank you @sukunrt. I forgot that go-libp2p supports the multistream-select simultaneous-open extension and thus can not do an optimistic multistream-select negotiation when negotiating a single protocol (here TLS) only.

The extension is not needed in libp2p's hole punching as roles (i.e. dialer and listener) are derived from the roles on the relayed connection, see DCUtR protocol section. As far as I can tell the simultaneous-open extension is only needed for the case where two nodes randomly connect via TCP at the same time.

In my eyes the probability for two nodes to randomly, i.e. not coordinated via DCUtR, connect via TCP at the same time is low and only leads to a connection failure. Thus we don't support it in rust-libp2p.

@sukunrt @MarcoPolo @marten-seemann what do you think of removing the simultaneous-open extension in go-libp2p and thus enabling optimistic security protocol negotiation when negotiating a single protocol only? This would allow go-libp2p to compete with vanilla TCP+TLS+HTTP when it comes to connection establishment latency. In other words, both go-libp2p with TLS+Yamux and vanilla TCP+TLS+HTTP would take 3 RTT (2 RTT connection establishment + 1 RTT request/response).

Contributor

I don’t think we should make code changes just to look better in benchmarks. In practice, most configurations (IPFS and Lotus) use TLS and Noise.

@vyzo had numbers on simultaneous opens happening in the wild. Would you mind sharing them here?

Contributor

FWIW, in nim-libp2p, we don't bind when we are public, which avoids uncoordinated simultaneous opens in the wild

We don't have enough data yet to see if it causes issues, but so far the only scenario where it doesn't work is if someone made a manual port mapping to a different port than the internal one, and didn't specify it to libp2p. In this case, we can't discover our public port, and we think we are private (since we don't bind)

Member Author

Thank you for the input everyone.

I documented the above as well as past discussions in libp2p/go-libp2p#2330. Let's continue the discussion over there in case there is interest.

Comment on lines +99 to +110
// Use ps.Host.Network().NewStream() instead of ps.Host.NewStream() to
// skip waiting for identify protocol to finish.
s, err := ps.Host.Network().NewStream(network.WithNoDial(ctx, "already dialed"), p)
if err != nil {
	return 0, 0, err
}
s.SetProtocol(ID)
lzcon := msmux.NewMSSelect(s, ID)
s = &streamWrapper{
	Stream: s,
	rw:     lzcon,
}
Member Author

Skipping the identify wait in NewStream.

@mxinden
Member Author

mxinden commented May 24, 2023

The above brings us one step closer to reaching https latency performance. The vanilla Go https implementation needs 3 RTT; rust-libp2p and, with the above, now also go-libp2p need 4 RTT over TCP.

See https://observablehq.com/@mxinden-workspace/libp2p-perf#cell-621.

@mxinden
Member Author

mxinden commented May 25, 2023

@mxinden:

Move to TLS only (for now). Easier to compare with HTTPs and no additional security protocol negotiation.

Agreed we should have something very comparable to HTTPs, but can we also add other security protocol negotiation so we can see how much that impacts performance?

We can. I suggest doing so in future iterations. Added to the list of potential next steps in the pull request description.

@mxinden mxinden changed the title feat(perf): add performance benchmarking feat(perf): add tooling (provision, build, run) May 25, 2023
@mxinden mxinden changed the title feat(perf): add tooling (provision, build, run) feat(perf): add (provision, build, run) tooling May 25, 2023
@mxinden mxinden changed the base branch from master to perf May 26, 2023 02:00
@mxinden mxinden marked this pull request as ready for review May 26, 2023 02:00
@mxinden mxinden merged commit e77a6d5 into libp2p:perf May 26, 2023
@mxinden
Member Author

mxinden commented May 26, 2023

I changed the base of the pull request to the perf branch on the upstream libp2p/test-plans repository. This will ease collaboration with @galargh on mxinden#2.

Let's continue on #184.
