EPIC: support multi-dimensional testing #53

laurentsenta · 2022-10-10T17:05:41Z

As a libp2p maintainer, I want the ability to define test suites that combine many implementations, many muxers, many transports, etc. Defining and running these test suites should be simple, and the outcome should be clear. It should be easy to trigger these test suites before a release. It should be easy to display these results in a readable form on a website.

eta: 2022Q4

Tasks

Create a "simple" interop dashboard #55
- this will add the tooling used to describe, build, and run multiple combinations of tests in Testground
Implement new parameters (muxers, transports, etc)
- we need to define what the matrix looks like
- we need to update the ping test to support these parameters
  - ⚠️ we need to make sure we test the connection from A to B AND from B to A (we don't at the moment)
add the "expected RTT" test
- This requires the ability to "generate" the expected outcomes for every cell in this N-D matrix,
[tracking issue] Interop test-plans for all existing/developing libp2p transports #61
Generate a "nice" dashboard Canonical interop tests & dashboard #62
- @BigLep shared some notes in Create a "simple" interop dashboard #55 (comment)
- use case 1: As a maintainer, I want to check the status of our interop testing before making a release.
- use case 2: As a user, I want to know more about libp2p interoperability capabilities.

Follow-up tasks

"optimize" by skipping library versions that are no longer used (see notes below),
"optimize" using artifact caching.

Description

A high-level approach:

first, we use the versions resource file to generate "complex" test matrices,
then we use another resource (data or code) to produce the expected RTT matrix,
then we generate the relevant composition file (as shown below), with the expected RTT as a parameter,
then we iterate through these test cases and call testground run.

Configurations

Ideally, the libp2p team provides a resource file that contains the versions and their features:

[[groups]]
# go v0.42
GoVersion = '1.18'
Modfile = "go.v0.22.mod"
Selector = 'v0.42'
Implementation = 'go'
SupportedTransports = ["tcp", "quic", "webrtc"]
Muxer = ["yamux"]

[[groups]]
# go v0.22
GoVersion = '1.18'
Modfile = "go.v0.22.mod"
Selector = 'v0.22'
Implementation = 'go'
SupportedTransports = ["tcp", "quic"]
Muxer = ["mplex"]


[[groups]]
# rust v0.51
Libp2pVersion = 'v0.51.0'
Implementation = 'rust'
SupportedTransports = ["tcp", "webrtc"]

[[groups]]
# rust v0.47.0
Libp2pVersion = 'v0.47.0'
Implementation = 'rust'
SupportedTransports = ["tcp"]
Muxer = ["yamux", "mplex"]

And we'll create some way to have "meta-compositions" that can describe multiple tests and run many pairs together,

something like
(pseudocode)

{ for every group }
    { if !group.SupportedTransports contains ENV.TESTED_TRANSPORT }
        { continue }
    { end }

    { if !group.Muxer contains ENV.TESTED_MUXER }
        { continue }
    { end }

    [testground_instance]
    {}
{ endfor }

Called with

ENV.TESTED_TRANSPORT = "tcp"
ENV.TESTED_MUXER = "yamux"
testground run composition file
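The template loop above can be sketched in Python. This is a minimal sketch, not Testground's template syntax: `GROUPS` is a hand-copied, trimmed version of the resource file, and `select_groups` and `label` are hypothetical helpers. It assumes the second check is meant to look at each group's `Muxer` list, and that a group without a `Muxer` key supports no muxer.

```python
# Trimmed, hand-copied version of the resource file above (assumption:
# only the fields needed for filtering are kept).
GROUPS = [
    {"Implementation": "go", "Selector": "v0.42",
     "SupportedTransports": ["tcp", "quic", "webrtc"], "Muxer": ["yamux"]},
    {"Implementation": "go", "Selector": "v0.22",
     "SupportedTransports": ["tcp", "quic"], "Muxer": ["mplex"]},
    {"Implementation": "rust", "Libp2pVersion": "v0.51.0",
     "SupportedTransports": ["tcp", "webrtc"]},
    {"Implementation": "rust", "Libp2pVersion": "v0.47.0",
     "SupportedTransports": ["tcp"], "Muxer": ["yamux", "mplex"]},
]


def select_groups(groups, transport, muxer):
    """Keep the groups that support both the tested transport and muxer."""
    selected = []
    for group in groups:
        if transport not in group.get("SupportedTransports", []):
            continue
        # Assumption: a missing Muxer key means "no muxer supported".
        if muxer not in group.get("Muxer", []):
            continue
        selected.append(group)
    return selected


def label(group):
    """Short human-readable name, e.g. 'go v0.42' (hypothetical helper)."""
    version = group.get("Selector") or group.get("Libp2pVersion")
    return f'{group["Implementation"]} {version}'


if __name__ == "__main__":
    # Mirrors ENV.TESTED_TRANSPORT = "tcp" and ENV.TESTED_MUXER = "yamux".
    for group in select_groups(GROUPS, "tcp", "yamux"):
        print(label(group))
```

Each selected group would then become one `[testground_instance]` entry in the generated composition.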

Related discussions and issues


laurentsenta · 2022-10-10T17:05:52Z

cc @julian88110 @tinydb @mxinden @galargh

laurentsenta · 2022-10-11T09:27:05Z

note from sync w/ @marten-seemann (he was away for the chat yesterday and had related requests).

@mxinden @marten-seemann I have no strong feelings about this, but we might lose information between impromptu meetings. Do you feel the need to schedule some sort of "interop squad sync" where everyone joins at the same time?

Some more good ideas were brought up. A few notes:

Generating Matrix Parameters

The point of this epic is to let the libp2p team define a multi-dimensional test matrix,
and make sure it's maintainable. This is covered by the configuration above.

Using this configuration file, we can generate multiple tests, for example:

every pair that can communicate over quic.
every group of versions that can use TCP + yamux.

Implicitly this configuration represents compatible pairs. This means we have a matrix of implementation * versions * muxer * transports and each cell in this matrix is YES (interoperable) or NO (not interoperable).
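To make the YES/NO cells concrete, here is a rough sketch that derives them from per-version capabilities. The `VERSIONS` table is a trimmed, hypothetical copy of the configuration, and `interoperable` and `build_matrix` are illustrative names. It follows the flat configuration, so it treats the muxer as a dimension of every transport, which is a simplification.

```python
from itertools import combinations

# Hypothetical, trimmed capability table derived from the configuration.
VERSIONS = {
    "go-v0.42": {"transports": {"tcp", "quic", "webrtc"}, "muxers": {"yamux"}},
    "go-v0.22": {"transports": {"tcp", "quic"}, "muxers": {"mplex"}},
    "rust-v0.47.0": {"transports": {"tcp"}, "muxers": {"yamux", "mplex"}},
}


def interoperable(a, b, transport, muxer):
    """A cell is YES only if both sides support the transport and the muxer."""
    return all(
        transport in v["transports"] and muxer in v["muxers"] for v in (a, b)
    )


def build_matrix(versions, transports, muxers):
    """Return {(version_a, version_b, transport, muxer): bool} for every pair."""
    cells = {}
    for (name_a, a), (name_b, b) in combinations(versions.items(), 2):
        for transport in transports:
            for muxer in muxers:
                key = (name_a, name_b, transport, muxer)
                cells[key] = interoperable(a, b, transport, muxer)
    return cells


if __name__ == "__main__":
    matrix = build_matrix(VERSIONS, ["tcp", "quic"], ["yamux", "mplex"])
    for key in sorted(k for k, ok in matrix.items() if ok):
        print("YES", key)
```

Only the YES cells need a Testground run; the NO cells document known gaps.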

An extension to this matrix is the RTT use case: @marten-seemann needs to add "expected RTTs" for each of these pairs. Each cell in the matrix will contain the number of RTTs ~~an expected RTT in milliseconds~~.

A high-level approach:

first, we use the versions resources file (above) to generate the test matrix,
then we use another resource (data or code) to produce the expected RTT matrix,
then we generate the relevant composition file (as shown above), with the expected RTT as a parameter,
then we iterate through these test cases and call testground run.

I plan to iterate through solutions:

first, implement an interop suite that runs in CI, tests pairs of instances, and generates "some" dashboards.
second, extend this feature to support the full matrix of muxer * transports * etc.
third, extend this feature to support the RTT matrix.

Unknown: this is focused on generating test pairs, how to express "I want to create a group with every webrtc-compatible implementation"? Do we need it?

Supported versions

The size of the test matrix will explode quickly. This should not be a problem for creating the configuration files, but it will be a problem for the time needed to build & run the test suites.

I believe EKS will solve some of this problem (with caching out of the box and having enough resources to enable parallelization), but we'll have to pick and choose which versions and combinations to test eventually.

One option is to test the most used version, see the diagram:

http://162.55.187.75:3000/d/CSQsORs7k/nebula?orgId=1&viewPanel=11
We have ~10 versions that cover ~60% of the network + "others"

We can use these metrics to enable/disable versions, probably something like:

X latest versions,
X most used versions

This solution sounds reasonable enough to keep this problem for later.
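One way the "X latest + X most used" rule could look, as a sketch only. The usage shares below are made-up placeholders, not numbers from the dashboard, and `select_versions` is a hypothetical helper that assumes versions are written as `vMAJOR.MINOR`.

```python
def version_key(version):
    """Turn 'v0.42' into (0, 42) so versions sort numerically."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def select_versions(usage_share, latest=2, most_used=2):
    """Union of the `latest` newest and the `most_used` most deployed versions."""
    newest = sorted(usage_share, key=version_key, reverse=True)[:latest]
    popular = sorted(usage_share, key=usage_share.get, reverse=True)[:most_used]
    return sorted(set(newest) | set(popular), key=version_key, reverse=True)


if __name__ == "__main__":
    # Placeholder shares of the network per version (hypothetical values).
    usage = {"v0.42": 0.05, "v0.41": 0.10, "v0.30": 0.25, "v0.22": 0.20, "v0.20": 0.02}
    print(select_versions(usage, latest=2, most_used=2))
```

A version that is both recent and widely deployed is only counted once, so the result can be shorter than `latest + most_used`.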

Config Structure

keep in mind: muxer + security applies only to TCP, so it might be helpful to use a nested structure instead of:

conf:
  SupportedTransports = ["tcp", "quic", "webrtc"]
  Muxer = ["yamux"]

Something like this might represent our matrix better, and it might be easier to expand.

conf:
   transports:
       TCP:
          muxer: yamux
       quic: true

We don't have to decide on this now (data is easy to transform), I recommend we stick to the flat structure for now.
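As an illustration of how cheap that transformation is, a small sketch that converts the flat shape into the nested one. `to_nested` is a hypothetical helper, and it hard-codes the assumption stated above that the muxer only applies to TCP.

```python
def to_nested(flat):
    """Convert a flat group entry into the nested per-transport structure."""
    nested = {"transports": {}}
    for transport in flat["SupportedTransports"]:
        if transport == "tcp":
            # Assumption from the thread: muxers are only relevant for TCP.
            nested["transports"][transport] = {"muxer": list(flat.get("Muxer", []))}
        else:
            nested["transports"][transport] = True
    return nested


if __name__ == "__main__":
    flat = {"SupportedTransports": ["tcp", "quic", "webrtc"], "Muxer": ["yamux"]}
    print(to_nested(flat))
```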

marten-seemann · 2022-10-11T09:37:37Z

An extension to this matrix is the RTT use case: @marten-seemann needs to add "expected RTT" for each of these pairs. Each cell in the matrix will contain an expected RTT in milliseconds.

Minor correction: It’s expected handshake duration, and it’s a dimensionless number: the number of RTTs. Calculating the actual duration will be done by the test plan, based on that number and the RTT that’s used for the run.
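In other words, the matrix stores a count and the test plan multiplies it by the configured RTT. A minimal sketch of that arithmetic follows; the function names and the `slack_rtts` tolerance are assumptions made for illustration, not part of any existing test plan.

```python
def expected_handshake_ms(rtt_count, rtt_ms):
    """Expected handshake duration: number of round trips times the run's RTT."""
    return rtt_count * rtt_ms


def within_expectation(measured_ms, rtt_count, rtt_ms, slack_rtts=0.5):
    """Accept a measurement up to `slack_rtts` extra round trips (assumed tolerance)."""
    return measured_ms <= (rtt_count + slack_rtts) * rtt_ms


if __name__ == "__main__":
    # Example: a cell that expects 3 round trips, run with a 100 ms RTT.
    print(expected_handshake_ms(3, 100))
    print(within_expectation(320, 3, 100))
```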

This is tracked on the [libp2p test-plans](libp2p/test-plans#44) see also libp2p/test-plans#53.

mxinden · 2022-10-12T19:58:50Z

keep in mind: muxer + security applies only to TCP, so it might be helpful to use a nested structure instead of:

We will face the same problem on the dimension of multiplexer-negotiation (via multistream-select or via security protocol) as this is only relevant for TCP.

I think expressing hierarchies within dimensions (e.g. tcp/noise/mplex, tcp/noise/yamux) is a valid option.
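A quick sketch of what such hierarchical dimension values could look like when enumerated. `enumerate_stacks` is a hypothetical helper, and it follows the thread's simplification that only TCP needs a security protocol and a muxer.

```python
from itertools import product


def enumerate_stacks(transports, security_protocols, muxers):
    """List dimension values such as 'tcp/noise/mplex' or plain 'quic'."""
    stacks = []
    for transport in transports:
        if transport == "tcp":
            # TCP needs a security protocol and a muxer on top.
            for security, muxer in product(security_protocols, muxers):
                stacks.append(f"{transport}/{security}/{muxer}")
        else:
            # Assumption: other transports bring their own security and muxing.
            stacks.append(transport)
    return stacks


if __name__ == "__main__":
    for stack in enumerate_stacks(["tcp", "quic"], ["noise", "tls"], ["mplex", "yamux"]):
        print(stack)
```

A further level (for example the muxer-negotiation mechanism) could be added as another path segment under `tcp` in the same way.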

mxinden · 2022-10-12T20:00:31Z

@mxinden @marten-seemann I have no strong feelings about this, but we might lose information between impromptu meetings. Do you feel the need to schedule some sort of "interop squad sync" where everyone joins at the same time?

I would suggest we continue doing these ad hoc. That said, this is not a strong opinion.

mxinden · 2022-10-12T20:02:29Z

Unknown: this is focused on generating test pairs, how to express "I want to create a group with every webrtc-compatible implementation"? Do we need it?

While long-term we will need this, I do think we should focus on point-to-point testing for now, i.e. two nodes (potentially 3 including a relay node for WebRTC browser-to-browser) instead of a group of nodes.

julian88110 · 2022-10-13T22:57:00Z

I have created a Google Doc to capture our test case requirements and filled it in with the info I know. Please take a look and, if you can, fill in the parts regarding JS and Rust in particular. We can migrate this doc to GitHub once it is in good shape. https://docs.google.com/document/d/1-akPPFW7kko9SkpedxXOV2foWJhnliELGd1WKn-0RFw/edit#

John-LittleBearLabs · 2022-10-14T14:55:33Z

I plan to iterate through solutions:

Does this mean that you're working on this, @laurentsenta ?

then we iterate through these test cases and call testground run.

... & ...

first, implement an interop suite that runs in CI and test pairs of instances, and generate "some" dashboards.

I had thought the matrix would be a single composition with some elaborate template structure. But it sounds like you're thinking of writing a higher-level script that orchestrates calls into testground?

If so, could/should this sort of matrix instead be a new feature added to testground itself, for reuse?

laurentsenta · 2022-10-17T08:07:28Z

@John-LittleBearLabs thanks for raising these questions,

Does this mean that you're working on this, @laurentsenta ?

I am working on the first step here: #55 which should add support for matrixes. I moved the list of tasks to the top of the issue description. @julian88110 is also working on the test matrix definition.

I had thought the matrix would be a single composition with some elaborate template structure. But it sounds like you're thinking of writing a higher-level script that orchestrates calls into testground?

You need both: with the composition, we can describe "many" test plans using templates and env parameters.
But we also need a way to run many compositions and gather their outcomes.

If so, could/should this sort of matrix instead be a new feature added to testground itself for reuse?

You're correct, we can and we should.
I shared a solution in #55 (comment).

Ideally, we validate this solution with the team and iterate on it for a while, then we can split:

testground support in EPIC: Implement multiple runs per compositions testground/testground#1493
interop usage in this ticket.

The hard problem is making interop matrix maintainable, we can implement "anything" in testground as long as we don't leak interop-related matters into Testground APIs.

laurentsenta · 2022-10-17T12:37:19Z

(updated the task description with a clearer definition and added a few steps)

julian88110 · 2022-10-19T21:29:15Z

Testground multi-dimensional test matrix

Tests are to be composed from the information extracted out of the resource files.

An example resource file entry may look like this:

[[groups]]
# go v0.42
GoVersion = '1.18'
Modfile = "go.v0.22.mod"
Selector = 'v0.42'
Implementation = 'go'
SupportedTransports = ["tcp", "quic", "webrtc"]
SupportedSecurityProtos = ["tls", "noise"]
SupportedMuxers = ["yamux", "mplex"]

A test peer/host is customized by the following parameters:

testHost = Host(implementation, version, transport, securityProto, supportedMuxers)

A test case is composed by two or more test hosts:

testInstance = TestInstance(testHost1, testHost2, ...)
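The two constructors above could map onto plain data classes, sketched here under assumptions: the field names mirror the `Host(...)` parameters, while `flipped` is a hypothetical helper added to cover the A-to-B and B-to-A requirement from the task list.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Host:
    """One test peer, customized by the parameters listed above."""
    implementation: str
    version: str
    transport: str
    security_proto: str
    supported_muxers: Tuple[str, ...]


@dataclass(frozen=True)
class TestInstance:
    """A test case made of two or more hosts; the first one dials."""
    hosts: Tuple[Host, ...]

    def __post_init__(self):
        if len(self.hosts) < 2:
            raise ValueError("a test instance needs at least two hosts")

    def flipped(self):
        """Same hosts in reverse order, so the former listener becomes the dialer."""
        return TestInstance(tuple(reversed(self.hosts)))


if __name__ == "__main__":
    muxers = ("/yamux/1.0.0", "/mplex/6.7.0")
    go_host = Host("go", "cur", "tcp", "noise", muxers)
    rust_host = Host("rust", "cur", "tcp", "noise", muxers)
    case = TestInstance((go_host, rust_host))
    print([h.implementation for h in case.flipped().hosts])
```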

Go transport list:

  Go-libp2p-transports = ["TCP", "QUIC", "Webtransport", "Websocket"]

Rust transport list:

  Rust-libp2p-transports = ["TCP", "WebRTC"]

JS transport list:

  JS-libp2p-transports = ["ToDo"]

Test Matrix

Test matrix for libp2p multi dimensional tests. (Test cases should also be run with source/destination flipped)

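| Test case | Src Imp | Src Ver | Src Trans | Src Sec | Src Muxs | Run Test | Dst Imp | Dst Ver | Dst Trans | Dst Sec | Dst Muxs | Expected Muxer | Expected RTT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | go | cur | tcp | tls | ML1 | X | go | 1 | tcp | tls | ML1 | M1 | rtt-1 |
| 2 | | | | | | | go | cur | tcp | tls | ML2 | M2 | rtt-1 |
| 3 | | | | | | | go | cur-1 | tcp | tls | ML1 | M1 | rtt |
| 4 | | | | | | | go | cur-2 | tcp | tls | ML1 | M1 | rtt |
| 5 | | | | | | | go | cur-3 | tcp | tls | ML1 | M1 | rtt |
| 6 | go | cur | tcp | noise | ML1 | X | go | cur | tcp | noise | ML1 | M1 | rtt-1 |
| 7 | | | | | | | go | cur | tcp | noise | ML-2 | M1 | rtt-1 |
| 8 | | | | | | | go | cur-1 | tcp | noise | ML1 | M1 | rtt |
| 9 | | | | | | | go | cur-2 | tcp | noise | ML1 | M1 | rtt |
| 10 | | | | | | | go | cur-3 | tcp | noise | ML1 | M1 | rtt |
| 11 | go | cur | tcp | noise | ML1 | X | rust | cur | tcp | noise | ML1 | M1 | rtt |
| 12 | | | | | | | rust | cur-1 | tcp | noise | ML1 | M1 | rtt |
| 13 | | | | | | | rust | cur-2 | tcp | noise | ML1 | M1 | rtt |
| 14 | | | | | | | rust | cur-3 | tcp | noise | ML1 | M1 | rtt |
| 15 | go | cur | tcp | tls | ML1 | X | JS | cur | tcp | noise | ML1 | M1 | rtt |
| 16 | | | | | | | JS | cur | tcp | noise | ML-2 | M1 | rtt |
| 17 | | | | | | | JS | cur-1 | tcp | noise | ML1 | M1 | rtt |
| 18 | | | | | | | JS | cur-2 | tcp | noise | ML1 | M1 | rtt |
| 19 | | | | | | | JS | cur-3 | tcp | noise | ML1 | M1 | rtt |
| | go | cur | QUIC | - | - | X | go | cur | QUIC | - | - | - | |
| | | | | | | | go | cur-1 | QUIC | - | - | - | |
| | | | | | | | go | cur-2 | QUIC | - | - | - | |
| | | | | | | | go | cur-3 | QUIC | - | - | - | |
| | go | cur | WebTransport | - | - | X | go | cur | WT | - | - | - | |
| | | | | | | | go | cur-1 | WT | - | - | - | |
| | | | | | | | go | cur-2 | WT | - | - | - | |
| | | | | | | | go | cur-3 | WT | - | - | - | |
| | go | cur | WS | - | - | X | go | cur | WS | - | - | - | |
| | | | | | | | go | cur-1 | WS | - | - | - | |
| | | | | | | | go | cur-2 | WS | - | - | - | |
| | | | | | | | go | cur-3 | WS | - | - | - | |
| | rust | cur | TCP | noise | - | X | JS | cur | TCP | noise | - | | |
| | | | | | | | JS | | | | | | |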

ML1 = ["/yamux/1.0.0", "/mplex/6.7.0"]
ML2 = ["/mplex/6.7.0", "/yamux/1.0.0"]
ML3 = ["/mplex/6.7.0"]
M1 = "/yamux/1.0.0"
M2 = "/mplex/6.7.0"

mxinden · 2022-11-04T11:37:39Z

Looking for an owner

While the IPDX team (i.e. @laurentsenta) works on the necessary support in testground/testground (see testground/testground#1493), I think there is value in us (the libp2p team) starting work on our part, namely the generation of composition files based on a go.toml and rust.toml. See above and the proof of concept in #55 (comment).

Any volunteers? @jxs or @julian88110 would either of you like to and have the capacity to own this?

John-LittleBearLabs · 2022-11-04T13:34:26Z

(libp2p team) to start working on our part, namely the generation of composition files based on a go.toml and rust.toml.

Whoever starts working on the official/permanent version, let me know, so I can base my somewhat hacky work of fitting my WebRTC test into this scheme on your work rather than on the PoC.

mxinden · 2022-11-07T14:38:30Z

Any volunteers? @jxs or @julian88110 would either of you like to and have the capacity to own this?

Discussed out of band. @jxs will own this.

MarcoPolo · 2022-11-07T16:25:43Z

@mxinden / @jxs can you expand a bit? Is @jxs going to own the whole project or just the rust specific bits? I'm happy to be the DRI for the whole effort (full libp2p interop testing.)

julian88110 · 2022-11-07T17:05:51Z

Thanks @mxinden @jxs and @MarcoPolo! I will have some bandwidth to help once the integration test and related code refactoring effort is done. We can discuss the details.

mxinden · 2022-11-07T17:49:32Z

I see the following work streams here:

(Enumeration is for reference purpose, not to signal ordering.)

1. Support the runs feature in composition.yml files. Owned by @laurentsenta. See tracking issue EPIC: Implement multiple runs per compositions testground/testground#1493.
2. Generating composition.yml files based on rust.toml, go.toml and in the future js.toml files. Now owned by @jxs. See draft in Create a "simple" interop dashboard #55 (comment) written by @laurentsenta.
3. Updating the corresponding ping/xxx implementations to support the various dimensions. Ideally just forwarding command line flags.
- go-libp2p (no owner yet)
  - WebTransport
  - QUIC
  - WebRTC
    - Related: adding WebRTC support to ping/go by @John-LittleBearLabs in Add WebRTC transport interop ping test #67
  - Muxer negotiation in security protocol. (I would suggest this be owned by @julian88110.)
- rust-libp2p (no owner yet)
  - QUIC
  - WebRTC
- js-libp2p
  - Blocked on @GlenDC's work on browser support for testground EPIC: implement Javascript & Browser support in Testground testground/testground#1386
  - Writing the actual ping/js with TCP on NodeJS
  - WebTransport
  - WebRTC
- nim-libp2p
  - See start here Nim ping test #70
4. Updating the various CI configurations in libp2p/test-plans and libp2p/{go,rust,js,nim}-libp2p.
5. Implementing a visualization of the test matrix, partially tracked in Canonical interop tests & dashboard #62

Is @jxs going to own the whole project or just the rust specific bits?

The plan was for @jxs to own (2) for both rust and go, potentially js in the future.

I'm happy to be the DRI for the whole effort (full libp2p interop testing.)

Thus far I was the DRI. That said, I am happy to hand that over to you @MarcoPolo. Let me know. We should probably do a hand-over in some fashion.

BigLep · 2022-11-07T18:11:57Z

Thanks @mxinden. This is helpful.

I want to minimize the number of people involved, and I also know @mxinden is juggling a lot, so I am supportive if we're handing ownership over to others.

A few thoughts:

This endeavor needs technical ownership (make sure we're building the right thing in the right way) and project management ownership (communication, tracking, coordinating).
By default, given it is spanning all the libp2p implementations, I would expect @p-shahi to own the project management side at the minimum, but it's also ok if we're being intentional to have someone else take it.
This initiative is at the top of our roadmap, so I want to make sure it doesn't slip between the cracks. These kinds of projects tend to take longer than expected, which is why I want us to attack it actively rather than be passive. Otherwise I worry this will drag on for months. We need to make sure someone is responsible for a clear checklist of the task definitions, task owners, and task dependencies. The list above from Max is a good start. I think it makes sense to move the relevant portions to the issue description.

p-shahi · 2022-11-07T19:20:37Z

Generating composition.yml files based on rust.toml, go.toml and in the future js.toml files. Now owned by @jxs. See draft in Create a "simple" interop dashboard #55 (comment) written by @laurentsenta.

Updating the corresponding ping/xxx implementations to support the various dimensions. Ideally just forwarding command line flags.

Updating the various CI configurations in libp2p/test-plans and libp2p/{go,rust,js,nim}-libp2p.

My preference is to make #61 the tracking issue for these efforts and tidy this Epic a bit. We can create child issues and assign to each team/DRI where necessary.

MarcoPolo · 2022-11-08T22:21:44Z

@mxinden sounds good. Let's do a handoff at some point this week or next as your schedule allows. I think having @jxs handle point 2 is good. I can put myself as the fallback for things without owners and delegate appropriately.

GlenDC · 2022-11-13T13:29:37Z

FYI, the upcoming week I'm planning to start getting the js-libp2p ping test to work in this repo, on a new branch.
I'll link it once I have something that I can share.

It will be based on the "finished" work done in open PR: testground/testground#1502

p-shahi · 2022-11-22T05:18:54Z

Thoughts on including plaintext in addition to TLS and Noise? I think libp2p/js-libp2p#1110 provides motivation to include it, at least as a "nice to have"

mxinden · 2022-11-23T14:48:44Z

Including plaintext works for me. Just need to make sure we don't advertise it as a to-be-used-in-production protocol.

p-shahi · 2023-02-07T02:13:25Z

closing in favor of #61

galargh added this to InterPlanetary Developer Experience Oct 11, 2022

galargh moved this to 🤔 Triage in InterPlanetary Developer Experience Oct 11, 2022

laurentsenta mentioned this issue Oct 11, 2022

EPIC: Implement multiple runs per compositions testground/testground#1493

Open

29 tasks

mxinden mentioned this issue Oct 11, 2022

Adding a webrtc test #49

Closed

mxinden added a commit to mxinden/rust-libp2p that referenced this issue Oct 11, 2022

ROADMAP: Remove testground entry

538b5b4

This is tracked on the [libp2p test-plans](libp2p/test-plans#44) see also libp2p/test-plans#53.

This was referenced Oct 13, 2022

Create a "simple" interop dashboard #55

Closed

Organization: add PROCESS.md and ROADMAP.md #44

Merged

mxinden mentioned this issue Oct 17, 2022

feat: Add WebRTC transport libp2p/rust-libp2p#2622

Merged

4 tasks

p-shahi mentioned this issue Oct 20, 2022

[tracking issue] Interop test-plans for all existing/developing libp2p transports #61

Open

28 tasks

GlenDC mentioned this issue Nov 12, 2022

EPIC: implement Javascript & Browser support in Testground testground/testground#1386

Closed

12 tasks

GlenDC mentioned this issue Nov 13, 2022

Draft: add ping testplan for JS (browser) #74

Closed

7 tasks

p-shahi pinned this issue Nov 18, 2022

p-shahi mentioned this issue Nov 28, 2022

test-plans 2022 Q4/2023 Q1 Roadmap #58

Open

p-shahi added the starmaps label Nov 28, 2022

tinytb added the status/in-progress In progress label Nov 28, 2022

p-shahi mentioned this issue Jan 22, 2023

docs(roadmap): Mark WebRTC browser-to-server as done libp2p/rust-libp2p#3357

Merged

p-shahi closed this as completed Feb 7, 2023

github-project-automation bot moved this from 🤔 Triage to 🥳 Done in InterPlanetary Developer Experience Feb 7, 2023

p-shahi unpinned this issue Feb 7, 2023
