Async backing integration & load test suite #706

Closed · 3 tasks done
sandreim opened this issue Feb 24, 2023 · 14 comments · Fixed by paritytech/polkadot#6314
Labels: T10-tests This PR/Issue is related to tests.

sandreim commented Feb 24, 2023

The following need to be covered with Zombienet CI tests:

  • Network protocol upgrade deploying both master and async branch (compatibility)
  • Runtime upgrade while running both master and async backing branch nodes
  • Async backing test with a mix of collators collating via async backing and sync backing.

More advanced load testing should happen on Versi. So far we have only tested with empty parachain blocks. We should explore options to have the parachains fully fill their blocks with transactions, or use the undying test collator to burn CPU and create large PoVs.

The load tests need to produce a grid of performance numbers with async backing enabled. The following metrics need to be covered:

  • parachain block times
  • finality lag
  • TPS

The input parameters for this test will be:

  • number of parachains
  • number of paravalidators
  • PoV size (transactions)

Additionally, the configuration needs to match Kusama as closely as possible for the following parameters:

  • relay_vrf_modulo_samples
  • n_delay_tranches
  • max_validators_per_core
  • needed_approvals

We need to build some automation if possible so that we can run these tests for different combinations of input parameters.
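
As a purely illustrative sketch (not an existing tool; all names and values below are hypothetical), the automation could enumerate the test matrix from the input parameters above roughly like this:

// Hypothetical sketch of enumerating the load-test matrix; parameter values
// are placeholders, not recommendations.
struct LoadTestCase {
    n_parachains: u32,
    n_paravalidators: u32,
    pov_size_kib: u32,
}

fn test_matrix() -> Vec<LoadTestCase> {
    let mut cases = Vec::new();
    for &n_parachains in &[10, 20, 40] {
        for &n_paravalidators in &[100, 200, 300] {
            for &pov_size_kib in &[128, 1024, 5120] {
                cases.push(LoadTestCase { n_parachains, n_paravalidators, pov_size_kib });
            }
        }
    }
    cases
}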

rphmeier (Contributor) commented:

Since asynchronous backing is not done yet, we should view this issue in two parts:

  1. generalized load-testing suite with parameters described above
  2. an extension of this load-testing suite for asynchronous backing once ready

bredamatt linked a pull request on Feb 27, 2023 that will close this issue
rphmeier changed the title from "Async backing test suite" to "Async backing integration & load test suite" on Feb 28, 2023
rphmeier (Contributor) commented:

FYI @bredamatt, paritytech/polkadot#6791 would be a good base to test against until it is merged into the feature branch.

bredamatt (Contributor) commented:

@rphmeier @sandreim the Zombienet tests have now been added in paritytech/polkadot#6314.

bredamatt commented Mar 23, 2023

The first release of touchstone (https://github.com/paritytech/touchstone) allows for runtime configuration of the input parameters:

  • relay_vrf_modulo_samples
  • n_delay_tranches
  • max_validators_per_core
  • needed_approvals

Just waiting for paritytech/testnet-manager#52 to be finished so that touchstone can be used for automation on Versi. At the moment it supports other Rococo-based networks, however.

bredamatt commented Mar 29, 2023

@PierreBesson if we want to support automation in Versi with respect to:

  • number of parachains
  • number of paravalidators

It would be nice if we could scale these numbers through the testnet-manager. However, I understand this may lead to a lack of control over the number of deployments in the Kubernetes cluster, since it would no longer be GitOps based, and hence create a lot of overhead in managing the cluster. How about using Kubernetes operators alongside testnet-manager, such as https://github.com/swisscom-blockchain/polkadot-k8s-operator for the validators? We could potentially also write a new, simpler Rust-based operator for load-testing specific parachains, such as glutton-parachain or undying-collator. This would avoid the extra overhead and allow for further automation, such as automatic, incremental increases in load during a load test.

Edit:
Deployments through testnet-manager would break GitOps-based deployments in Versi, so that idea is not going to be supported. Focus has shifted towards Helm-chart-based deployments accordingly, alongside Zombienet-based deployments.

bredamatt commented Mar 31, 2023

This is an edited comment.

A desire was expressed to deploy undying-collator in Versi while waiting for the Glutton parachain runtime to be ready.

Undying didn't work as expected in Versi, whereas Glutton appears to be working as of recently.

This means we should eventually deploy the sTPS binaries jointly alongside Glutton parachains to simulate the effect of various network loads on the TPS measurements.

bredamatt commented May 2, 2023

After discovering that sTPS is available, the focus over the last couple of weeks has shifted towards using sTPS for the TPS measurements. The repository can be found here: https://github.com/paritytech/polkadot-stps.

Work is currently being done to support:

  • running sTPS against parachains with zombienet
  • deploying sTPS in Versi, using the preferred deployment approach there (helm charts)

The branch is here: https://github.com/paritytech/polkadot-stps/tree/bredamatt/add-helm-charts

The most significant hurdle in making sTPS work for our load-testing purposes is correctly preparing the accounts in these environments (Versi vs Zombienet). This is because Versi and Zombienet handle chain specifications and sudo keys differently, which implies that account preparation differs per environment, forcing sTPS to be generic in this sense.

The end-goal is to use sTPS to measure the following:

  • sTPS for substrate solochain with BABE/GRANDPA for different validator set sizes in a geographically distributed network
  • sTPS for parachains prior to async backing
  • sTPS for parachains after async backing

A write-up will follow, containing specifications of the underlying VMs, the network topology, and the results.

bkchr commented May 2, 2023

This is because Versi and Zombienet handle chain specifications and sudo keys differently, which implies that account preparation differs per environment, forcing sTPS to be generic in this sense.

Not sure what is being done differently, but you can do all of this based on the chain spec, i.e. insert the sudo key and pre-funded accounts directly into genesis. Whatever tool isn't supporting this should be "fixed" to support it.

bredamatt commented Jun 8, 2023

A quick status update on this:

  • Zombienet supports parachain chain spec overrides, which means we can pre-fund accounts from genesis and run the TPS measurements both natively and with the Kubernetes provider (IMO this is the preferred approach for sTPS, rather than Versi, since the chain spec overrides are automated)
  • Helm charts for deploying the sTPS tps and sender binaries have been added to the paritytech helm chart repo by DevOps
  • A helmfile / Helm based deployment of sTPS has been created and added to GitLab for deployments in Versi
  • GHA image build pipeline has been fixed and we now publish sender, tps and funder images to the paritytech repo with tags for the metadata that is used (for example "paritytech/stps-tps:rococo-latest")
  • Prometheus metrics have been added to the tps binary and these can be seen in Versi here: (TBA)
  • We can now scrape for Parachain TPS using the tps binary
  • The sTPS sender is now multithreaded
  • Deployment architecture has been established for Parachain TPS measures - we use N senders and 1 tps binary per Parachain in zombienet, and a multithreaded sender with M threads for deployments in Versi

bredamatt commented Jun 8, 2023

The remaining work for a fully automated TPS measurement suite can be described as a script of some sort (a combination of .zndsl and JavaScript/TypeScript, for example, or a standalone crate) which supports zombienet deployments into Kubernetes with adjustable parameters for the items below (a rough sketch of such a configuration follows the two lists):

  • N senders per Parachain TPS measurement
  • Number of transactions to send from the senders, and correspondingly how many Balances accounts to generate at genesis
  • How many parachains to monitor sTPS for
  • How many validators
  • Whether to monitor sTPS for the relaychain
  • PoV size configuration
  • Memory-pool parameters
  • Other performance-related parameters (in particular related to approval voting)

Eventually we could also add these additional parameters to introduce more realism to the tests:

  • Chaos mesh parameters
  • How many Glutton parachains to deploy
  • Glutton Parachain configuration
  • How many Malus nodes to deploy
  • Malus node configuration
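
A rough sketch of what such a configuration could look like as a plain Rust type; all field names are hypothetical and only mirror the lists above:

// Hypothetical configuration type for the automated sTPS suite.
struct StpsSuiteConfig {
    senders_per_parachain: u32,
    transactions_per_sender: u64,   // also determines how many Balances accounts to pre-fund at genesis
    parachains_to_monitor: u32,
    validators: u32,
    monitor_relaychain_tps: bool,
    max_pov_size_bytes: u32,
    mempool_limits: Option<u32>,    // placeholder for memory-pool parameters
    // Optional "realism" knobs:
    glutton_parachains: u32,
    malus_nodes: u32,
}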

This could potentially also serve as a good initial use case for the zombienet SDK, @pepoviola. Overall, I think it is reasonable to have an overarching strategy for this type of observability work and to outline a set of clear scenarios we want to test for, which are:

  1. Realistic
  2. Relevant for the security assumptions of the protocol

I also suggest we define a clear CI strategy for these tests in the future. I would argue that having as much data as possible (ideally full distributions of values) is useful for this purpose, as it allows for more granular statistical analysis; however, generating that in CI might be slow, so automated test scripts which can extract such insight could help while keeping CI lightweight. For example, tighter distributions are better, because performance and behaviour are then more predictable and reproducible, indicating control.

Over time we should also evaluate other types of extrinsics and messages beyond Balances transfers (transfer_keep_alive calls specifically), such as NFTs, XCMP, and so forth.

bredamatt commented Jun 22, 2023

@rphmeier @sandreim @bkchr Here is an update on some preliminary results for the parachain TPS measurements on sync-backing.

Setup

Hardware

Note that I am running the tests natively on my MacBook Pro (M1 Pro, 32 GB RAM, 10 CPU cores) to avoid Kubernetes overhead. This could still introduce other forms of non-determinism, however. Once the Versi deployments work as expected, there may be less environmental overhead from different processes running on the same machine, but we would nevertheless have the Kubernetes overhead to take into account.

An option to avoid this type of non-determinism could be to spin up a big Linux VM in GCP and use a deterministic container such as Hermit to see if it helps, but I suspect this would not work out of the box (I have yet to test it). In general, the distributions of TPS are quite varied, which indicates non-determinism, so what causes this non-determinism should be investigated going forward.

Briefly put, I am running 2x validators, 1x collator, 1x tps monitor, and 1x multi-threaded sender binary on one MacBook Pro to execute the tests. I have some infrastructure-as-code written which can be used to spin up a big GCP VM if necessary.

Binaries

Parachain

There are three different testing parachain collator binaries being used for a comparison of the numbers:

  1. polkadot-parachain from Cumulus master, with the weight available to normal extrinsics limited to (NORMAL_DISPATCH_RATIO * MAXIMUM_BLOCK_WEIGHT), i.e. 75% of the maximum block execution time of 500ms for sync-backing.
  2. polkadot-parachain-no-max-total from Cumulus, without a limit on the weight available to normal extrinsics. Note that the only parameter changed is max_total for the normal dispatch class, which is set to None.
  3. polkadot-parachain-no-max-no-reserved from Cumulus, without a limit on the weight available to normal extrinsics and with no pre-reserved weight for the operational dispatch class.

I saw an improvement from a maximum of 1094 transfers to 1440 transfers per parablock by changing the normal dispatch limit to None, alongside the maximum observed block weight going from 61% to 73%. This seems to indicate that further configuration changes can be made to the runtime to fully consume the weight; any comments on that would be appreciated. Here I simply modify the BlockWeights::builder() call: https://github.com/paritytech/cumulus/blob/615918878eeaccb961247be297cca0033881dfae/parachains/runtimes/testing/rococo-parachain/src/lib.rs#L152-L169.
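
For reference, a hedged sketch of the kind of change this describes, following the usual FRAME BlockWeights::builder() pattern (the exact code lives in the linked rococo-parachain runtime; the constant names are the standard FRAME ones):

// Sketch only -- see the linked runtime for the real builder call.
let weights = BlockWeights::builder()
    .for_class(DispatchClass::Normal, |weights| {
        // Variant 1 (default): cap normal extrinsics at 75% of the block weight.
        // weights.max_total = Some(NORMAL_DISPATCH_RATIO * MAXIMUM_BLOCK_WEIGHT);
        // Variants 2 & 3 (no-max-total / no-max-no-reserved): lift the cap.
        weights.max_total = None;
    })
    .for_class(DispatchClass::Operational, |weights| {
        // Variant 3 only: drop the weight pre-reserved for operational extrinsics.
        weights.reserved = None;
    })
    .build_or_panic();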

I also use zombienet to override the genesis configuration for the parachain chainspec to make sure there are 100,000 pre-funded Balances pallet accounts included at start-up. I use the polkadot-stps funder binary to generate the .json file required for this.
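
For illustration, the funder's job can be approximated with a small sp_core-based sketch (this is not the actual funder code; the derivation path and output shape are assumptions):

use sp_core::{crypto::Ss58Codec, sr25519, Pair};

/// Derive `n` sr25519 accounts and pair each with a starting balance,
/// ready to be serialized into a genesis `balances` override.
fn funded_accounts(n: u32, free_balance: u128) -> Vec<(String, u128)> {
    (0..n)
        .map(|i| {
            // "//Sender/{i}" is an illustrative derivation path, not necessarily
            // the one the real funder binary uses.
            let pair = sr25519::Pair::from_string(&format!("//Sender/{i}"), None)
                .expect("static derivation string is valid");
            (pair.public().to_ss58check(), free_balance)
        })
        .collect()
}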

Relaychain

The validators run the polkadot binary, version 0.9.41-ee8aa322842.

I set the max_pov_size in the relay chain configuration to the default value of 5*1024*1024, and otherwise use the rococo-local chainspec for the remaining parameters. The other parameters (needed_approvals, max_validators_per_core, and so forth) can be adjusted if necessary as well.

Orchestration

Zombienet

I orchestrate the test simply by spawning a small Zombienet network with 2x validators running polkadot, and 1x collator running one of the polkadot-parachain binaries, depending on which runtime configuration I want to test.

TPS counter

Because subxt requires the metadata at compile time, I build the tps binary with the polkadot-parachain feature, which simply uses the pre-fetched metadata from the polkadot-parachain. Only the Balances pallet, which comes by default in most runtimes, is actually required and interacted with, so this is not critical, but it is nevertheless good practice for anyone doing performance testing of their own runtime.

Once the Zombienet network is spun up (using zombienet -p native spawn), I connect the tps binary to one of the validators and to the collator of the parachain (with id 1000).

Once connected, the tps binary scrapes for CandidateIncluded events and, in a separate thread, counts the Transfer calls in each candidate. It stops executing once the expected number of transfers has been observed.
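
For clarity, the per-parablock TPS figure is just the transfer count divided by the estimated parablock time; a minimal sketch of that calculation (illustrative, not the actual tps source):

/// TPS for one parablock: transfers counted in the candidate divided by the
/// estimated parablock time. E.g. 1173 transfers over ~12002 ms gives ~97.7 TPS,
/// matching the debug logs further below.
fn parablock_tps(transfer_count: u32, parablock_time_ms: u64) -> f64 {
    transfer_count as f64 / (parablock_time_ms as f64 / 1000.0)
}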

The command I use looks like this:

$ RUST_LOG=debug ./target/release/tps --para-finality --validator-url ws://127.0.0.1:9944 --collator-url ws://127.0.0.1:9999 --total-senders 1 --para-id 1000 --num 10000

Here --num 10000 indicates that I am expecting to count 10k transfers. Because I am using a single multi-threaded sender, I also set --total-senders to 1. When using horizontal scaling on the sender side, this number must be set accordingly.

Transfer sender

Similarly, the sender uses subxt, so I compile the sender with the polkadot-parachain feature as I am sending to the polkadot-parachain collator.

To speed things up, the sender can be scaled either horizontally or vertically, as it is now possible to allocate multiple threads to a single sender. Because I run the tests natively, I use 8 threads to send the 10k transfers to the collator, like so:

./target/release/sender --node-url ws://127.0.0.1:9999 --threads 8 --num 10000

If horizontal scaling is desired, then it can be run in this way:

./target/release/sender --node-url ws://127.0.0.1:9999 --sender-index 0 --total-senders 2 --num 10000

then simply run the second sender in a different process:

./target/release/sender --node-url ws://127.0.0.1:9999 --sender-index 1 --total-senders 2 --num 10000
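
My assumption (not verified against the sTPS source) is that --sender-index / --total-senders simply partition the pre-funded account set into disjoint slices, roughly like this:

// Hypothetical partitioning sketch: each sender instance owns a contiguous,
// disjoint slice of the pre-funded accounts.
fn sender_account_range(sender_index: usize, total_senders: usize, total_accounts: usize) -> std::ops::Range<usize> {
    let chunk = total_accounts / total_senders;
    sender_index * chunk..(sender_index + 1) * chunk
}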

The sender first executes some pre-checks on each of the Balances accounts to make sure the nonce is 0.
Thereafter, it prepares the extrinsics, signs them, and submits in parallel however many transactions the invoker requested.

Note that here, a transaction is a transfer_keep_alive call. The FRAME benchmark for the weight associated with this call is defined here: https://github.com/paritytech/substrate/blob/be58e48f609c044fb832d78cf0f85c51e894d2dc/frame/balances/src/weights.rs#L75-L85. The proof size is estimated as 3593 bytes, and the worst-case execution time is estimated as 43_933_000 ps.

Results on MacBook Pro (M1 Pro)

The logs of the sTPS scraper running in parachain mode are provided in this section. Note that for these results I submit only 10k transactions, to keep the logs short enough to post here.

It is important to highlight that the PoV sizes are low for these tests. The default test PoV sizes range between 400-600 kB, which is ~10% of the maximum PoV size of 5.24 MB. Yet the block weights observed range primarily between 50-70%, indicating that neither PoV size nor execution time appears to be the limiting factor, which is surprising: I was expecting execution time to at least reach 100%. If the FRAME-estimated proof size for the transfer_keep_alive call is ~3.6 kB, I was expecting ~1100 such calls to create a PoV size of 3.960 MB, which would be roughly at the 75%-of-MAX_POV_SIZE limit of 3.932 MB. Yet this is not reflected in the weight and PoV size seen in the collator's logs. If anyone has any pointers on this, that would be great, as I might have missed something when changing the BlockWeights::builder call. For example, could it be that the FRAME benchmarking is the limiting factor here, since the weights aren't maxed out on the parablocks? If so, then I suspect we could test a fourth binary with more optimistic weights for the transfer calls.
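
A back-of-the-envelope check on the numbers above (a sketch, not measured data) suggests the proof-size component of the weight, not execution time, is what the default configuration caps out on:

fn main() {
    // The normal dispatch class gets 75% of the ~5.24 MB PoV budget in the default runtime.
    let max_pov_size = 5.0 * 1024.0 * 1024.0; // bytes
    let normal_ratio = 0.75; // NORMAL_DISPATCH_RATIO
    let proof_size_per_transfer = 3593.0; // bytes, from the FRAME benchmark above

    // ~1094 transfers -- matching the observed maximum for the default binary.
    let max_transfers_by_proof_size = (normal_ratio * max_pov_size / proof_size_per_transfer).floor();
    println!("proof-size bound: ~{max_transfers_by_proof_size} transfers per parablock");

    // Execution time is nowhere near limiting: ~43.9 us per call vs a 500 ms block,
    // i.e. roughly 0.009% of the available ref_time per transfer.
    let ref_time_per_transfer_ps = 43_933_000.0;
    let block_ref_time_ps = 500.0 * 1_000_000_000.0; // 500 ms in picoseconds
    println!(
        "ref-time share per transfer: {:.4}%",
        100.0 * ref_time_per_transfer_ps / block_ref_time_ps
    );
}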

Default

[2023-06-22T11:45:25Z INFO  tps] Starting TPS in parachain mode
[2023-06-22T11:46:10Z INFO  tps] TPS on parablock 8: 83.34724
[2023-06-22T11:46:23Z INFO  tps] TPS on parablock 9: 91.083176
[2023-06-22T11:46:35Z INFO  tps] TPS on parablock 10: 90.99234
[2023-06-22T11:46:47Z INFO  tps] TPS on parablock 11: 91.13629
[2023-06-22T11:46:59Z INFO  tps] TPS on parablock 12: 87.93319
[2023-06-22T11:47:11Z INFO  tps] TPS on parablock 13: 91.17427
[2023-06-22T11:47:24Z INFO  tps] TPS on parablock 14: 91.18186
[2023-06-22T11:47:36Z INFO  tps] TPS on parablock 15: 90.87888
[2023-06-22T11:47:48Z INFO  tps] TPS on parablock 16: 91.440994
[2023-06-22T11:47:54Z INFO  tps] TPS on parablock 17: 24.252022
[2023-06-22T11:47:54Z INFO  tps] Average TPS is estimated as: 83.342026

The average is 83.342026. The maximum is 91.440994. The minimum is 24.252022, but as is clear, this is due to leftover transfers. The better minimum is 83.34724.

Given the maximum TPS number and an expected parablock time of 12s, the maximum number of transfers included is ~1097. We have ~1000 transfers in a single parablock on average. If there are 40 parachains and these numbers scale linearly, then in expectation this would be ~3,333 TPS for parachains.

No-max-total

[2023-06-22T11:35:49Z INFO  tps] Starting TPS in parachain mode
[2023-06-22T11:38:05Z INFO  tps] TPS on parablock 4: 81.04953
[2023-06-22T11:38:20Z INFO  tps] TPS on parablock 5: 77.54263
[2023-06-22T11:38:28Z INFO  tps] TPS on parablock 6: 93.73957
[2023-06-22T11:38:39Z INFO  tps] TPS on parablock 7: 87.773254
[2023-06-22T11:38:50Z INFO  tps] TPS on parablock 8: 80.2072
[2023-06-22T11:39:05Z INFO  tps] TPS on parablock 9: 101.16376
[2023-06-22T11:39:15Z INFO  tps] TPS on parablock 10: 85.88343
[2023-06-22T11:39:30Z INFO  tps] TPS on parablock 11: 104.25
[2023-06-22T11:39:35Z INFO  tps] TPS on parablock 12: 42.4485
[2023-06-22T11:39:35Z INFO  tps] Average TPS is estimated as: 83.78421

Average is 83.78421. The maximum is 104.25. The minimum is 42.4485 in this case, but such a low figure is also most likely due to leftover transfers as it is the last parablock. It would be more reasonable to use 77.54263 as the minimum here.

Given the maximum TPS number and an expected parablock time of 12s, the maximum number of transfers included is ~1251. We have ~1005 transfers in a single parablock on average. If there are 40 parachains and these numbers scale linearly, then in expectation this would be ~3,350 TPS for parachains.

No-max-no-reserved

For this run, the debug logs are included as well.

[2023-06-22T11:13:57Z DEBUG tps] Parablock time estimated at: 12002ms
[2023-06-22T11:13:57Z DEBUG tps] Checking extrinsics in parablock: 5
[2023-06-22T11:14:04Z DEBUG tps] Found 1173 transfers in parablock: 5
[2023-06-22T11:14:04Z INFO  tps] TPS on parablock 5: 97.73371
[2023-06-22T11:14:04Z DEBUG tps] Total transactions processed: 1173
[2023-06-22T11:14:04Z DEBUG tps] Remaining transactions to process: 8827
[2023-06-22T11:14:09Z DEBUG tps] New ParaHead: 0x46ad05c86235d04c5e2ba80c81a6e2bf4790196b0a724b45cc08ccf4b80a6f09 for ParaId: Id(1000)
[2023-06-22T11:14:09Z DEBUG tps] Received ParaHead: 0x46ad05c86235d04c5e2ba80c81a6e2bf4790196b0a724b45cc08ccf4b80a6f09
[2023-06-22T11:14:09Z DEBUG tps] Parablock time estimated at: 12016ms
[2023-06-22T11:14:09Z DEBUG tps] Checking extrinsics in parablock: 6
[2023-06-22T11:14:15Z DEBUG tps] Found 1082 transfers in parablock: 6
[2023-06-22T11:14:15Z INFO  tps] TPS on parablock 6: 90.04661
[2023-06-22T11:14:15Z DEBUG tps] Total transactions processed: 2255
[2023-06-22T11:14:15Z DEBUG tps] Remaining transactions to process: 7745
[2023-06-22T11:14:21Z DEBUG tps] New ParaHead: 0xfc689de5279cc7fe36afb2c09c50ce5621f137d087536dfc81115a8cca1b257b for ParaId: Id(1000)
[2023-06-22T11:14:21Z DEBUG tps] Received ParaHead: 0xfc689de5279cc7fe36afb2c09c50ce5621f137d087536dfc81115a8cca1b257b
[2023-06-22T11:14:21Z DEBUG tps] Parablock time estimated at: 11991ms
[2023-06-22T11:14:21Z DEBUG tps] Checking extrinsics in parablock: 7
[2023-06-22T11:14:27Z DEBUG tps] Found 1105 transfers in parablock: 7
[2023-06-22T11:14:27Z INFO  tps] TPS on parablock 7: 92.15244
[2023-06-22T11:14:27Z DEBUG tps] Total transactions processed: 3360
[2023-06-22T11:14:27Z DEBUG tps] Remaining transactions to process: 6640
[2023-06-22T11:14:33Z DEBUG tps] New ParaHead: 0x4da53a0a01a3f01f4a58309e949a3de5ea1f80b00ffd18ecb1b9dcae6d9d940f for ParaId: Id(1000)
[2023-06-22T11:14:33Z DEBUG tps] Received ParaHead: 0x4da53a0a01a3f01f4a58309e949a3de5ea1f80b00ffd18ecb1b9dcae6d9d940f
[2023-06-22T11:14:33Z DEBUG tps] Parablock time estimated at: 12001ms
[2023-06-22T11:14:33Z DEBUG tps] Checking extrinsics in parablock: 8
[2023-06-22T11:14:39Z DEBUG tps] Found 1110 transfers in parablock: 8
[2023-06-22T11:14:39Z INFO  tps] TPS on parablock 8: 92.49229
[2023-06-22T11:14:39Z DEBUG tps] Total transactions processed: 4470
[2023-06-22T11:14:39Z DEBUG tps] Remaining transactions to process: 5530
[2023-06-22T11:14:45Z DEBUG tps] New ParaHead: 0x8236e0990aaf764f43e84ff6dffca98b49697344d936f12a53e83511d7dc00cc for ParaId: Id(1000)
[2023-06-22T11:14:45Z DEBUG tps] Received ParaHead: 0x8236e0990aaf764f43e84ff6dffca98b49697344d936f12a53e83511d7dc00cc
[2023-06-22T11:14:45Z DEBUG tps] Parablock time estimated at: 11999ms
[2023-06-22T11:14:45Z DEBUG tps] Checking extrinsics in parablock: 9
[2023-06-22T11:14:50Z DEBUG tps] Found 1008 transfers in parablock: 9
[2023-06-22T11:14:50Z INFO  tps] TPS on parablock 9: 84.007
[2023-06-22T11:14:50Z DEBUG tps] Total transactions processed: 5478
[2023-06-22T11:14:50Z DEBUG tps] Remaining transactions to process: 4522
[2023-06-22T11:14:57Z DEBUG tps] New ParaHead: 0xac5d6608e5ec235c646b51d5af088c0d7cb2db2ca3ffb08dac9d47957f48ebef for ParaId: Id(1000)
[2023-06-22T11:14:57Z DEBUG tps] Received ParaHead: 0xac5d6608e5ec235c646b51d5af088c0d7cb2db2ca3ffb08dac9d47957f48ebef
[2023-06-22T11:14:57Z DEBUG tps] Parablock time estimated at: 12004ms
[2023-06-22T11:14:57Z DEBUG tps] Checking extrinsics in parablock: 10
[2023-06-22T11:15:05Z DEBUG tps] Found 1200 transfers in parablock: 10
[2023-06-22T11:15:05Z INFO  tps] TPS on parablock 10: 99.96668
[2023-06-22T11:15:05Z DEBUG tps] Total transactions processed: 6678
[2023-06-22T11:15:05Z DEBUG tps] Remaining transactions to process: 3322
[2023-06-22T11:15:09Z DEBUG tps] New ParaHead: 0x3dbe9101b6eac967c85df8152baacffd6972f01945be41854513ab226581877b for ParaId: Id(1000)
[2023-06-22T11:15:09Z DEBUG tps] Received ParaHead: 0x3dbe9101b6eac967c85df8152baacffd6972f01945be41854513ab226581877b
[2023-06-22T11:15:09Z DEBUG tps] Parablock time estimated at: 12008ms
[2023-06-22T11:15:09Z DEBUG tps] Checking extrinsics in parablock: 11
[2023-06-22T11:15:15Z DEBUG tps] Found 1062 transfers in parablock: 11
[2023-06-22T11:15:15Z INFO  tps] TPS on parablock 11: 88.44104
[2023-06-22T11:15:15Z DEBUG tps] Total transactions processed: 7740
[2023-06-22T11:15:15Z DEBUG tps] Remaining transactions to process: 2260
[2023-06-22T11:15:21Z DEBUG tps] New ParaHead: 0xaa5e4ae45ffeded00a1076b4b5527cedc4103e176ac02ee53482d5307e87a91e for ParaId: Id(1000)
[2023-06-22T11:15:21Z DEBUG tps] Received ParaHead: 0xaa5e4ae45ffeded00a1076b4b5527cedc4103e176ac02ee53482d5307e87a91e
[2023-06-22T11:15:21Z DEBUG tps] Parablock time estimated at: 11997ms
[2023-06-22T11:15:21Z DEBUG tps] Checking extrinsics in parablock: 12
[2023-06-22T11:15:29Z DEBUG tps] Found 1239 transfers in parablock: 12
[2023-06-22T11:15:29Z INFO  tps] TPS on parablock 12: 103.27582
[2023-06-22T11:15:29Z DEBUG tps] Total transactions processed: 8979
[2023-06-22T11:15:29Z DEBUG tps] Remaining transactions to process: 1021
[2023-06-22T11:15:33Z DEBUG tps] New ParaHead: 0xede17791b7982cd3d866c712c3808207dc106819129101670aef0ba28843fe25 for ParaId: Id(1000)
[2023-06-22T11:15:33Z DEBUG tps] Received ParaHead: 0xede17791b7982cd3d866c712c3808207dc106819129101670aef0ba28843fe25
[2023-06-22T11:15:33Z DEBUG tps] Parablock time estimated at: 11994ms
[2023-06-22T11:15:33Z DEBUG tps] Checking extrinsics in parablock: 13
[2023-06-22T11:15:38Z DEBUG tps] Found 1021 transfers in parablock: 13
[2023-06-22T11:15:38Z INFO  tps] TPS on parablock 13: 85.12589
[2023-06-22T11:15:38Z DEBUG tps] Total transactions processed: 10000
[2023-06-22T11:15:38Z DEBUG tps] Remaining transactions to process: 0
[2023-06-22T11:15:38Z INFO  tps] Average TPS is estimated as: 92.58239
[2023-06-22T11:15:38Z DEBUG tps] Signaled to stop execution!

Average TPS is 92.58239. The maximum TPS observed in this test is: 103.27582.
The minimum is: 84.007.

Given the maximum TPS number and an expected parablock time of 12s, the maximum number of transfers included is ~1239. We have ~1111 transfers in a single parablock on average. If there are 40 parachains and these numbers scale linearly, then in expectation this would be ~3,703 TPS for parachains.

TODOs

  • Wrap the whole testing process above in a single crate for native execution to make it easier to use and distribute
  • Extract TPS numbers and calculate some statistical measures from them (mean, variance, and so forth) when running for 100k transfers (a minimal sketch follows below).
  • Test in Versi for a more distributed setup running on reference hardware (NB! there is overhead in Kubernetes).
  • Assess where the non-determinism is coming from by profiling the collator.
  • Investigate in more detail why blocks aren't entirely full.
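
A minimal sketch for the statistics TODO above, computing the mean and sample variance over a set of per-parablock TPS observations:

fn mean_and_variance(samples: &[f64]) -> (f64, f64) {
    let n = samples.len() as f64;
    let mean = samples.iter().sum::<f64>() / n;
    // Sample (Bessel-corrected) variance.
    let variance = samples.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
    (mean, variance)
}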

rphmeier (Contributor) commented:

Yet the block weights observed range primarily between 50-70%, indicating that neither PoV size nor execution time appears to be the limiting factor, which is surprising: I was expecting execution time to at least reach 100%.

If both PoV size and execution time are significantly lower than the cap, but the weights are maxing out, that'd indicate that the weight cap per block is off - which may just be an artifact of benchmarking.

Sophia-Gold self-assigned this and unassigned bredamatt on Jul 3, 2023
Sophia-Gold transferred this issue from paritytech/polkadot on Aug 24, 2023
the-right-joyce added the T10-tests (This PR/Issue is related to tests.) and T8-parachains_engineering labels and removed the I5-tests label on Aug 25, 2023
eskimor commented Oct 20, 2023

Is this done?

Sophia-Gold (Contributor) commented:

Is this done?

Yes. We've figured out how to run sTPS accurately for parachains. As for the other goal, async backing on Kusama, I think @sandreim has tested 6s block times with somewhere around 50-80 cores and some gluttons.
