test: adding big block tests to the main branch #3612

Merged: 83 commits, Jul 8, 2024

Commits
f786710
increases grpc max receive message size
staheri14 Jun 7, 2024
f522545
increases tx load per txsim
staheri14 Jun 7, 2024
5f69e38
sets latency to false
staheri14 Jun 7, 2024
e40282a
fixes a bug
staheri14 Jun 7, 2024
21ae2b4
updates server's grpc config
staheri14 Jun 7, 2024
29bd8fd
adds comments about the updated grpc configs
staheri14 Jun 7, 2024
39ea235
adds big block tests manifest
staheri14 Jun 21, 2024
006c047
adds two nodes test
staheri14 Jun 21, 2024
ec16b8f
moves all the post test checks to a method of the benchmark
staheri14 Jun 21, 2024
32ff915
invokes check results in the two node simple test
staheri14 Jun 21, 2024
7df7375
increases the sleep time to 1 s
staheri14 Jun 21, 2024
0ccce39
reduces test duration
staheri14 Jun 21, 2024
bef7743
comments out start
staheri14 Jun 21, 2024
1e267d2
uncomments the Start portion
staheri14 Jun 21, 2024
cd68725
uses a different branch for validator and txclient images
staheri14 Jun 21, 2024
b725abc
Merge branch 'main' into sanaz/grpc-error-fix
staheri14 Jun 21, 2024
afb65e2
deletes excess newline
staheri14 Jun 21, 2024
7ed141a
uses default grpc setting for txsim
staheri14 Jun 21, 2024
08eaad4
increases test duration
staheri14 Jun 21, 2024
ca311f0
comments out unused const
staheri14 Jun 21, 2024
b09af7e
increases grpcMaxRecvMsgSize
staheri14 Jun 21, 2024
3d9ac3e
increases grpcMaxSendMsgSize
staheri14 Jun 21, 2024
24c9475
reformats a few lines
staheri14 Jun 21, 2024
d13014a
increases TwoNodeSimple test time
staheri14 Jun 21, 2024
c120d6e
Merge remote-tracking branch 'origin/main' into sanaz/grpc-error-fix
staheri14 Jun 21, 2024
3674afe
points back to the old pr
staheri14 Jun 21, 2024
0296951
Merge branch 'sanaz/grpc-error-fix' into sanaz/migrate-big-block-tests
staheri14 Jun 21, 2024
2d4e991
changes txclient and app version
staheri14 Jun 21, 2024
aa83908
increases TwoNodeSimple test duration
staheri14 Jun 21, 2024
72525d7
Merge remote-tracking branch 'origin/main' into sanaz/migrate-big-blo…
staheri14 Jun 24, 2024
e2a3074
Merge remote-tracking branch 'origin/main' into sanaz/migrate-big-blo…
staheri14 Jun 24, 2024
07cd982
includes larger network size tests
staheri14 Jun 24, 2024
d157862
adds more logs for the state of genesis nodes
staheri14 Jun 24, 2024
16d9bad
Merge branch 'main' into sanaz/migrate-big-block-tests
staheri14 Jun 25, 2024
bf23951
removes underscore from test names
staheri14 Jun 25, 2024
f62e6bc
adds another log message for when the node is in sync
staheri14 Jun 25, 2024
abf2fd5
corrects test names
staheri14 Jun 26, 2024
73632ee
introduces testName for each test
staheri14 Jun 26, 2024
eccbc46
fixes a log message
staheri14 Jun 27, 2024
40187d6
retracts the original waiting time
staheri14 Jun 27, 2024
cda3035
pulls block summary
staheri14 Jun 27, 2024
8b5ff94
decrease number of sequences
staheri14 Jun 27, 2024
66a5bf7
uses pull block summary
staheri14 Jun 27, 2024
1cac853
increases test duration
staheri14 Jun 27, 2024
3bb91cc
decreases blob sequences
staheri14 Jun 27, 2024
57a9b5e
adds testName for TwoNodeSimple
staheri14 Jun 27, 2024
9293571
extends TwoNodeSimple run time
staheri14 Jun 27, 2024
ab54060
sets sequences to 25 per validator
staheri14 Jun 27, 2024
c8e065a
increases grpc message sizes
staheri14 Jun 27, 2024
f84e805
updates setup configs
staheri14 Jun 27, 2024
7a5a3cd
retracts old grpc settings
staheri14 Jun 27, 2024
10dd879
fixes format of logged messages
staheri14 Jun 27, 2024
d0cc39c
reverts sequences back to 60
staheri14 Jun 27, 2024
6da8cd6
introduces M(i)B and G(i)B consts
staheri14 Jun 27, 2024
fbbd488
includes units of txsim grpc configs
staheri14 Jun 27, 2024
7aa56b8
adjusts latency to 70ms
staheri14 Jun 27, 2024
1364730
uses latest version for txsim
staheri14 Jun 27, 2024
9a53cfc
checks the max block size instead of tx count
staheri14 Jun 27, 2024
584d1bd
makes expected block size a param
staheri14 Jun 27, 2024
ba54a85
formulates expected block size
staheri14 Jun 27, 2024
5ecdc5f
includes 100 node test
staheri14 Jun 27, 2024
fea9b1b
updates a chain id
staheri14 Jun 27, 2024
f9f1d00
revises old comments
staheri14 Jun 27, 2024
113db47
fixes a stale comment
staheri14 Jun 27, 2024
51b05e0
adds a todo
staheri14 Jun 27, 2024
a54791a
refactors the test names
staheri14 Jun 27, 2024
e31ae70
Merge remote-tracking branch 'origin/main' into sanaz/migrate-big-blo…
staheri14 Jun 27, 2024
04261a1
revises chain id names to match the network size
staheri14 Jun 27, 2024
640ac4d
renames mib and mb
staheri14 Jun 27, 2024
265b7fe
introduces a function to generate chain id
staheri14 Jun 27, 2024
6e780f5
changes unit
staheri14 Jun 27, 2024
16f694b
indicates block size units
staheri14 Jun 27, 2024
187ec04
fixes an issue
staheri14 Jun 27, 2024
e8b5dfd
handles error of obtaining push configs from env vars
staheri14 Jun 28, 2024
9fbb325
fixes an issue
staheri14 Jul 2, 2024
e158a07
shortens test duration
staheri14 Jul 2, 2024
06af840
Merge remote-tracking branch 'origin/main' into sanaz/migrate-big-blo…
staheri14 Jul 2, 2024
77042f8
uses txsimVersion
staheri14 Jul 2, 2024
de0b76f
reduces code duplication
staheri14 Jul 3, 2024
6fe58f7
brings back PullRoundStateTraces to make the PR non-breaking
staheri14 Jul 3, 2024
d93c2aa
uses testnet constants for byte units
staheri14 Jul 3, 2024
599f718
removes unused constants
staheri14 Jul 3, 2024
0a4fdc1
Merge remote-tracking branch 'origin/main' into sanaz/migrate-big-blo…
staheri14 Jul 5, 2024
56 changes: 56 additions & 0 deletions test/e2e/benchmark/benchmark.go
@@ -2,11 +2,14 @@
package main

import (
"context"
"fmt"
"log"
"time"

"github.com/celestiaorg/celestia-app/v2/pkg/appconsts"
"github.com/celestiaorg/celestia-app/v2/test/e2e/testnet"
"github.com/celestiaorg/celestia-app/v2/test/util/testnode"
"github.com/tendermint/tendermint/pkg/trace"
)

@@ -118,3 +121,56 @@ func (b *BenchmarkTest) Run() error {

return nil
}

func (b *BenchmarkTest) CheckResults(expectedBlockSizeBytes int64) error {
log.Println("Checking results")

// if local tracing was enabled,
// pull block summary table from one of the nodes to confirm tracing
// has worked properly.
if b.manifest.LocalTracingType == "local" {
if _, err := b.Node(0).PullBlockSummaryTraces("."); err != nil {
return fmt.Errorf("failed to pull traces: %w", err)
}
}

// download traces from S3, if enabled
if b.manifest.PushTrace && b.manifest.DownloadTraces {
// download traces from S3
pushConfig, err := trace.GetPushConfigFromEnv()
if err != nil {
return fmt.Errorf("failed to get push config: %w", err)
}
err = trace.S3Download("./traces/", b.manifest.ChainID,
pushConfig)
if err != nil {
return fmt.Errorf("failed to download traces from S3: %w", err)
}
}

log.Println("Reading blockchain")
blockchain, err := testnode.ReadBlockchain(context.Background(),
b.Node(0).AddressRPC())
testnet.NoError("failed to read blockchain", err)

targetSizeReached := false
maxBlockSize := int64(0)
for _, block := range blockchain {
if appconsts.LatestVersion != block.Version.App {
return fmt.Errorf("expected app version %d, got %d", appconsts.LatestVersion, block.Version.App)
}
size := int64(block.Size())
if size >= expectedBlockSizeBytes {
targetSizeReached = true
break
}
if size > maxBlockSize {
maxBlockSize = size
}
Comment on lines +167 to +169
Collaborator

nit: because of the break, I don't think these lines are executed when the previous conditional evaluates to true. Consider moving them above that conditional so that maxBlockSize also accounts for the block that reaches expectedBlockSizeBytes.

Contributor Author

The current version is also correct. I see the source of the confusion: the value of maxBlockSize is only used when the target size is not reached. If size >= expectedBlockSizeBytes never evaluates to true, maxBlockSize correctly reflects the largest block size observed. If it does evaluate to true, the value of maxBlockSize does not matter. Nevertheless, to avoid future confusion, I will apply your suggestion in a follow-up PR, thanks!

Contributor Author

Please see #3670

}
if !targetSizeReached {
return fmt.Errorf("max reached block size is %d byte and is not within the expected range of %d and %d bytes", maxBlockSize, expectedBlockSizeBytes, b.manifest.MaxBlockBytes)
}

return nil
}
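
The reordering proposed in the review thread on CheckResults can be sketched in isolation. This is a minimal illustration, not the code from the follow-up PR: the function name maxAndTarget and the use of a plain slice of sizes are assumptions made for the example. Updating the running maximum before testing the target means the reported maximum also covers the block that satisfies the target.

```go
package main

import "fmt"

// maxAndTarget is a hypothetical helper. It scans block sizes in order and
// returns the largest size seen up to and including the first block that
// meets the target, plus whether the target was reached.
func maxAndTarget(sizes []int64, target int64) (maxSize int64, reached bool) {
	for _, size := range sizes {
		// Update the running maximum first so that it also accounts for
		// the block that reaches the target.
		if size > maxSize {
			maxSize = size
		}
		if size >= target {
			return maxSize, true
		}
	}
	return maxSize, false
}

func main() {
	maxSize, reached := maxAndTarget([]int64{100, 250, 900, 400}, 800)
	fmt.Println(maxSize, reached)

	maxSize, reached = maxAndTarget([]int64{100, 250, 400}, 800)
	fmt.Println(maxSize, reached)
}
```

With this ordering the maximum is meaningful in both outcomes, which removes the need to reason about when its value is used.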
7 changes: 7 additions & 0 deletions test/e2e/benchmark/main.go
@@ -11,6 +11,13 @@ func main() {

tests := []Test{
{"TwoNodeSimple", TwoNodeSimple},
{"TwoNodeBigBlock8MB", TwoNodeBigBlock8MB},
{"TwoNodeBigBlock32MB", TwoNodeBigBlock32MB},
{"TwoNodeBigBlock8MBLatency", TwoNodeBigBlock8MBLatency},
{"TwoNodeBigBlock64MB", TwoNodeBigBlock64MB},
{"LargeNetworkBigBlock8MB", LargeNetworkBigBlock8MB},
{"LargeNetworkBigBlock32MB", LargeNetworkBigBlock32MB},
{"LargeNetworkBigBlock64MB", LargeNetworkBigBlock64MB},
}

// check the test name passed as an argument and run it
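
The rest of main is collapsed in this view, so the following is only a sketch of how a name-to-function table like the one above could be dispatched from a command-line argument. The field names Name and Func, the helper findTest, and the Noop entry are all assumptions for illustration and are not taken from the PR.

```go
package main

import (
	"fmt"
	"log"
	"os"
)

// Test pairs a test name with the function that runs it.
// The field names are assumed; the diff only shows positional literals.
type Test struct {
	Name string
	Func func(*log.Logger) error
}

// findTest is a hypothetical helper that looks a test up by name.
func findTest(tests []Test, name string) (Test, bool) {
	for _, candidate := range tests {
		if candidate.Name == name {
			return candidate, true
		}
	}
	return Test{}, false
}

func main() {
	logger := log.New(os.Stdout, "bench ", log.LstdFlags)
	tests := []Test{
		{"Noop", func(l *log.Logger) error { l.Println("running Noop"); return nil }},
	}

	// Default to the only registered test when no argument is given.
	name := "Noop"
	if len(os.Args) > 1 {
		name = os.Args[1]
	}

	selected, ok := findTest(tests, name)
	if !ok {
		fmt.Fprintf(os.Stderr, "unknown test %q\n", name)
		os.Exit(1)
	}
	if err := selected.Func(logger); err != nil {
		logger.Fatalf("test %s failed: %v", selected.Name, err)
	}
}
```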
19 changes: 19 additions & 0 deletions test/e2e/benchmark/manifest.go
@@ -1,6 +1,7 @@
package main

import (
"fmt"
"time"

"github.com/celestiaorg/celestia-app/v2/app"
@@ -95,3 +96,21 @@ func (m *Manifest) GetConsensusParams() *tmproto.ConsensusParams {
cparams.Block.MaxBytes = m.MaxBlockBytes
return cparams
}

// summary generates a summary of the Manifest struct to be used as chain id.
func (m *Manifest) summary() string {
latency := 0
if m.EnableLatency {
latency = 1
}
maxBlockMB := m.MaxBlockBytes / testnet.MB
summary := fmt.Sprintf("v%d-t%d-b%d-bw%dmb-tc%d-tp%d-l%d-%s-%dmb",
m.Validators, m.TxClients,
m.BlobSequences, m.PerPeerBandwidth/testnet.MB,
m.TimeoutCommit/time.Second, m.TimeoutPropose/time.Second,
latency, m.Mempool, maxBlockMB)
if len(summary) > 50 {
return summary[:50]
}
return summary
}
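
To make the chain ID format concrete, here is a self-contained sketch of the same formatting and truncation logic. The trimmed manifest struct, its field types, and the helper bigBlock8MB are assumptions for the example; only the format string, the 50-character cap, and the MB value come from the diff.

```go
package main

import (
	"fmt"
	"time"
)

// MB matches the decimal megabyte constant added to testnet in this PR.
const MB = 1000 * 1000

// manifest is a trimmed, hypothetical stand-in for the Manifest struct.
type manifest struct {
	Validators, TxClients, BlobSequences int
	PerPeerBandwidth, MaxBlockBytes      int64
	TimeoutCommit, TimeoutPropose        time.Duration
	EnableLatency                        bool
	Mempool                              string
}

// summary mirrors the formatting in the diff: key fields joined into one
// string, truncated to at most 50 characters.
func (m manifest) summary() string {
	latency := 0
	if m.EnableLatency {
		latency = 1
	}
	s := fmt.Sprintf("v%d-t%d-b%d-bw%dmb-tc%d-tp%d-l%d-%s-%dmb",
		m.Validators, m.TxClients, m.BlobSequences, m.PerPeerBandwidth/MB,
		m.TimeoutCommit/time.Second, m.TimeoutPropose/time.Second,
		latency, m.Mempool, m.MaxBlockBytes/MB)
	if len(s) > 50 {
		return s[:50]
	}
	return s
}

// bigBlock8MB returns values that follow bigBlockManifest with an 8 MB limit.
func bigBlock8MB() manifest {
	return manifest{
		Validators: 2, TxClients: 2, BlobSequences: 60,
		PerPeerBandwidth: 5 * MB, MaxBlockBytes: 8 * MB,
		TimeoutCommit: 11 * time.Second, TimeoutPropose: 80 * time.Second,
		Mempool: "v1",
	}
}

func main() {
	fmt.Println(bigBlock8MB().summary()) // v2-t2-b60-bw5mb-tc11-tp80-l0-v1-8mb
}
```

Because the string is cut at 50 characters, two manifests that differ only in trailing fields could in principle map to the same chain ID once the earlier fields grow long enough.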
163 changes: 120 additions & 43 deletions test/e2e/benchmark/throughput.go
@@ -1,26 +1,65 @@
package main

import (
"context"
"fmt"
"log"
"time"

"github.com/celestiaorg/celestia-app/v2/pkg/appconsts"
"github.com/celestiaorg/celestia-app/v2/test/e2e/testnet"
"github.com/celestiaorg/celestia-app/v2/test/util/testnode"
"github.com/tendermint/tendermint/pkg/trace"
)

const (
seed = 42
)

var bigBlockManifest = Manifest{
ChainID: "test",
Validators: 2,
TxClients: 2,
ValidatorResource: testnet.Resources{
MemoryRequest: "12Gi",
MemoryLimit: "12Gi",
CPU: "8",
Volume: "20Gi",
},
TxClientsResource: testnet.Resources{
MemoryRequest: "1Gi",
MemoryLimit: "3Gi",
CPU: "2",
Volume: "1Gi",
},
SelfDelegation: 10000000,
// @TODO Update the CelestiaAppVersion and TxClientVersion to the latest
// version of the main branch once the PR#3261 is merged by addressing this
// issue https://github.com/celestiaorg/celestia-app/issues/3603.
CelestiaAppVersion: "pr-3261",
TxClientVersion: "pr-3261",
EnableLatency: false,
LatencyParams: LatencyParams{70, 0}, // in milliseconds
BlobSequences: 60,
BlobsPerSeq: 6,
BlobSizes: "200000",
PerPeerBandwidth: 5 * testnet.MB,
UpgradeHeight: 0,
TimeoutCommit: 11 * time.Second,
TimeoutPropose: 80 * time.Second,
Mempool: "v1", // ineffective as it always defaults to v1
BroadcastTxs: true,
Prometheus: false,
GovMaxSquareSize: 512,
MaxBlockBytes: 7800000,
TestDuration: 5 * time.Minute,
LocalTracingType: "local",
PushTrace: true,
}

func TwoNodeSimple(logger *log.Logger) error {
latestVersion, err := testnet.GetLatestVersion()
testnet.NoError("failed to get latest version", err)

logger.Println("=== RUN TwoNodeSimple", "version:", latestVersion)
testName := "TwoNodeSimple"
logger.Printf("Running %s\n", testName)
logger.Println("version", latestVersion)
Collaborator

is this missing a formatting directive?

Suggested change
logger.Println("version", latestVersion)
logger.Printf("version %v\n", latestVersion)

Contributor Author

They do produce identical output:
Println separates its operands with spaces, and with Printf I get the same result through the format string. Here is a sample output from the tests:

test-e2e-benchmark2024/07/08 11:34:05 Running TwoNodeSimple
test-e2e-benchmark2024/07/08 11:34:05 version 8caa580

The two calls are merely inconsistent in style, which I can fix in a follow-up PR.

Contributor Author

Please see #3670


manifest := Manifest{
ChainID: "test-e2e-two-node-simple",
@@ -31,11 +70,11 @@ func TwoNodeSimple(logger *log.Logger) error {
CelestiaAppVersion: latestVersion,
TxClientVersion: testnet.TxsimVersion,
EnableLatency: false,
LatencyParams: LatencyParams{100, 10}, // in milliseconds
LatencyParams: LatencyParams{70, 0}, // in milliseconds
BlobsPerSeq: 6,
BlobSequences: 50,
BlobSequences: 60,
BlobSizes: "200000",
PerPeerBandwidth: 5 * 1024 * 1024,
PerPeerBandwidth: 5 * testnet.MB,
UpgradeHeight: 0,
TimeoutCommit: 1 * time.Second,
TimeoutPropose: 1 * time.Second,
@@ -47,11 +86,11 @@ func TwoNodeSimple(logger *log.Logger) error {
LocalTracingType: "local",
PushTrace: false,
DownloadTraces: false,
TestDuration: 2 * time.Minute,
TestDuration: 3 * time.Minute,
TxClients: 2,
}

benchTest, err := NewBenchmarkTest("E2EThroughput", &manifest)
benchTest, err := NewBenchmarkTest(testName, &manifest)
testnet.NoError("failed to create benchmark test", err)

defer func() {
@@ -63,42 +102,80 @@ func TwoNodeSimple(logger *log.Logger) error {

testnet.NoError("failed to run the benchmark test", benchTest.Run())

// post test data collection and validation
testnet.NoError("failed to check results", benchTest.CheckResults(1*testnet.MB))

// if local tracing is enabled,
// pull round state traces to confirm tracing is working as expected.
if benchTest.manifest.LocalTracingType == "local" {
if _, err := benchTest.Node(0).PullRoundStateTraces("."); err != nil {
return fmt.Errorf("failed to pull round state traces: %w", err)
}
}
return nil
}
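
The review exchange in TwoNodeSimple about logger.Println versus logger.Printf can be checked with a small standalone program. This is a sketch: the helper name render is made up for the example, and the loggers are created without prefix or timestamp flags so that the two outputs can be compared directly.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
)

// render is a hypothetical helper that returns what Println and Printf
// write for the same label and value.
func render(label string, value any) (string, string) {
	var viaPrintln, viaPrintf bytes.Buffer
	// Println inserts a space between operands and appends a newline.
	log.New(&viaPrintln, "", 0).Println(label, value)
	// Printf needs the space and the verb spelled out in the format string.
	log.New(&viaPrintf, "", 0).Printf("%s %v\n", label, value)
	return viaPrintln.String(), viaPrintf.String()
}

func main() {
	a, b := render("version", "8caa580")
	fmt.Printf("%q\n%q\nidentical: %v\n", a, b, a == b)
}
```

Both forms emit the same line, so the choice between them is a matter of consistency rather than correctness.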

// download traces from S3, if enabled
if benchTest.manifest.PushTrace && benchTest.manifest.DownloadTraces {
// download traces from S3
pushConfig, _ := trace.GetPushConfigFromEnv()
err := trace.S3Download("./traces/", benchTest.manifest.ChainID,
pushConfig)
if err != nil {
return fmt.Errorf("failed to download traces from S3: %w", err)
}
}
func runBenchmarkTest(logger *log.Logger, testName string, manifest Manifest) error {
logger.Printf("Running %s\n", testName)
manifest.ChainID = manifest.summary()
log.Println("ChainID: ", manifest.ChainID)
benchTest, err := NewBenchmarkTest(testName, &manifest)
testnet.NoError("failed to create benchmark test", err)

log.Println("Reading blockchain")
blockchain, err := testnode.ReadBlockchain(context.Background(),
benchTest.Node(0).AddressRPC())
testnet.NoError("failed to read blockchain", err)

totalTxs := 0
for _, block := range blockchain {
if appconsts.LatestVersion != block.Version.App {
return fmt.Errorf("expected app version %d, got %d", appconsts.LatestVersion, block.Version.App)
}
totalTxs += len(block.Data.Txs)
}
if totalTxs < 10 {
return fmt.Errorf("expected at least 10 transactions, got %d", totalTxs)
}
defer func() {
log.Print("Cleaning up testnet")
benchTest.Cleanup()
}()

testnet.NoError("failed to setup nodes", benchTest.SetupNodes())
testnet.NoError("failed to run the benchmark test", benchTest.Run())
expectedBlockSize := int64(0.90 * float64(manifest.MaxBlockBytes))
testnet.NoError("failed to check results", benchTest.CheckResults(expectedBlockSize))

return nil
}
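
The 90% threshold that runBenchmarkTest passes to CheckResults works out as follows for the block sizes used in this PR. The helper name expectedBlockSize is an assumption for the sketch; the expression inside it is the one from the diff, and MB is the decimal constant from testnet.

```go
package main

import "fmt"

// MB matches the decimal megabyte constant added to testnet in this PR.
const MB = 1000 * 1000

// expectedBlockSize is a hypothetical wrapper around the expression used in
// runBenchmarkTest: 90% of the configured maximum block size, truncated to
// a whole number of bytes.
func expectedBlockSize(maxBlockBytes int64) int64 {
	return int64(0.90 * float64(maxBlockBytes))
}

func main() {
	for _, mb := range []int64{8, 32, 64} {
		fmt.Printf("%d MB limit: at least one block of %d bytes or more is expected\n",
			mb, expectedBlockSize(mb*MB))
	}
}
```

A test therefore passes as soon as a single block reaches 90% of the configured limit, for example 7,200,000 bytes for the 8 MB tests.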

func TwoNodeBigBlock8MB(logger *log.Logger) error {
manifest := bigBlockManifest
manifest.MaxBlockBytes = 8 * testnet.MB
return runBenchmarkTest(logger, "TwoNodeBigBlock8MB", manifest)
}

func TwoNodeBigBlock8MBLatency(logger *log.Logger) error {
manifest := bigBlockManifest
manifest.MaxBlockBytes = 8 * testnet.MB
manifest.EnableLatency = true
manifest.LatencyParams = LatencyParams{70, 0}
return runBenchmarkTest(logger, "TwoNodeBigBlock8MBLatency", manifest)
}

func TwoNodeBigBlock32MB(logger *log.Logger) error {
manifest := bigBlockManifest
manifest.MaxBlockBytes = 32 * testnet.MB
return runBenchmarkTest(logger, "TwoNodeBigBlock32MB", manifest)
}

func TwoNodeBigBlock64MB(logger *log.Logger) error {
manifest := bigBlockManifest
manifest.MaxBlockBytes = 64 * testnet.MB
return runBenchmarkTest(logger, "TwoNodeBigBlock64MB", manifest)
}

func LargeNetworkBigBlock8MB(logger *log.Logger) error {
manifest := bigBlockManifest
manifest.MaxBlockBytes = 8 * testnet.MB
manifest.Validators = 50
manifest.TxClients = 50
manifest.BlobSequences = 2
return runBenchmarkTest(logger, "LargeNetworkBigBlock8MB", manifest)
}

func LargeNetworkBigBlock32MB(logger *log.Logger) error {
manifest := bigBlockManifest
manifest.MaxBlockBytes = 32 * testnet.MB
manifest.Validators = 50
manifest.TxClients = 50
manifest.BlobSequences = 2
return runBenchmarkTest(logger, "LargeNetworkBigBlock32MB", manifest)
}

func LargeNetworkBigBlock64MB(logger *log.Logger) error {
manifest := bigBlockManifest
manifest.MaxBlockBytes = 64 * testnet.MB
manifest.Validators = 50
manifest.TxClients = 50
manifest.BlobSequences = 2
return runBenchmarkTest(logger, "LargeNetworkBigBlock64MB", manifest)
}
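
Each test function above starts from bigBlockManifest and overrides a few fields. That pattern relies on Go copying a struct on assignment, which the sketch below demonstrates with a simplified, hypothetical config type. Note the caveat that the copy is shallow: if the real Manifest held slices, maps, or pointers, those would still be shared between copies.

```go
package main

import "fmt"

// config is a simplified, hypothetical stand-in for Manifest.
type config struct {
	Validators    int
	MaxBlockBytes int64
}

// base plays the role of bigBlockManifest.
var base = config{Validators: 2, MaxBlockBytes: 7800000}

// largeNetwork derives a variant without touching base.
func largeNetwork() config {
	c := base // struct assignment copies the value
	c.Validators = 50
	c.MaxBlockBytes = 8 * 1000 * 1000
	return c
}

func main() {
	variant := largeNetwork()
	fmt.Println("variant:", variant.Validators, variant.MaxBlockBytes)
	fmt.Println("base:   ", base.Validators, base.MaxBlockBytes)
}
```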
8 changes: 7 additions & 1 deletion test/e2e/testnet/defaults.go
@@ -7,4 +7,10 @@ var DefaultResources = Resources{
Volume: "1Gi",
}

const TxsimVersion = "pr-3541"
const (
TxsimVersion = "pr-3541"
MB = 1000 * 1000
GB = 1000 * MB
MiB = 1024 * 1024
GiB = 1024 * MiB
)
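
The decimal (MB, GB) and binary (MiB, GiB) constants differ by a few percent, which matters when a block size limit is expressed in one unit and a message size limit in the other. The sketch below repeats the constants from the diff; the helper diffBytes is made up for illustration.

```go
package main

import "fmt"

// Constants as introduced in test/e2e/testnet/defaults.go.
const (
	MB  = 1000 * 1000
	GB  = 1000 * MB
	MiB = 1024 * 1024
	GiB = 1024 * MiB
)

// diffBytes is a hypothetical helper that returns how many more bytes
// n MiB holds than n MB.
func diffBytes(n int64) int64 {
	return n*MiB - n*MB
}

func main() {
	fmt.Println("8 MB    =", 8*MB, "bytes")
	fmt.Println("8 MiB   =", 8*MiB, "bytes")
	fmt.Println("gap     =", diffBytes(8), "bytes")
	fmt.Println("128 MiB =", 128*MiB, "bytes")
}
```

In this PR the block size limits use MB while the mempool and gRPC limits use MiB and GiB.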
14 changes: 14 additions & 0 deletions test/e2e/testnet/node.go
@@ -63,6 +63,20 @@ func (n *Node) PullRoundStateTraces(path string) ([]trace.Event[schema.RoundStat
return nil, nil
}

// PullBlockSummaryTraces retrieves the block summary traces from a node.
// It will save them to the provided path.
func (n *Node) PullBlockSummaryTraces(path string) ([]trace.Event[schema.BlockSummary], error,
) {
addr := n.AddressTracing()
log.Info().Str("Address", addr).Msg("Pulling block summary traces")

err := trace.GetTable(addr, schema.BlockSummary{}.Table(), path)
if err != nil {
return nil, fmt.Errorf("getting table: %w", err)
}
return nil, nil
}

// Resources defines the resource requirements for a Node.
type Resources struct {
// MemoryRequest specifies the initial memory allocation for the Node.
9 changes: 6 additions & 3 deletions test/e2e/testnet/setup.go
@@ -13,7 +13,10 @@ import (
)

func MakeConfig(node *Node, opts ...Option) (*config.Config, error) {
cfg := config.DefaultConfig()
cfg := app.DefaultConsensusConfig()
cfg.TxIndex.Indexer = "kv"
cfg.Mempool.MaxTxsBytes = 1 * GiB
cfg.Mempool.MaxTxBytes = 8 * MiB
cfg.Moniker = node.Name
cfg.RPC.ListenAddress = "tcp://0.0.0.0:26657"
cfg.P2P.ExternalAddress = fmt.Sprintf("tcp://%v", node.AddressP2P(false))
@@ -95,7 +98,7 @@ func MakeAppConfig(_ *Node) (*serverconfig.Config, error) {
srvCfg.MinGasPrices = fmt.Sprintf("0.001%s", app.BondDenom)
// updating MaxRecvMsgSize and MaxSendMsgSize allows submission of 128MiB worth of
// transactions simultaneously which is useful for big block tests.
srvCfg.GRPC.MaxRecvMsgSize = 128 * 1024 * 1024
srvCfg.GRPC.MaxSendMsgSize = 128 * 1024 * 1024
srvCfg.GRPC.MaxRecvMsgSize = 128 * MiB
srvCfg.GRPC.MaxSendMsgSize = 128 * MiB
return srvCfg, srvCfg.ValidateBasic()
}