[docker] release aptos-node and aptos-indexer-grpc separately #10664

Merged: 1 commit, Oct 25, 2023


rustielin

Description

Release aptos-node and aptos-indexer-grpc separately so that the indexer gRPC can have its own release process. This decouples the versioning of the node and the indexer.

This PR introduces "release groups" in the release-images.mjs tool. A release group is a set of images that are released together:

  • aptos-node: validator (incl. validator-testing), faucet, tools
  • aptos-indexer-grpc: indexer-grpc
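
The grouping above could be modeled roughly as follows. This is a minimal sketch, not the actual code in release-images.mjs: the names `IMAGES_BY_RELEASE_GROUP`, `DEFAULT_RELEASE_GROUP`, `releaseGroupForTagPrefix`, and `imagesForTagPrefix` are hypothetical, and the matching rule (a tag prefix that starts with a group name selects that group, anything else falls back to aptos-node) is an assumption inferred from the dry-run output in the test plan.

```javascript
// Hypothetical sketch: identifiers and matching rule are assumptions,
// not taken from the real release-images.mjs.
const DEFAULT_RELEASE_GROUP = "aptos-node";

// Release group -> images released together (from the PR description).
const IMAGES_BY_RELEASE_GROUP = {
  "aptos-node": ["validator", "validator-testing", "faucet", "tools"],
  "aptos-indexer-grpc": ["indexer-grpc"],
};

// Pick the release group whose name the tag prefix starts with.
// Network tags such as "devnet" match no group and use the default.
function releaseGroupForTagPrefix(tagPrefix) {
  const match = Object.keys(IMAGES_BY_RELEASE_GROUP).find((group) =>
    tagPrefix.startsWith(group)
  );
  return match ?? DEFAULT_RELEASE_GROUP;
}

// Resolve the list of image names to release for a given tag prefix.
function imagesForTagPrefix(tagPrefix) {
  return IMAGES_BY_RELEASE_GROUP[releaseGroupForTagPrefix(tagPrefix)];
}

console.log(releaseGroupForTagPrefix("devnet"));
console.log(imagesForTagPrefix("aptos-indexer-grpc-vA.B.C"));
```

Keeping the default group explicit means existing network tags keep releasing the node images without any change to how they are invoked.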

Test Plan

Add a new dry-run mode (--dry-run) and test with it.

Release with an Aptos network tag (here, devnet). This releases aptos-node:

$ IMAGE_TAG_PREFIX=devnet AWS_ACCOUNT_ID=bla GCP_DOCKER_ARTIFACT_REPO_US=bla GCP_DOCKER_ARTIFACT_REPO=bla GIT_SHA=bla ./docker/release-images.mjs --wait-for-image-seconds=3600 --dry-run
Lockfile is up to date, resolution step is skipped
Already up to date
Done in 386ms
$ command -v crane
/opt/homebrew/bin/crane
INFO: dry run: true
INFO: image release group: aptos-node
INFO: image names to release: ["validator","validator-testing","faucet","tools"]
INFO: copying bla/validator:performance_bla to bla/validator:devnet_performance
INFO: copying bla/validator:performance_bla to bla/validator:devnet_performance
INFO: copying bla/validator:performance_bla to bla.dkr.ecr.us-west-2.amazonaws.com/aptos/validator:devnet_performance
INFO: copying bla/validator:performance_bla to docker.io/aptoslabs/validator:devnet_performance
INFO: copying bla/validator:bla to bla/validator:devnet
INFO: copying bla/validator:bla to bla/validator:devnet
INFO: copying bla/validator:bla to bla.dkr.ecr.us-west-2.amazonaws.com/aptos/validator:devnet
INFO: copying bla/validator:bla to docker.io/aptoslabs/validator:devnet
INFO: copying bla/validator-testing:performance_bla to bla/validator-testing:devnet_performance
INFO: copying bla/validator-testing:performance_bla to bla/validator-testing:devnet_performance
INFO: copying bla/validator-testing:performance_bla to bla.dkr.ecr.us-west-2.amazonaws.com/aptos/validator-testing:devnet_performance
INFO: copying bla/validator-testing:bla to bla/validator-testing:devnet
INFO: copying bla/validator-testing:bla to bla/validator-testing:devnet
INFO: copying bla/validator-testing:bla to bla.dkr.ecr.us-west-2.amazonaws.com/aptos/validator-testing:devnet
INFO: copying bla/faucet:bla to bla/faucet:devnet
INFO: copying bla/faucet:bla to bla/faucet:devnet
INFO: copying bla/faucet:bla to bla.dkr.ecr.us-west-2.amazonaws.com/aptos/faucet:devnet
INFO: copying bla/faucet:bla to docker.io/aptoslabs/faucet:devnet
INFO: copying bla/tools:bla to bla/tools:devnet
INFO: copying bla/tools:bla to bla/tools:devnet
INFO: copying bla/tools:bla to bla.dkr.ecr.us-west-2.amazonaws.com/aptos/tools:devnet
INFO: copying bla/tools:bla to docker.io/aptoslabs/tools:devnet

Release with an aptos-node-vX.Y.Z tag. This releases aptos-node:

$ IMAGE_TAG_PREFIX=aptos-node-vX.Y.Z AWS_ACCOUNT_ID=bla GCP_DOCKER_ARTIFACT_REPO_US=bla GCP_DOCKER_ARTIFACT_REPO=bla GIT_SHA=bla ./docker/release-images.mjs --wait-for-image-seconds=3600 --dry-run        
Lockfile is up to date, resolution step is skipped
Already up to date
Done in 357ms
$ command -v crane
/opt/homebrew/bin/crane
INFO: dry run: true
INFO: image release group: aptos-node
INFO: image names to release: ["validator","validator-testing","faucet","tools"]
INFO: copying bla/validator:performance_bla to bla/validator:aptos-node-vX.Y.Z_performance
INFO: copying bla/validator:performance_bla to bla/validator:aptos-node-vX.Y.Z_performance
INFO: copying bla/validator:performance_bla to bla.dkr.ecr.us-west-2.amazonaws.com/aptos/validator:aptos-node-vX.Y.Z_performance
INFO: copying bla/validator:performance_bla to docker.io/aptoslabs/validator:aptos-node-vX.Y.Z_performance
INFO: copying bla/validator:bla to bla/validator:aptos-node-vX.Y.Z
INFO: copying bla/validator:bla to bla/validator:aptos-node-vX.Y.Z
INFO: copying bla/validator:bla to bla.dkr.ecr.us-west-2.amazonaws.com/aptos/validator:aptos-node-vX.Y.Z
INFO: copying bla/validator:bla to docker.io/aptoslabs/validator:aptos-node-vX.Y.Z
INFO: copying bla/validator-testing:performance_bla to bla/validator-testing:aptos-node-vX.Y.Z_performance
INFO: copying bla/validator-testing:performance_bla to bla/validator-testing:aptos-node-vX.Y.Z_performance
INFO: copying bla/validator-testing:performance_bla to bla.dkr.ecr.us-west-2.amazonaws.com/aptos/validator-testing:aptos-node-vX.Y.Z_performance
INFO: copying bla/validator-testing:bla to bla/validator-testing:aptos-node-vX.Y.Z
INFO: copying bla/validator-testing:bla to bla/validator-testing:aptos-node-vX.Y.Z
INFO: copying bla/validator-testing:bla to bla.dkr.ecr.us-west-2.amazonaws.com/aptos/validator-testing:aptos-node-vX.Y.Z
INFO: copying bla/faucet:bla to bla/faucet:aptos-node-vX.Y.Z
INFO: copying bla/faucet:bla to bla/faucet:aptos-node-vX.Y.Z
INFO: copying bla/faucet:bla to bla.dkr.ecr.us-west-2.amazonaws.com/aptos/faucet:aptos-node-vX.Y.Z
INFO: copying bla/faucet:bla to docker.io/aptoslabs/faucet:aptos-node-vX.Y.Z
INFO: copying bla/tools:bla to bla/tools:aptos-node-vX.Y.Z
INFO: copying bla/tools:bla to bla/tools:aptos-node-vX.Y.Z
INFO: copying bla/tools:bla to bla.dkr.ecr.us-west-2.amazonaws.com/aptos/tools:aptos-node-vX.Y.Z
INFO: copying bla/tools:bla to docker.io/aptoslabs/tools:aptos-node-vX.Y.Z

NEW: release with an aptos-indexer-grpc-vA.B.C tag. This releases indexer-grpc only:

$ IMAGE_TAG_PREFIX=aptos-indexer-grpc-vA.B.C AWS_ACCOUNT_ID=bla GCP_DOCKER_ARTIFACT_REPO_US=bla GCP_DOCKER_ARTIFACT_REPO=bla GIT_SHA=bla ./docker/release-images.mjs --wait-for-image-seconds=3600 --dry-run    
Lockfile is up to date, resolution step is skipped
Already up to date
Done in 353ms
$ command -v crane
/opt/homebrew/bin/crane
INFO: dry run: true
INFO: image release group: aptos-indexer-grpc
INFO: image names to release: ["indexer-grpc"]
INFO: copying bla/indexer-grpc:bla to bla/indexer-grpc:aptos-indexer-grpc-vA.B.C
INFO: copying bla/indexer-grpc:bla to bla/indexer-grpc:aptos-indexer-grpc-vA.B.C
INFO: copying bla/indexer-grpc:bla to bla.dkr.ecr.us-west-2.amazonaws.com/aptos/indexer-grpc:aptos-indexer-grpc-vA.B.C
INFO: copying bla/indexer-grpc:bla to docker.io/aptoslabs/indexer-grpc:aptos-indexer-grpc-vA.B.C
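
For readers who want to reason about the dry-run output above, the copy plan can be approximated with the sketch below. It is inferred only from the logged lines and is not the script's implementation: `planCopies`, `release`, `FEATURES_BY_IMAGE`, and `NOT_ON_DOCKERHUB` are hypothetical names, and the rules (a `performance` variant only for validator and validator-testing, no docker.io target for validator-testing) are assumptions read off the logs. The `copy` callback stands in for whatever performs the real image copy.

```javascript
// Hypothetical sketch inferred from the dry-run logs; not the real script.
// Images with an extra "performance" build variant (assumption from the logs).
const FEATURES_BY_IMAGE = {
  validator: ["performance", "default"],
  "validator-testing": ["performance", "default"],
};

// Images that never appear with a docker.io target in the logs (assumption).
const NOT_ON_DOCKERHUB = new Set(["validator-testing"]);

// Build the list of { from, to } copies for one release.
function planCopies({ images, tagPrefix, gitSha, sourceRepo, targetRepos }) {
  const plan = [];
  for (const image of images) {
    for (const feature of FEATURES_BY_IMAGE[image] ?? ["default"]) {
      const isDefault = feature === "default";
      const sourceTag = isDefault ? gitSha : `${feature}_${gitSha}`;
      const targetTag = isDefault ? tagPrefix : `${tagPrefix}_${feature}`;
      for (const repo of targetRepos) {
        if (repo.startsWith("docker.io/") && NOT_ON_DOCKERHUB.has(image)) continue;
        plan.push({
          from: `${sourceRepo}/${image}:${sourceTag}`,
          to: `${repo}/${image}:${targetTag}`,
        });
      }
    }
  }
  return plan;
}

// Log every planned copy; only perform it when dryRun is false.
function release(plan, { dryRun, copy }) {
  console.log(`INFO: dry run: ${dryRun}`);
  for (const { from, to } of plan) {
    console.log(`INFO: copying ${from} to ${to}`);
    if (!dryRun) copy(from, to);
  }
}

// Usage with the placeholder values from the test plan.
const plan = planCopies({
  images: ["indexer-grpc"],
  tagPrefix: "aptos-indexer-grpc-vA.B.C",
  gitSha: "bla",
  sourceRepo: "bla",
  targetRepos: [
    "bla", // GCP_DOCKER_ARTIFACT_REPO
    "bla", // GCP_DOCKER_ARTIFACT_REPO_US
    "bla.dkr.ecr.us-west-2.amazonaws.com/aptos",
    "docker.io/aptoslabs",
  ],
});
release(plan, { dryRun: true, copy: () => {} });
```

Because both GCP repository variables are set to the same placeholder in the test plan, the first two destinations are identical, which would explain the repeated lines in the logs.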

@rustielin rustielin requested review from a team as code owners October 25, 2023 00:20

@larry-aptos larry-aptos left a comment

neat!

@rustielin rustielin enabled auto-merge (squash) October 25, 2023 16:08

@github-actions

✅ Forge suite compat success on aptos-node-v1.7.2 ==> b19816fb6232b01b5165086ee9015606be515191

Compatibility test results for aptos-node-v1.7.2 ==> b19816fb6232b01b5165086ee9015606be515191 (PR)
1. Check liveness of validators at old version: aptos-node-v1.7.2
compatibility::simple-validator-upgrade::liveness-check : committed: 4401 txn/s, latency: 7190 ms, (p50: 7400 ms, p90: 8800 ms, p99: 12300 ms), latency samples: 171660
2. Upgrading first Validator to new version: b19816fb6232b01b5165086ee9015606be515191
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 1846 txn/s, latency: 15672 ms, (p50: 18700 ms, p90: 22100 ms, p99: 22300 ms), latency samples: 92320
3. Upgrading rest of first batch to new version: b19816fb6232b01b5165086ee9015606be515191
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 1777 txn/s, latency: 15926 ms, (p50: 19300 ms, p90: 22000 ms, p99: 22600 ms), latency samples: 92440
4. upgrading second batch to new version: b19816fb6232b01b5165086ee9015606be515191
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 3610 txn/s, latency: 8699 ms, (p50: 9900 ms, p90: 12200 ms, p99: 12500 ms), latency samples: 144400
5. check swarm health
Compatibility test for aptos-node-v1.7.2 ==> b19816fb6232b01b5165086ee9015606be515191 passed
Test Ok

@github-actions

✅ Forge suite realistic_env_max_load success on b19816fb6232b01b5165086ee9015606be515191

two traffics test: inner traffic : committed: 8404 txn/s, latency: 4673 ms, (p50: 4500 ms, p90: 5400 ms, p99: 11100 ms), latency samples: 3622460
two traffics test : committed: 100 txn/s, latency: 2176 ms, (p50: 2000 ms, p90: 2600 ms, p99: 6700 ms), latency samples: 1780
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.208, avg: 0.197", "QsPosToProposal: max: 0.185, avg: 0.146", "ConsensusProposalToOrdered: max: 0.633, avg: 0.598", "ConsensusOrderedToCommit: max: 0.525, avg: 0.489", "ConsensusProposalToCommit: max: 1.142, avg: 1.087"]
Max round gap was 1 [limit 4] at version 1877862. Max no progress secs was 4.33465 [limit 10] at version 1877862.
Test Ok

@rustielin rustielin merged commit 940f98a into main Oct 25, 2023
82 of 84 checks passed
@rustielin rustielin deleted the rustielin/release-grpc-separately branch October 25, 2023 16:47
@github-actions

✅ Forge suite framework_upgrade success on aptos-node-v1.7.2 ==> b19816fb6232b01b5165086ee9015606be515191

Compatibility test results for aptos-node-v1.7.2 ==> b19816fb6232b01b5165086ee9015606be515191 (PR)
Upgrade the nodes to version: b19816fb6232b01b5165086ee9015606be515191
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 6857 txn/s, latency: 4910 ms, (p50: 4800 ms, p90: 8100 ms, p99: 10000 ms), latency samples: 240020
5. check swarm health
Compatibility test for aptos-node-v1.7.2 ==> b19816fb6232b01b5165086ee9015606be515191 passed
Test Ok

bowenyang007 pushed a commit to bowenyang007/aptos-core that referenced this pull request Nov 22, 2023