Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sharded] use internal indexer ledger version in API context and add state indices restore #13883

Merged
merged 3 commits into from
Jul 25, 2024

Conversation

areshand
Copy link
Contributor

@areshand areshand commented Jul 1, 2024

Description

We should return the latest we get in the internal indexer even if there is a delay between the internal indexer and node main db.

For the delay between the internal indexer and main db, we will measure the gap in different scenarios and then decide when we should alert users of the delay.

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Performance improvement
  • Refactoring
  • Dependency update
  • Documentation update
  • Tests

Which Components or Systems Does This Change Impact?

  • Validator Node
  • Full Node (API, Indexer, etc.)
  • Move/Aptos Virtual Machine
  • Aptos Framework
  • Aptos CLI/SDK
  • Developer Infrastructure
  • Other (specify)

How Has This Been Tested?

  1. Existing API unit tests
  2. manual test running a sharded fullnode with internal indexer on
  • Tested API with
    ** curl localhost:8080/v1/accounts/0x1/resources
    ** curl localhost:8080/v1/accounts/0x1/transactions
    ** curl localhost:8080/v1/accounts/0x1/events/0
  • Test get start version with Fast sync
    ** panic happens as expected
    ** can handle hole created by fast sync

Key Areas to Review

Checklist

  • I have read and followed the CONTRIBUTING doc
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I identified and added all stakeholders and component owners affected by this change as reviewers
  • I tested both happy and unhappy path of the functionality
  • I have made corresponding changes to the documentation

Copy link

trunk-io bot commented Jul 1, 2024

⏱️ 17h 27m total CI duration on this PR
Job Cumulative Duration Recent Runs
test-fuzzers 6h 42m 🟩🟩🟩🟩🟩 (+6 more)
execution-performance / single-node-performance 2h 47m 🟩🟩🟩🟥🟩 (+1 more)
forge-e2e-test / forge 1h 28m 🟥🟩🟩🟩 (+2 more)
rust-images / rust-all 1h 9m 🟩🟩🟩🟩🟩
forge-compat-test / forge 1h 5m 🟩🟩🟩 (+1 more)
forge-framework-upgrade-test / forge 47m 🟩🟩🟩
execution-performance / test-target-determinator 25m 🟩🟩🟩🟩🟩 (+1 more)
test-target-determinator 22m 🟩🟩🟩🟩🟩 (+1 more)
check 21m 🟩🟩🟩🟩🟩 (+1 more)
general-lints 19m 🟩🟩🟩🟩🟩 (+5 more)
rust-move-tests 15m 🟩
check-dynamic-deps 13m 🟩🟩🟩🟩🟩 (+7 more)
rust-move-tests 12m 🟩
rust-move-tests 9m 🟩
rust-move-tests 9m 🟩
rust-move-tests 9m 🟩
rust-move-tests 9m 🟩
rust-move-tests 9m 🟩
rust-move-tests 9m 🟩
indexer-grpc-e2e-tests / test-indexer-grpc-docker-compose 8m 🟩🟩🟩🟩🟩
semgrep/ci 4m 🟩🟩🟩🟩🟩 (+6 more)
rust-move-tests 3m 🟩
rust-move-tests 2m
file_change_determinator 2m 🟩🟩🟩🟩🟩 (+6 more)
file_change_determinator 2m 🟩🟩🟩🟩🟩 (+6 more)
file_change_determinator 1m 🟩🟩🟩🟩🟩 (+1 more)
permission-check 43s 🟩🟩🟩🟩🟩 (+7 more)
permission-check 36s 🟩🟩🟩🟩🟩 (+6 more)
permission-check 32s 🟩🟩🟩🟩🟩 (+6 more)
permission-check 30s 🟩🟩🟩🟩🟩 (+7 more)
determine-docker-build-metadata 25s 🟩🟩🟩🟩🟩 (+1 more)
permission-check 15s 🟩🟩🟩🟩🟩 (+1 more)
rust-move-tests 1s

🚨 1 job on the last run was significantly faster/slower than expected

Job Duration vs 7d avg Delta
execution-performance / single-node-performance 30m 18m +69%

settingsfeedbackdocs ⋅ learn more about trunk.io

@msmouse
Copy link
Contributor

msmouse commented Jul 1, 2024

why? doesn't seem any harm to keep these checks?

@areshand
Copy link
Contributor Author

areshand commented Jul 1, 2024

why? doesn't seem any harm to keep these checks?

The API will constantly error out. The main use of the internal indexer are by these APIs https://github.com/aptos-labs/aptos-core/blob/internal_indexer_pruner/api/src/transactions.rs#L982. We should just return the latest version the internal indexer has https://aptos-org.slack.com/archives/C03N83P7QUC/p1719867754704079?thread_ts=1719867622.533329&cid=C03N83P7QUC

@areshand areshand force-pushed the get_latest_version_from_internal_indexer branch from a9c9677 to 8b7dd2d Compare July 2, 2024 18:40
@areshand areshand changed the title Remove ledger too new check [Sharded 7] Use internal indexer ledger version and info in API context Jul 2, 2024
@areshand areshand force-pushed the get_latest_version_from_internal_indexer branch 3 times, most recently from ba05c27 to 2616aa9 Compare July 3, 2024 03:42
@areshand areshand added the CICD:run-e2e-tests when this label is present github actions will run all land-blocking e2e tests from the PR label Jul 3, 2024
@areshand areshand force-pushed the get_latest_version_from_internal_indexer branch from 2616aa9 to d96bfde Compare July 3, 2024 04:07

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@areshand areshand force-pushed the get_latest_version_from_internal_indexer branch from d96bfde to d4f0141 Compare July 3, 2024 14:54

This comment has been minimized.

This comment has been minimized.

@areshand areshand enabled auto-merge (rebase) July 3, 2024 15:12
@areshand areshand changed the title [Sharded 7] Use internal indexer ledger version and info in API context [Sharded 7] Use internal indexer ledger version and add checking for determining start indexing version Jul 3, 2024

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@grao1991 grao1991 force-pushed the get_latest_version_from_internal_indexer branch 3 times, most recently from 441c7e9 to f144bf0 Compare July 24, 2024 22:24

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@grao1991 grao1991 force-pushed the get_latest_version_from_internal_indexer branch from f144bf0 to 127356e Compare July 24, 2024 23:35

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@grao1991 grao1991 force-pushed the get_latest_version_from_internal_indexer branch from 127356e to 26a2863 Compare July 25, 2024 00:26
@grao1991 grao1991 force-pushed the get_latest_version_from_internal_indexer branch from 26a2863 to e8b3e6b Compare July 25, 2024 00:32

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

Copy link
Contributor

✅ Forge suite realistic_env_max_load success on e8b3e6be6474638820878df1ffeff08101b99185

two traffics test: inner traffic : committed: 8986.926441104215 txn/s, submitted: 9248.668812942453 txn/s, failed submission: 0.10520191794141345 txn/s, expired: 261.7423718382367 txn/s, latency: 2749.750545797215 ms, (p50: 2700 ms, p90: 3300 ms, p99: 4400 ms), latency samples: 3417020
two traffics test : committed: 100.04995262204073 txn/s, latency: 2050.4735 ms, (p50: 2100 ms, p90: 2300 ms, p99: 5100 ms), latency samples: 2000
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.249, avg: 0.220", "QsPosToProposal: max: 1.304, avg: 0.494", "ConsensusProposalToOrdered: max: 0.337, avg: 0.299", "ConsensusOrderedToCommit: max: 0.416, avg: 0.398", "ConsensusProposalToCommit: max: 0.709, avg: 0.697"]
Max round gap was 1 [limit 4] at version 1864201. Max no progress secs was 5.68482 [limit 15] at version 1864201.
Test Ok

Copy link
Contributor

✅ Forge suite compat success on 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5 ==> e8b3e6be6474638820878df1ffeff08101b99185

Compatibility test results for 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5 ==> e8b3e6be6474638820878df1ffeff08101b99185 (PR)
1. Check liveness of validators at old version: 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5
compatibility::simple-validator-upgrade::liveness-check : committed: 6803.122401736363 txn/s, latency: 4551.339925558313 ms, (p50: 3600 ms, p90: 4800 ms, p99: 26100 ms), latency samples: 282100
2. Upgrading first Validator to new version: e8b3e6be6474638820878df1ffeff08101b99185
compatibility::simple-validator-upgrade::single-validator-upgrading : committed: 6325.187926217032 txn/s, latency: 4181.34156137888 ms, (p50: 4700 ms, p90: 5400 ms, p99: 5500 ms), latency samples: 128220
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 6797.721802960912 txn/s, latency: 4654.975373390558 ms, (p50: 4800 ms, p90: 6400 ms, p99: 7100 ms), latency samples: 233000
3. Upgrading rest of first batch to new version: e8b3e6be6474638820878df1ffeff08101b99185
compatibility::simple-validator-upgrade::half-validator-upgrading : committed: 7208.719902651851 txn/s, latency: 3817.7058920047734 ms, (p50: 4200 ms, p90: 4600 ms, p99: 4800 ms), latency samples: 134080
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 6562.187771765435 txn/s, latency: 4608.812252118459 ms, (p50: 4300 ms, p90: 7600 ms, p99: 7900 ms), latency samples: 228940
4. upgrading second batch to new version: e8b3e6be6474638820878df1ffeff08101b99185
compatibility::simple-validator-upgrade::rest-validator-upgrading : committed: 10117.655670415072 txn/s, latency: 2956.3459489993543 ms, (p50: 2800 ms, p90: 4600 ms, p99: 5900 ms), latency samples: 185880
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 8681.845943468283 txn/s, latency: 4191.259807824296 ms, (p50: 3300 ms, p90: 8200 ms, p99: 13800 ms), latency samples: 291400
5. check swarm health
Compatibility test for 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5 ==> e8b3e6be6474638820878df1ffeff08101b99185 passed
Test Ok

Copy link
Contributor

✅ Forge suite framework_upgrade success on 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5 ==> e8b3e6be6474638820878df1ffeff08101b99185

Compatibility test results for 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5 ==> e8b3e6be6474638820878df1ffeff08101b99185 (PR)
Upgrade the nodes to version: e8b3e6be6474638820878df1ffeff08101b99185
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1209.4016610514059 txn/s, submitted: 1211.4772931927985 txn/s, failed submission: 2.0756321413925733 txn/s, expired: 2.0756321413925733 txn/s, latency: 2559.5104881769644 ms, (p50: 1800 ms, p90: 4500 ms, p99: 10200 ms), latency samples: 104880
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1066.9926785448474 txn/s, submitted: 1068.9621071967965 txn/s, failed submission: 1.969428651949062 txn/s, expired: 1.969428651949062 txn/s, latency: 2825.487233388023 ms, (p50: 2100 ms, p90: 4800 ms, p99: 10200 ms), latency samples: 97520
5. check swarm health
Compatibility test for 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5 ==> e8b3e6be6474638820878df1ffeff08101b99185 passed
Upgrade the remaining nodes to version: e8b3e6be6474638820878df1ffeff08101b99185
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1158.1166891747685 txn/s, submitted: 1161.1950323774354 txn/s, failed submission: 3.0783432026669373 txn/s, expired: 3.0783432026669373 txn/s, latency: 2663.947114106702 ms, (p50: 2100 ms, p90: 4800 ms, p99: 8900 ms), latency samples: 105340
Test Ok

@grao1991 grao1991 merged commit 589066d into main Jul 25, 2024
47 checks passed
@grao1991 grao1991 deleted the get_latest_version_from_internal_indexer branch July 25, 2024 01:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CICD:build-images when this label is present github actions will start build+push rust images from the PR. CICD:run-e2e-tests when this label is present github actions will run all land-blocking e2e tests from the PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants