
2024 10 09 Testnet Rollback and Restart

steviez edited this page Oct 15, 2024 · 7 revisions

As of 2024-10-14 22:51 UTC testnet is back online. The instructions below are no longer relevant, but you will need to update your shred version and start normally:

--expected-shred-version 10323 \

This testnet restart is NOT urgent. Follow these instructions when you have time, but don’t skip sleep or disrupt other plans for this.

Summary

To test upgrading from v1.18 to v2.0, we would like to roll testnet back to the same version as mainnet-beta and disable the v2.0 feature gates. Rolling back the v2.0 feature gates requires shutting testnet down and restarting it from snapshots.

Attribute             Value
Validator version     Agave: v1.18.26
Snapshot slot         296_876_255
Restart slot          296_876_256
Shred version         10323
Expected bank hash    Ea1SMrWMbGnCiYV7cAZP3gWTeFeu5UkEBb7oDp8VKaLr

Step 1: Stop the validator process if you haven’t already

Step 2: Install Agave v2.0.13

This is necessary in order to create the correct snapshot in step 3.

Agave: agave-install init v2.0.13

Frankendancer: Please switch to Agave for this restart and the subsequent upgrade/downgrade cycle.

Step 3: Create snapshot

IMPORTANT: If you have an incremental snapshot for slot 296876256 from the previous restart attempt, move it out of your snapshot directory before creating the new snapshot.

This command creates a snapshot while removing the activated v2.0 feature gate accounts and the sysvar for partitioned epoch rewards.

agave-ledger-tool --ledger <ledger-path> create-snapshot \
--incremental \
--snapshot-archive-path <snapshot-path> \
--hard-fork 296876255 \
--deactivate-feature-gate \
9bn2vTJUsUcnpiZWbu2woSKtTGW3ErZC9ERv88SDqQjK \
ed9tNscbWLYBooxWA7FE2B5KHWs8A6sxfY8EzezEcoo \
EenyoWx9UMXYKpR8mW5Jmfmy2fRjzUtM7NduYMY8bx33 \
--remove-account \
SysvarEpochRewards1111111111111111111111111 \
--enable-capitalization-change \
-- 296876255 <snapshot-path>

The output should include this at (or near) the end:

    Successfully created snapshot for slot 296876256, hash Ea1SMrWMbGnCiYV7cAZP3gWTeFeu5UkEBb7oDp8VKaLr: /home/sol/ledger-snapshots/incremental-snapshot-<BASE_SLOT>-296876256-<SNAPSHOT_HASH>.tar.zst
    Shred version: 10323

Note that each operator's snapshot file name may contain a different base slot number and hash, but:

  • the bank hash should be Ea1SMrWMbGnCiYV7cAZP3gWTeFeu5UkEBb7oDp8VKaLr
  • the second slot number should be 296876256
  • the shred version should be 10323
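The three checks above can be scripted. The following is a minimal sketch, assuming the agave-ledger-tool output has been saved to a file or is piped in; the function name check_snapshot_output is hypothetical, and the patterns only match the output format shown above.

```shell
# Values taken from the table at the top of this page.
EXPECTED_SLOT=296876256
EXPECTED_HASH=Ea1SMrWMbGnCiYV7cAZP3gWTeFeu5UkEBb7oDp8VKaLr
EXPECTED_SHRED=10323

# Hypothetical helper: reads the create-snapshot output on stdin and checks
# the slot, bank hash, and shred version it reports.
check_snapshot_output() {
    output=$(cat)
    # Fixed-string match on the "Successfully created snapshot" line.
    printf '%s\n' "$output" | grep -qF \
        "Successfully created snapshot for slot ${EXPECTED_SLOT}, hash ${EXPECTED_HASH}:" || {
        echo "slot or bank hash mismatch"
        return 1
    }
    # Whole-line match on the shred version, allowing leading whitespace.
    printf '%s\n' "$output" | grep -qx "[[:space:]]*Shred version: ${EXPECTED_SHRED}" || {
        echo "shred version mismatch"
        return 1
    }
    echo "snapshot output OK"
}
```

For example, if the output was saved to a file named create-snapshot.log (an assumed name), run `check_snapshot_output < create-snapshot.log`.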

Once you have created the snapshot, move all other snapshots to a backup directory so that your snapshot directory contains one full snapshot and one incremental snapshot. The <BASE_SLOT> in these two filenames should match.

snapshot-<BASE_SLOT>-<BASE_SNAPSHOT_HASH>.tar.zst
incremental-snapshot-<BASE_SLOT>-296876256-<SNAPSHOT_HASH>.tar.zst
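The directory layout described above can be checked before moving on. This is a sketch only: check_snapshot_dir is a hypothetical helper that parses the two filename patterns shown above and assumes no other naming scheme is in use.

```shell
RESTART_SLOT=296876256

# Hypothetical helper: verifies that a directory holds one full and one
# incremental snapshot archive with matching base slots.
check_snapshot_dir() {
    dir=$1
    full_count=0
    incr_count=0
    for path in "$dir"/snapshot-*.tar.zst; do
        [ -e "$path" ] || continue          # glob matched nothing
        name=${path##*/}
        rest=${name#snapshot-}
        full_base=${rest%%-*}               # <BASE_SLOT> of the full snapshot
        full_count=$((full_count + 1))
    done
    for path in "$dir"/incremental-snapshot-*.tar.zst; do
        [ -e "$path" ] || continue
        name=${path##*/}
        rest=${name#incremental-snapshot-}
        incr_base=${rest%%-*}               # <BASE_SLOT> of the incremental
        rest=${rest#*-}
        incr_slot=${rest%%-*}               # slot the incremental was taken at
        incr_count=$((incr_count + 1))
    done
    if [ "$full_count" -ne 1 ]; then
        echo "expected 1 full snapshot, found $full_count"; return 1
    fi
    if [ "$incr_count" -ne 1 ]; then
        echo "expected 1 incremental snapshot, found $incr_count"; return 1
    fi
    if [ "$full_base" != "$incr_base" ]; then
        echo "base slot mismatch: $full_base vs $incr_base"; return 1
    fi
    if [ "$incr_slot" != "$RESTART_SLOT" ]; then
        echo "incremental snapshot is for slot $incr_slot, not $RESTART_SLOT"; return 1
    fi
    echo "snapshot directory OK (base slot $full_base)"
}
```

Call it with your snapshot directory, for example `check_snapshot_dir <snapshot-path>`.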

If you fail to create a snapshot, see the appendix for possible fixes.

Step 4: Install Agave v1.18.26

Agave: agave-install init v1.18.26

Frankendancer: Please switch to Agave for this restart and the subsequent upgrade/downgrade cycle.

Step 5: Update startup config and start your validator

Agave

Add these arguments to your validator startup script:

--wait-for-supermajority 296876256 \
--expected-shred-version 10323 \
--expected-bank-hash Ea1SMrWMbGnCiYV7cAZP3gWTeFeu5UkEBb7oDp8VKaLr \

As it starts, the validator will load the snapshot for slot 296876256 and wait for 80% of the stake to come online before producing/validating new blocks.
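Before starting the validator, it may help to confirm that all three arguments made it into the startup script. The sketch below is a plain text search, so it will also match lines that are commented out; check_restart_flags is a hypothetical name, and the script path is whatever your setup uses.

```shell
# Hypothetical helper: reports any of the three restart arguments that are
# missing from the given startup script.
check_restart_flags() {
    script=$1
    status=0
    for flag in \
        "--wait-for-supermajority 296876256" \
        "--expected-shred-version 10323" \
        "--expected-bank-hash Ea1SMrWMbGnCiYV7cAZP3gWTeFeu5UkEBb7oDp8VKaLr"
    do
        # -F: fixed string, --: stop option parsing so the flag is a pattern.
        if ! grep -qF -- "$flag" "$script"; then
            echo "missing: $flag"
            status=1
        fi
    done
    return $status
}
```

The function prints nothing and returns 0 when every argument is present.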

To confirm your restarted validator is correctly waiting for 80% of the stake, look for this periodic log message:

INFO  solana_core::validator] Waiting for 80% of activated stake at slot 296876256 to be in gossip...

And if you have RPC enabled, ask it for the current slot:

solana --url http://127.0.0.1:8899 slot

Any number other than 296876256 means you did not complete the steps correctly.
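The comparison can be wrapped in a small helper. This is a sketch; check_restart_slot is a hypothetical name, and it takes the reported slot as an argument so that it works with any way of obtaining the value.

```shell
EXPECTED_SLOT=296876256

# Hypothetical helper: compares a reported slot against the restart slot.
check_restart_slot() {
    reported=$1
    if [ "$reported" = "$EXPECTED_SLOT" ]; then
        echo "OK: validator is holding at slot $reported"
    else
        echo "MISMATCH: got slot '$reported', expected $EXPECTED_SLOT"
        return 1
    fi
}

# Usage against a local RPC port, reusing the command shown above:
#   check_restart_slot "$(solana --url http://127.0.0.1:8899 slot)"
```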

Once the validator has started, you should see log entries showing the “active stake” visible in gossip and the “waiting for 80% of stake” message. You can track these to see how the stake progresses.


Appendix (use this only if Step 3 failed)

If you get an error like this:

Error: Slot 296876255 is not available

Or this:

Unable to process blockstore from starting slot <slot> to 296876255; the ending slot is less than the starting slot. The starting slot will be the latest snapshot slot, or genesis if the --no-snapshot flag is specified or if no snapshots are found.

Your snapshots directory contains a snapshot for a slot >296876255. If you also have a snapshot for a slot <=296876255, move the snapshots for slots >296876255 to a backup directory and run the agave-ledger-tool command again. If you do not have a snapshot for a slot <=296876255, you will need to download a snapshot.
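To find which archives are for slots >296876255, the slot can be read from each filename. The sketch below assumes the snapshot-<SLOT>-<HASH>.tar.zst and incremental-snapshot-<BASE_SLOT>-<SLOT>-<HASH>.tar.zst naming shown earlier; list_snapshots_above is a hypothetical helper that only prints names and moves nothing.

```shell
MAX_SLOT=296876255

# Hypothetical helper: prints the archives in a directory whose slot is
# greater than MAX_SLOT.
list_snapshots_above() {
    dir=$1
    for path in "$dir"/snapshot-*.tar.zst "$dir"/incremental-snapshot-*.tar.zst; do
        [ -e "$path" ] || continue            # glob matched nothing
        name=${path##*/}
        case $name in
            incremental-snapshot-*)
                rest=${name#incremental-snapshot-}
                rest=${rest#*-}               # drop the base slot
                ;;
            *)
                rest=${name#snapshot-}
                ;;
        esac
        slot=${rest%%-*}
        # Skip names whose slot field is not a plain number.
        case $slot in ''|*[!0-9]*) continue ;; esac
        if [ "$slot" -gt "$MAX_SLOT" ]; then
            echo "$name"
        fi
    done
}

# Usage, with <snapshot-path> and <backup-dir> as placeholders:
#   list_snapshots_above <snapshot-path> | while read -r f; do
#       mv "<snapshot-path>/$f" <backup-dir>/
#   done
```

Review the printed list before moving anything, since an incremental snapshot is only usable together with the full snapshot for its base slot.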

If you successfully created a snapshot, resume the instructions above starting at Step 4. If you are unable to create a snapshot, follow the instructions below on downloading a snapshot.

If you couldn’t produce your snapshot locally, follow these appendix steps

Step 1: Download a snapshot from a known validator

If you are unable to generate a snapshot locally for slot 296876256, you will need to download one from a known validator. Add these lines to your startup script:

--known-validator 5D1fNXzvv5NjV1ysLjirC4WY92RNsVH18vjmcszZd8on \
--expected-shred-version 10323 \

Remove the flag --no-snapshot-fetch from your startup script if it is present.

Step 2: After download, restart

Verify that you have a new snapshot in your snapshot directory. Once the snapshot has finished downloading, stop your validator process.

Add the flag --no-snapshot-fetch to your startup script.

Resume the instructions above starting at Step 4.
