Skip to content
This repository has been archived by the owner on Jan 6, 2025. It is now read-only.

Proposed beacon state test format #21

Open
djrtwo opened this issue Feb 27, 2019 · 10 comments
Open

Proposed beacon state test format #21

djrtwo opened this issue Feb 27, 2019 · 10 comments

Comments

@djrtwo
Copy link
Contributor

djrtwo commented Feb 27, 2019

Test suite name

beacon_state

Test case format

config: <key/value pairs of phase 0 constants>
verify_signatures: <bool>
initial_state: <key/value pairs of fields of BeaconState>
blocks: <list of blocks>
expected_state: <key/value pairs of subset of BeaconState fields to state>
expected_state_root: <tree hash>

Format field notes

  • config [key/value]
    • all fields of phase 0 constants.
    • any missing config constant defaults to value found in phase 0 spec
  • verify_signatures (optional) [bool]
    • default to False
    • Any node that can produce a block should already have a flag enabling state transitions without verifying signatures. In most state tests containing valid transitions, this should be disabled.
    • Generally only enabled in state tests that fail due to invalid signatures.
  • initial_state [key/value]
    • all fields of BeaconState
  • blocks [list]
    • A list of blocks to be processed sequentially on top of the initial state
  • expected_state [key/value]
    • a subset of fields of BeaconState containing the expected values of the resulting state
  • expected_state_root (optional) [32-byte hex string]
    • hash_tree_root(state) after processing to the latest block in blocks

Example

title: Sample Slot Update State Test
summary: Tests slot updates
test_suite: beacon_state
fork: tchaikovsky
version: 1.0

test_cases:
- config:
    SHARD_COUND: 16
    TARGET_COMMITTEE_SIZE: 16
    GENESIS_SLOT: 50
    GENESIS_EPOCH: 10
    MIN_ATTESTATION_INCLUSION_DELAY: 1
    SLOTS_PER_EPOCH: 5
    LATEST_RANDAO_MIXES_LENGTH: 20
    LATEST_BLOCK_ROOTS_LENGTH: 20
    LATEST_ACTIVE_INDEX_ROOTS_LENGTH: 20
    LATEST_SLASHED_EXIT_LENGTH: 20
  initial_state:
    slot: 50
    genesis_time: 0
    fork:
      previous_version: 0
      current_version: 0
      epoch: 10
    validator_registry:
      - pubkey: '0x424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242'
        withdrawal_credentials: '0x0000000000000000000000000000000000000000000000000000000000000000'
        activation_epoch: 10
        exit_epoch: 100000000
        withdrawable_epoch: 10000000000
        initiated_exit: False
        slashed: False
      - pubkey: '0x606060606060606060606060606060606060606060606060606060606060606060606060606060606060606060606060'
        withdrawal_credentials: '0x0000000000000000000000000000000000000000000000000000000000000001'
        activation_epoch: 10
        exit_epoch: 100000000
        withdrawable_epoch: 10000000000
        initiated_exit: False
        slashed: False
    validator_balances: [32000000000, 32000000000]
    validator_registry_update_epoch: 10
    latest_randao_mixes: ['0x0000000000000000000000000000000000000000000000000000000000000000', '0x0000000000000000000000000000000000000000000000000000000000000000', ..., '0x0000000000000000000000000000000000000000000000000000000000000000']
    previous_shuffling_start_shard: 0
    current_shuffling_start_shard: 0
    previous_shuffling_epoch: 10
    current_shuffling_epoch: 10
    previous_shuffling_seed: '0x0000000000000000000000000000000000000000000000000000000000000000'
    current_shuffling_seed: '0x0000000000000000000000000000000000000000000000000000000000000000'
    previous_justified_epoch: 10
    justified_epoch: 10
    justification_bitfield: 0
    finalized_epoch: 10
    latest_crosslinks:
      - epoch: 10
        crosslink_data_root: '0x0000000000000000000000000000000000000000000000000000000000000000'
      - epoch: 10
        crosslink_data_root: '0x0000000000000000000000000000000000000000000000000000000000000000'
      ...
      - epoch: 10
        crosslink_data_root: '0x0000000000000000000000000000000000000000000000000000000000000000'
    batched_block_roots: []
    latest_eth1_data:
      deposit_root: '0x0000000000000000000000000000000000000000000000000000000000000000'
      block_hash: '0x0000000000000000000000000000000000000000000000000000000000000000'
    deposit_index: 0
  verify_signatures: False
  blocks:
    - slot: 51
      parent_root: '0x0000000000000000000000000000000000000000000000000000000000000000'
      randao_reveal: '0x0000000000000000000000000000000000000000000000000000000000000000'
      eth1_data:
        deposit_root: '0x0000000000000000000000000000000000000000000000000000000000000000'
        block_hash: '0x0000000000000000000000000000000000000000000000000000000000000000'
      body:
        proposer_slashings: []
        attester_slashings: []
        attestations: []
        deposits: []
        voluntary_exits: []
        transfers: []
      signature: '0x000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000'
  expected_state:
    slot: 51

Notes

  • By having expected_state_root as optional and expected_state allowed to specify just a subset of fields, we use this format for very narrowly targeted sets of tests (like the above update slot test, or maybe a deposit processing test) as well as for full blown state tests that test very complex transitions.
  • The above format assumes that the block parent_root will not be assessed for validity. Could just leave it out and handle locally in each client as makes sense.
  • The above format assumes that proposer, attestation, and randao reveal signatures are not to be verified when verify_signatures == False.
  • We might specify "defaults" for fields across most objects to reduce the verbosity of tests. For example, Validator epoch fields should all just default to FAR_FUTURE_EPOCH unless explicitly specified otherwise. Block body arrays might just default to empty unless specified. etc.
  • Should probably be able to specify an invalid state transition. Maybe an is_invalid: True field.
@djrtwo djrtwo changed the title Proposed state test format Proposed beacon state test format Feb 27, 2019
@jannikluhn
Copy link
Contributor

What's a bit unclear to me is if this type of test is for block processing or for state transitions in general. I think it would be useful to have tests specific for slot and epoch processing that don't contain any blocks, and to have tests for block processing without slot or epoch processing. In addition, full tests make sense as well, to check if slots/blocks/epochs are processed in the right order, but we probably need much fewer of those as most complexity lies in block processing. But for full tests we should specify first and last processed slot and epoch as well. This could be implicitly defined (e.g. we just say the test runner has to process all empty slots between the initial state slot and the block slots, and do potential epoch transitions up until the one directly following the last block).

One concern I have is that some of the tests could get unnecessarily verbose. Defaults for block and state fields as you mentioned would help a lot already. In addition to that, Justin proposed reduced constant sets for testing which could shorten the config section. Maybe we can replace config with something like config_base and config_changes: config_base would be an identifier of the constant set (e.g. main, simple, testnet_abc, ...) and config_changes would be the (optional and usually unnecessary) dictionary of changes to the base.

Only in state tests that fail due to invalid signatures should this be enabled.

Just a minor side note, but I think we should have at least a few full tests with valid signatures that are checked. Otherwise we might get false positives (or false negatives? I can never tell), i.e. tests pass even though the clients signature verification is broken.

@mratsim
Copy link

mratsim commented Feb 27, 2019

Only in state tests that fail due to invalid signatures should this be enabled.

Just a minor side note, but I think we should have at least a few full tests with valid signatures that are checked. Otherwise we might get false positives (or false negatives? I can never tell), i.e. tests pass even though the clients signature verification is broken.

I think we should distinguish unit tests for logical components and integration tests for the whole machinery as debugging the whole machinery is quite complex and it helps a lot to target small pieces at a time.

Also one thing I'm afraid of is the timing with state tests:

  • 64 slots per epoch is quite long for testing, in nimbus we use 8 slots per epochs during development
  • debug mode: right now 6 seconds slot is barely enough to process the state in debug mode in Nim (which is a fast language) so we compile with optimizations enabled (but still stacktraces, range checks ...)

@djrtwo
Copy link
Contributor Author

djrtwo commented Mar 13, 2019

is if this type of test is for block processing or for state transitions in general

A state transition is (pre_state, block) -> post_state where multiple things happen under the hood. In that respect. I'd like these state tests to be block by block based. I see some value in pulling out components of the state transition (epoch-transition, slot-transition, etc) but the more we go granular here, the more testing machinery we are requiring of each client and the more specific interface within the state transition we are requiring as well.

I have a series of state tests using the above format ready to release with the next version of the spec (v0.5.0). The tests are quite verbose but adding defaults for constants or constants sets aren't going to save us much space. It's the pre_state (specifically the cache arrays) that really get us.

I think we should distinguish unit tests for logical components and integration tests for the whole machinery

The tests I currently have do so in specifying just a subset of the expected_state to test against (for example testing that processing an block at the next slot increments the slot rather than specifying the entirety of the state).

Do you have something more granular in mind @mratsim ?

Just a minor side note, but I think we should have at least a few full tests with valid signatures that are checked

agreed


I'll share the tests I've generated from the executable python spec soon

@paulhauner
Copy link
Contributor

paulhauner commented Mar 14, 2019

It's pretty minor, but I'd like it if all the "config" values were in lower case.

All of these variables are lower-case in my structs and as far as my YAML parser is concerned, SHARD_COUNT != shard_count.

I suspect this will be the case for other clients too, considering that it'll need to be a variable if it's set from YAML and variables are generally lower-case.

No big deal though, I can devise a work-around.

Thanks for you efforts on it :)

@djrtwo
Copy link
Contributor Author

djrtwo commented Mar 14, 2019

ah, interesting. standard in python is that constants are capitalized so my config expects caps :)

The vectors I'm releasing right now are capitalized. I'm down to switch to whichever is preferred by most languages.

@paulhauner
Copy link
Contributor

I'm just calling lower_string on the entire YAML for now!

Case-insensitive blockchains are the future, IMO.

@protolambda
Copy link

protolambda commented Mar 15, 2019

For the Go executable spec it is a little bit more complicated to make this config design work:

The goals of the Go version are:

  • compile time guarantees, no need to deal with unnecessary edge cases during run-time.
  • define arrays by their size, not as slices that are actively maintained
  • sizes are part of types
  • clean optimization: if the compiler can work with it, it's better/faster

Now I've implemented a spec-test runner, but the current implementation, using ldflags to inject compile time variable changes from the config, doesn't work. It doesn't override constants, only variables, which don't offer these features. What I can do however is to implement it as a build constraint: simply refer to a different definition of the constants. But this is not something you would want to do for every single test case.

And I imagine that many others would prefer to just make it part of the build, instead of dealing with all the extra complexity (ensure lengths everywhere, initialize things, unclear types)

Given that the "config" section is only really used to make tests more efficient, can't we just agree on a few build presets?

E.g.:
phase0.yaml:

SHARD_COUNT: 1024
TARGET_COMMITTEE_SIZE: 128
...

minimal.yaml:

SHARD_COUNT: 8
SLOTS_PER_EPOCH: 5
LATEST_RANDAO_MIXES_LENGTH: 20
LATEST_BLOCK_ROOTS_LENGTH: 20
LATEST_ACTIVE_INDEX_ROOTS_LENGTH: 20
LATEST_SLASHED_EXIT_LENGTH: 20
...

giant.yaml:

SHARD_COUNT: 32768
SLOTS_PER_EPOCH: 128
...

And a few more.

And then simply reference these in all the test suite cases, instead of duplicating the config contents:

- config: minimal
  initial_state:
...
  expected_state:
    slot: 51

And then we can have the best of both worlds (compile time guarantees and testing) :)

Edit: alternatively, we could also list these configs by their name in a special configs.yaml

@paulhauner
Copy link
Contributor

And I imagine that many others would prefer to just make it part of the build, instead of dealing with all the extra complexity (ensure lengths everywhere, initialize things, unclear types)

This has been raised several times in Lighthouse: do we make the constants constant or keep them as variables? Currently we keep all the spec "constants" as variables in a ChainSpec struct.

We prefer variables for two main reasons:

  • Flexibility during the development process.
  • Future support for testnets without recompiling.

We've come to the decision that constants don't give us significantly more safety (we can quite reasonably make assumptions thatChainSpec is sane and consistent throughout the program). We acknowledge that passing the ChainSpec around is annoying, however not annoying enough to sacrifice flexibility or UX.

Personally, I'd vote to keep all constants variable (lol) because defining scenarios is (a) more work and (b) less flexible. @protolambda I understand that you're in a different situation though, you need to prioritize readability above (almost) all else -- I wouldn't complain if defining scenarios was implemented, just sharing my POV :)

@protolambda
Copy link

The two options are not exclusive.

The config scenarios idea boils down to the following:

  • are defined separately from the test cases (either in separate files, or a separate configs.yaml)
    • no duplicate configs
    • easier to maintain, smaller changes
  • are referenced by the test cases
    • state test cases are leaner
  • can be converted to build-time constraints
    • each config can be converted to a config_<name>.<cpp/go/etc>. and used as substitute during compile time. Makes the minimal spec implementations easier; readability, compile-time guarantees, and no runtime config management.
  • can optionally be interpreted during run-time/test-time to change settings, if you choose for "variable constants" in your implementation.
    • tests can still work with them; you even have to do less parsing for many small tests
    • possibly useful during run-time changes (i.e. forks): you can use one config for slot < X and change to the next for slot >= X

@protolambda
Copy link

Also, I would like the state test format to support the definition of a test-case for a sub-state-transition. Each of these transitions already supports a generalized in/output, the same as the big block/epoch transition itself:

(state)->state for epoch transitions, and parts of it:

  • Eth1
  • Justification
  • Crosslinks
  • Rewards and Penalties
  • Ejections
  • Validator Registry
  • Slashings
  • Exit Queue
  • Finish

(state,block)->state for block transitions, and parts of it:

  • Header
  • Randao
  • Eth1
  • Proposer Slashings
  • Attester Slashings
  • Attestations
  • Deposits
  • Voluntary Exits
  • Transfers

Or alternatively, we create a separate format or two for this, since state sub-transitions don't have block inputs, and block transitions only have 1 block.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants