Update documentation to clarify terminology around block verification #6681

mpguerra · 2023-05-15T09:17:14Z

Motivation

The verification parts of the request, state service, and state consensus rule checks use incorrect terms that are very confusing for developers and auditors.

The storage parts use those terms correctly and should be left alone.

See the comments of this ticket for potential changes.

Background Notes

Here is some of the confusion in this auditor note that we should resolve in our documentation:

Checkpoint vs Finalized

Note that a checkpointed/finalized block can be a settled network upgrade or a block beyond the rollback/reorg limit. The state service processes the finalized block and queues it to be committed to state

This is the core of the confusion:

- a settled network upgrade is specified as a network upgrade that has "social consensus": it has enough confirmations on enough nodes that it can't be rolled back
- checkpointed blocks are a Zebra implementation detail: every release, we generate checkpoints for all finalized blocks in the chain, because those blocks can't be rolled back by Zebra, and are very unlikely to be rolled back by zcashd. This includes settled network upgrades, but it also goes beyond them to include almost all settled blocks.
- finalized blocks can come from checkpoints, or from a contextually verified block which is beyond the rollback limit
- the state service processes:
  - checkpoint verified blocks direct to the finalized state
  - contextually verified blocks into the non-finalized state, and then into the finalized state once they reach the rollback limit (or if they are on a side chain that will never be committed to the finalized state, they get dropped)
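The two paths above can be sketched with a toy model. This is only an illustration, not Zebra's code: `ToyState`, `Verified`, and the tiny `ROLLBACK_LIMIT` are hypothetical names and values, blocks are reduced to their heights, and side chains (which would be dropped) are left out.

```rust
use std::collections::VecDeque;

/// Hypothetical rollback limit for this sketch; not Zebra's actual constant.
const ROLLBACK_LIMIT: usize = 3;

/// How a block was verified before it reached the state (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verified {
    Checkpoint,
    Contextual,
}

/// A toy state that tracks block heights on a single chain, with no forks.
#[derive(Debug, Default)]
pub struct ToyState {
    pub finalized: Vec<u32>,
    pub non_finalized: VecDeque<u32>,
}

impl ToyState {
    /// Commit a block height according to how it was verified.
    pub fn commit(&mut self, height: u32, how: Verified) {
        match how {
            // Checkpoint verified blocks go directly to the finalized state.
            Verified::Checkpoint => self.finalized.push(height),
            // Contextually verified blocks wait in the non-finalized state
            // until they are beyond the rollback limit.
            Verified::Contextual => {
                self.non_finalized.push_back(height);
                while self.non_finalized.len() > ROLLBACK_LIMIT {
                    if let Some(oldest) = self.non_finalized.pop_front() {
                        self.finalized.push(oldest);
                    }
                }
            }
        }
    }
}
```

In this model a block only ever moves in one direction: once it is in `finalized`, nothing removes it, which is the property that makes finalized blocks safe to treat as settled.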

Verified Block Cache Behaviour

(finalized blocks can be optionally saved to disk.)

All valid finalized blocks are saved to disk; that's what finalization means in Zebra. Saving to disk is not under user control, and it's not a choice the application makes ~~automatically~~ dynamically. The only thing that can stop a finalized block being saved to disk is invalid data, which causes a verification failure in the block commitment, authorization, or chain history roots.

Non-finalized blocks are saved to disk once they are beyond the rollback limit, if they are part of the best chain. Blocks that are beyond the rollback limit but not part of the best chain are dropped.

User Confusion about Caching

There's also some potential user confusion here between:

thin clients: programs that don't download or store the whole chain, and skip some validation
full nodes: programs that download the whole chain, fully validate it, and store it

We've had multiple requests for Zebra to become a thin client, so it's important for us to avoid confusion in our user documentation.

The ephemeral config only controls state cleanup behaviour; it doesn't impact block validation or storage. Regardless of its setting, Zebra still functions correctly as a fully validating node. (It just doesn't have a cache when it next starts up, which is an entirely valid state.)

Semantically Verified vs Non-Finalized

the state service will validate and commit the non-finalized queued blocks whose parents have recently arrived

To do contextual validation, the state service will:

queue blocks whose ancestors have not arrived in the state yet
send queued blocks to the block write task after all their ancestors have been sent
validate blocks as they arrive in the block write task, as long as their ancestors were valid
commit valid blocks to the non-finalized state

Breadth-first order is not guaranteed, because a lower height block from another fork can arrive at any time. Forks are allowed from the finalized tip, and any non-finalized block.
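The queueing steps above can be sketched as a map from a missing parent to the blocks waiting on it. This is a simplified illustration under stated assumptions, not the state service's implementation: `ParentQueue` and the integer `Hash` type are hypothetical, validation itself is omitted, and there is no timeout for blocks whose parents never arrive.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical block identifier for this sketch; real block hashes are 32 bytes.
type Hash = u32;

/// Holds blocks until their parent has been sent to the block write task.
#[derive(Debug, Default)]
pub struct ParentQueue {
    /// Blocks already sent on to the (hypothetical) block write task.
    sent: HashSet<Hash>,
    /// Waiting blocks, keyed by the parent hash they are waiting for.
    waiting: HashMap<Hash, Vec<Hash>>,
}

impl ParentQueue {
    /// Start with a single known block, such as the finalized tip.
    pub fn new(tip: Hash) -> Self {
        let mut sent = HashSet::new();
        sent.insert(tip);
        Self { sent, waiting: HashMap::new() }
    }

    /// Submit `block` with the given `parent`. Returns every block that can
    /// now be sent, with each block listed after its parent.
    pub fn submit(&mut self, block: Hash, parent: Hash) -> Vec<Hash> {
        if !self.sent.contains(&parent) {
            // The parent has not arrived yet, so the block waits in the queue.
            self.waiting.entry(parent).or_default().push(block);
            return Vec::new();
        }

        let mut ready = Vec::new();
        let mut pending = vec![block];
        while let Some(next) = pending.pop() {
            self.sent.insert(next);
            ready.push(next);
            // Sending this block may unblock queued children, on any fork.
            if let Some(children) = self.waiting.remove(&next) {
                pending.extend(children);
            }
        }
        ready
    }
}
```

Because children are keyed by parent hash rather than by height, a lower-height block from another fork can be submitted at any time without disturbing blocks that are already queued.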

For the first non-finalized block to be validated there must be at least 28 blocks in the finalized state.

The 28 most recent ancestor blocks must have been validated and committed by the state, regardless of whether they are finalized or non-finalized.

If an honest node fails to obtain enough finalized blocks from its peers during syncing, an attacker could send it a non-finalized block to verify, which will fail this assertion and crash the Zebra node.

This is incorrect, due to the confusion above.

Verification Order

The Zebra implementation chooses to verify blocks using checkpoints or semantic/contextual verification, based on their heights and its checkpoints. Checkpoint verification happens in strict height order, in both zebra-consensus and zebra-state. There is only one chain fork when checkpointing.

Semantic verification happens in parallel in the zebra-consensus verifiers, but contextual verification happens in chain order in zebra-state. Each chain fork is verified in height order, and different forks can have different tip heights. (Different chain forks could also be contextually verified in parallel, but Zebra doesn't implement that.)
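The choice between the two verification paths can be sketched as a comparison against the highest checkpoint. This is a minimal illustration: `Verifier`, `choose_verifier`, and the rule that every block at or below the highest checkpoint height is checkpoint verified are assumptions for the sketch, not a description of Zebra's actual router.

```rust
/// Which verification path a block takes (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
pub enum Verifier {
    /// Verified against checkpoints, in strict height order.
    Checkpoint,
    /// Semantically verified in parallel, then contextually verified in chain order.
    SemanticThenContextual,
}

/// Pick a verifier from the block height and the highest checkpoint height.
/// Assumption: blocks at or below the highest checkpoint use checkpoints.
pub fn choose_verifier(height: u32, max_checkpoint_height: u32) -> Verifier {
    if height <= max_checkpoint_height {
        Verifier::Checkpoint
    } else {
        Verifier::SemanticThenContextual
    }
}
```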

Guaranteed Chain Context

So blocks are not validated by the state until all their ancestors have been validated. If a contextually verified block's ancestors aren't validated, it will wait in the queue and time out, or be rejected with an error when it is received by the block write task. So validate_and_commit_non_finalized() won't be called unless a block has at least 28 ancestors, because it is only called during contextual validation. And before contextual validation starts, Zebra commits at least 1 million blocks using checkpoints.

No Cross-Module Dependencies

In most cases, the syncer or gossip downloaders will reject blocks that are too far in front of the tip. This prevents CPU and memory denial of service. But peers can't make Zebra panic by submitting blocks out of order, even if they convince these services to submit those blocks to the state. If that happens, the state will time out or reject the block.
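The downloader-side check described above can be sketched as a simple height comparison. This is an illustration only: `Admission`, `admit`, and the `LOOKAHEAD_LIMIT` value are hypothetical and are not taken from the syncer or gossip downloader code.

```rust
/// Hypothetical limit on how far ahead of the tip a block may be; not Zebra's value.
const LOOKAHEAD_LIMIT: u32 = 100;

/// The downloader's decision for a block (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
pub enum Admission {
    Accept,
    TooFarAhead,
}

/// Reject blocks that are too far in front of the current tip, to bound
/// the CPU and memory spent on blocks that can't be verified yet.
pub fn admit(tip_height: u32, block_height: u32) -> Admission {
    if block_height > tip_height.saturating_add(LOOKAHEAD_LIMIT) {
        Admission::TooFarAhead
    } else {
        Admission::Accept
    }
}
```

A check like this is a resource limit, not a correctness requirement: if a block gets past it anyway, the state still times out or rejects the block instead of panicking.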

Originally posted by @teor2345 in #6620 (comment)


mpguerra · 2023-05-15T12:27:22Z

Hey team! Please add your planning poker estimate with Zenhub @arya2 @conradoplg @dconnolly @oxarbitrage @teor2345 @upbqdn

teor2345 · 2023-05-15T19:21:52Z

@mpguerra I think these changes will be easier after the renames in #6680. They are also likely to cause merge conflicts if we do them at the same time.

mpguerra · 2023-06-07T13:38:21Z

What do we actually need to do for this issue?

teor2345 · 2023-06-07T21:00:29Z

What do we actually need to do for this issue?

I think we should finish the missed renames in #6793 first, then add the information in this ticket to the state requests/service/queues/non-finalized state/finalized state. (Wherever each paragraph is most relevant.)

teor2345 · 2023-06-11T23:40:15Z

This documentation on CheckpointVerifiedBlock is incorrect; the note below it lists the checks that are actually required:

zebra/zebra-state/src/request.rs

Lines 214 to 215 in 2e37981

/// A block ready to be committed directly to the finalized state with
/// no checks.

Similarly, the zebra_consensus::checkpoint module docs need to be updated: most checks are skipped, but some are still needed:

zebra/zebra-consensus/src/checkpoint.rs

Lines 7 to 11 in 2e37981

    
//! The checkpoint verifier queues pending blocks.  Once there is a
//! chain from the previous checkpoint to a target checkpoint, it
//! verifies all the blocks in that chain, and sends accepted blocks to
//! the state service as finalized chain state, skipping contextual
//! verification checks.

teor2345 · 2023-06-12T00:10:54Z

teor2345 · 2023-06-12T00:13:27Z

What do we actually need to do for this issue?

The verification parts of the request, state service, and consensus rule checks use incorrect terms that are very confusing for developers and auditors. The storage parts use those terms correctly and should be left alone.

teor2345 · 2023-06-12T00:35:38Z

The documentation and variable names in these types and their methods:

@arya2 can you do a quick check of the list in this comment:
#6681 (comment)

Feel free to change what we should rename, or what the new names should be.

If you think any of these renames can be done automatically, feel free to add them to #6793. But the old names have to be unique enough that there are no mistakes, and the replacement must be universal.

mpguerra added this to Zebra May 15, 2023

github-project-automation bot moved this to 🆕 New in Zebra May 15, 2023

mpguerra added A-docs Area: Documentation P-Medium ⚡ C-audit Category: Issues arising from audit findings labels May 15, 2023

mpguerra mentioned this issue May 15, 2023

Epic: Improvements from Zebra Audit #6277

Closed

36 tasks

mpguerra added the S-blocked Status: Blocked on other tasks label May 16, 2023

teor2345 mentioned this issue Jun 4, 2023

Renames to address confusion in zebra's handling of finalized state #6680

Closed

13 tasks

teor2345 mentioned this issue Jun 12, 2023

fix(state): Avoid temporary failures verifying the first non-finalized block or attempting to fork the chain before the final checkpoint #6810

Merged

6 tasks

mpguerra added S-blocked Status: Blocked on other tasks and removed S-blocked Status: Blocked on other tasks labels Jun 15, 2023

mpguerra assigned oxarbitrage Jun 19, 2023

oxarbitrage mentioned this issue Jun 25, 2023

docs(state): Use different terms for block verification and state queues #7061

Merged

6 tasks

mergify bot closed this as completed in #7061 Jul 4, 2023

github-project-automation bot moved this from 🆕 New to ✅ Done in Zebra Jul 4, 2023
