
feat: add flag to enable/disable DA inclusion verification #2647

Merged — 7 commits merged into main from dz-verify-inclusion-flag on Aug 27, 2024

Conversation

@dimazhornyk (Contributor) commented Aug 13, 2024

What ❔

This PR adds a config to explicitly enable/disable DA verification onchain.

Why ❔

Without this feature, any chain using custom DA had to wait for full inclusion before they could commit a batch even if they were not doing the onchain verification.
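For reference, a minimal sketch of how such a flag could sit in the DA dispatcher config; the struct layout and field names below are illustrative assumptions, not the exact ones introduced by this PR:

```rust
// Illustrative sketch only: assumed shape of a DA dispatcher config carrying
// the new flag. The real config struct in the repo may name things differently.
#[derive(Debug, Clone)]
pub struct DaDispatcherConfig {
    /// How often the dispatcher polls the database for new work, in ms.
    pub polling_interval_ms: u32,
    /// If false, the dispatcher does not wait for DA inclusion proofs and
    /// stores an empty `inclusion_data`, so batches can be committed without
    /// onchain DA verification.
    pub enable_onchain_verification: bool,
}
```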

Checklist

  • PR title corresponds to the body of PR (we generate changelog entries from PRs).
  • Tests for the changes have been added / updated.
  • Documentation comments have been added / updated.
  • Code has been formatted via zk fmt and zk lint.

@popzxc (Member) commented Aug 14, 2024

> Without this feature, any chain using custom DA had to wait for full inclusion before they could commit a batch even if they were not doing the onchain verification.

Could you explain how this would work if, for some reason, inclusion has failed but the batch was committed? The proposed change as-is looks a bit dangerous.

@dimazhornyk (Contributor, Author) replied:

> Without this feature, any chain using custom DA had to wait for full inclusion before they could commit a batch even if they were not doing the onchain verification.

> Could you explain how this would work if, for some reason, inclusion has failed but the batch was committed? The proposed change as-is looks a bit dangerous.

On L1, the contract responsible for verifying the inclusion data is set in the diamond proxy storage. The chain operator will be unable to change it easily; most likely it will be controlled by our governance/security council.

Every time a batch is committed, the Executor facet calls it for inclusion verification.
So this `verify_inclusion` flag only lets an operator skip the inclusion verification if the L1 contract accepts empty inclusion proofs; otherwise the `commitBatches` tx will be reverted.
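Schematically, the L1 behaviour described above amounts to the following; this is a Rust-flavoured model for illustration only, since the real logic lives in the Solidity Executor facet and the DA validator contract referenced from the diamond proxy:

```rust
// Rust-flavoured model of the L1 check, for illustration only.
struct DaValidator {
    /// Whether the contract tolerates empty inclusion proofs
    /// (e.g. a permissive/testing deployment).
    accepts_empty_proofs: bool,
}

impl DaValidator {
    /// Conceptually what the Executor facet asks of the DA validator on every
    /// `commitBatches` call.
    fn verify_inclusion(&self, inclusion_proof: &[u8]) -> Result<(), &'static str> {
        if inclusion_proof.is_empty() && !self.accepts_empty_proofs {
            // The whole commit tx reverts: the operator cannot unilaterally
            // skip verification if the contract requires a real proof.
            return Err("empty inclusion proof rejected");
        }
        // ...real proof verification would happen here...
        Ok(())
    }
}
```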

@popzxc (Member) commented Aug 14, 2024

I understand that. What I meant is that even if we don't want to send inclusion data to L1, we still want to somehow make sure that it's persisted in the DA. If I understand things correctly, right now, for example, if a request to publish pubdata was sent but it (asynchronously) failed, this will not be noticed and the sequencer will keep working, even though the DA is now broken.

@dimazhornyk (Contributor, Author) replied:

> I understand that. What I meant is that even if we don't want to send inclusion data to L1, we still want to somehow make sure that it's persisted in the DA. If I understand things correctly, right now, for example, if a request to publish pubdata was sent but it (asynchronously) failed, this will not be noticed and the sequencer will keep working, even though the DA is now broken.

Ah, I see what you mean. The pubdata is not dispatched asynchronously: the batch won't be committed if the chain runs in validium mode and the inclusion data is null; there is a check here. If the inclusion data is not null (even if it is an empty bytes array, like in TrustMeBro mode), we can safely commit a batch because we are sure that the pubdata was stored.
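Roughly, the check being referred to amounts to something like this; the enum and function are stand-ins for the real commit-criteria code, purely to illustrate the point:

```rust
// Stand-in types for illustration; the actual check lives in the batch
// commit-readiness logic.
enum DaMode {
    Rollup,
    Validium,
}

/// A batch in validium mode is held back until its inclusion data is set.
/// An empty byte array still counts as "set" (e.g. TrustMeBro / no-DA clients).
fn batch_ready_to_commit(mode: DaMode, inclusion_data: Option<&[u8]>) -> bool {
    match mode {
        DaMode::Rollup => true,
        DaMode::Validium => inclusion_data.is_some(),
    }
}
```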

@popzxc (Member) commented Aug 15, 2024

> we can safely commit a batch because we are sure that the pubdata was stored

But this PR changes that, because we return the inclusion data without ever actually checking that it was persisted.
I.e. what you wrote is correct before this PR, but is no longer correct with `enable_onchain_verification = false`.
Or am I missing something?

@dimazhornyk (Contributor, Author) replied:

Not really, let me cover the full flow:

  • We seal a batch and get the pubdata; at this point both `blob_id` and `inclusion_data` are null
  • `da_dispatcher` runs a `dispatch` function, which only succeeds if the blob was actually included in the DA layer and the operator was charged for that; `blob_id` is set in the database, `inclusion_data` is still null
  • `da_dispatcher` runs a `poll_for_inclusion` function, which checks whether the state roots of the blocks with the previously dispatched blobs are relayed to L1. If yes, it sets the `inclusion_data`
  • if `enable_onchain_verification = false`, we simply skip the polling part and set the inclusion data to `[]` (just like the noDA client does)

So we don't need `poll_for_inclusion` to check whether the blob was persisted: we know that it was, because `dispatch` succeeded. The inclusion data is needed only to verify it onchain.
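A condensed sketch of the last two steps with the proposed flag; all types and names below are simplified stand-ins for the real `da_dispatcher` code:

```rust
// Simplified stand-ins for the real da_dispatcher types, for illustration.
struct InclusionData {
    data: Vec<u8>,
}

struct DispatchedBatch {
    blob_id: String,
}

trait DaClient {
    /// Returns Some(proof) once the blob's state root has been relayed to L1,
    /// or None if inclusion cannot be proven yet.
    fn get_inclusion_data(&self, blob_id: &str) -> Option<InclusionData>;
}

fn next_inclusion_data(
    client: &dyn DaClient,
    batch: &DispatchedBatch,
    enable_onchain_verification: bool,
) -> Option<InclusionData> {
    if enable_onchain_verification {
        // Keep polling until the DA layer can prove inclusion on L1.
        client.get_inclusion_data(&batch.blob_id)
    } else {
        // Skip polling and store an empty proof, like the noDA client does.
        Some(InclusionData { data: vec![] })
    }
}
```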

@popzxc (Member) commented Aug 15, 2024

I see. Then shouldn't it be a property of the client? If there is no L1 confirmation data, I would expect this method of a concrete implementation to return immediately, rather than having a switch in the core (where we don't know if the config value is valid for a particular client).

@dimazhornyk (Contributor, Author) replied:

> If there is no L1 confirmation data, I would expect this method of a concrete implementation to return immediately

I'd say it's not about the confirmation data being or not being available in the client. I assume that all the clients will be capable of providing the inclusion proofs for L1 verification.

It's more about the configuration of the chain as a whole: whether you as an operator want to verify the inclusion data or not (it adds some cost, so I assume many will be saving on it). We could move it to the clients' configs, but then we would just reimplement the same if statement in every client.

@popzxc (Member) commented Aug 15, 2024

> but then we would just reimplement the same if statement in every client.

The difference here would be that only clients that have no inclusion data would have this `if` (compared to now, when you can set the wrong config by accident).
Alternatively, we can add a new method to the trait, e.g. `fn has_inclusion_data(&self) -> bool`, which would be used instead of the config.

My point is that I want to disallow misconfiguration here (e.g. setting this variable to false for some client that has inclusion data).
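For concreteness, the suggested trait method could look roughly like this; the trait name and the other method are placeholders, only `has_inclusion_data` reflects the suggestion above:

```rust
// Illustrative trait sketch; only `has_inclusion_data` is the suggestion above,
// everything else is a placeholder.
trait DaClient {
    /// Sends the pubdata to the DA layer and returns a blob identifier.
    fn dispatch_blob(&self, batch_number: u32, pubdata: Vec<u8>) -> Result<String, String>;

    /// Clients that never produce inclusion data (e.g. a no-DA client)
    /// override this to return false, and the dispatcher skips polling for them.
    fn has_inclusion_data(&self) -> bool {
        true
    }
}
```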

@dimazhornyk (Contributor, Author) replied:

> My point is that I want to disallow misconfiguration here (e.g. setting this variable to false for some client that has inclusion data).

But it's not about the client having or not having the inclusion data. The client might have the inclusion data; it's just the operator's choice whether to send it or not. What I meant here

> I assume that all the clients will be capable of providing the inclusion proofs for L1 verification.

is that I expect every client to implement `fn has_inclusion_data(&self) -> bool { return true; }`.
Using the inclusion data is up to the operator of the chain. That's why it seems more logical to me to have it in the server config.

@popzxc (Member) commented Aug 16, 2024

> it's just the operator's choice whether to send it or not

But the contract may reject the batch if it has to have inclusion data, right? That's the problem I want to avoid.

I think I understand now that the setting on L1 isn't mapped 1-to-1 to the client capabilities (e.g. for a client with inclusion data, we may still ignore it on L1).
If that's the case, can we at least add a validation task similar to the commitment mode validation, so that we query L1 at startup and panic if the local config doesn't work with the L1 config?
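A rough sketch of such a safeguard; every name here is an illustrative assumption (in particular, the current L1 contracts expose no such getter, as noted in the reply below):

```rust
// Rough sketch of the proposed startup safeguard. All names are illustrative;
// the real check would go through the Eth client and L1 contract getters.
trait L1DaInfo {
    /// Hypothetical view call: does the L1 DA validator reject empty
    /// inclusion proofs in commitBatches?
    fn requires_inclusion_proofs(&self) -> bool;
}

/// Run once at node startup, next to the commitment mode validation.
fn validate_da_verification_mode(l1: &dyn L1DaInfo, enable_onchain_verification: bool) {
    assert!(
        enable_onchain_verification || !l1.requires_inclusion_proofs(),
        "enable_onchain_verification = false, but the L1 DA validator rejects empty \
         inclusion proofs; committing batches would revert"
    );
}
```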

@dimazhornyk (Contributor, Author) commented Aug 16, 2024

> If that's the case, can we at least add a validation task similar to the commitment mode validation, so that we query L1 at startup and panic if the local config doesn't work with the L1 config?

Great point, but it can't be done in the main branch.
The DA contracts are integrated only in the sync-layer-stable branch, and the current contracts used in all the envs are not compatible with that (and don't even support custom DA).

That's why I have another PR into the sync-layer branch that adds some more logic for interacting with L1 contracts, and that's where I can add the safeguard you suggested.

Consider this PR a solution for those who want to start using 3rd-party DA solutions (in testing mode) before the new DA contracts are audited and widely used. They won't run into that misconfiguration, because the contracts we have on testnet/mainnet don't check anything yet.

@popzxc (Member) left a review comment

I think I get the idea now, and I'm generally OK with it. However, please don't forget to add the L1 checks once it's possible. Maybe even add a TODO in the code so it's harder to forget.

A few general comments.

@dimazhornyk requested a review from @popzxc August 16, 2024 15:54
@dimazhornyk added this pull request to the merge queue Aug 26, 2024
github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 26, 2024
@dimazhornyk added this pull request to the merge queue Aug 26, 2024
github-merge-queue bot pushed a commit that referenced this pull request Aug 26, 2024
github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 26, 2024
@dimazhornyk added this pull request to the merge queue Aug 27, 2024
Merged via the queue into main with commit b425561 Aug 27, 2024
49 checks passed
@dimazhornyk deleted the dz-verify-inclusion-flag branch August 27, 2024 10:43
github-merge-queue bot pushed a commit that referenced this pull request Aug 27, 2024
## 24.22.0 (core-v24.21.0...core-v24.22.0) (2024-08-27)


### Features

* add flag to enable/disable DA inclusion verification (#2647) (b425561)
* **Base token:** add cbt metrics (#2720) (58438eb)
* Change default_protective_reads_persistence_enabled to false (#2716) (8d0eee7)
* **vm:** Extract oneshot VM executor interface (#2671) (951d5f2)
* **zk_toolbox:** Add holesky testnet as layer1 network (#2632) (d9266e5)


### Bug Fixes

* **api:** `tx.gas_price` field (#2734) (aea3726)
* **base_token_adjuster:** bug with a wrong metrics namespace (#2744) (64b2ff8)
* **eth-sender:** missing Gateway migration changes (#2732) (a4170e9)
* **proof_data_handler:** TEE blob fetching error handling (#2674) (c162510)

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See the [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: zksync-era-bot <[email protected]>