Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Review unconstrained ContractClassRegisterer log hash #8978

Open
Tracked by #8954
nventuro opened this issue Oct 2, 2024 · 2 comments
Open
Tracked by #8954

Review unconstrained ContractClassRegisterer log hash #8978

nventuro opened this issue Oct 2, 2024 · 2 comments
Assignees
Labels
A-security Area: Relates to security. Something is insecure. C-aztec.nr Component: Aztec smart contract framework team-fairies Nico's team

Comments

@nventuro
Copy link
Contributor

nventuro commented Oct 2, 2024

The ContractClassRegisterer calls emit_contract_class_unencrypted_log_private, which returns the hash of the bytecode registration log from an oracle. This value is not constrained, because hashing such a large bytecode would result in a circuit so large that Noir is unable to handle it (or rather was unable a couple months ago - this may have changed).

The code hints at this not being an issue because we've committed to the bytecode anyway, but I think it may be missing something.

The point of the registerer is to ensure that a contract's bytecode is available: if a public call is made to a contract, then the sequencer must execute its code which of course requires knowing what it is in the first place. The kernel circuits assume that the sequencer is listening to contract class registration events, and therefore knows the bytecode if the contract was ever registered. The sequencer is only allowed to not make the public call if it can prove nullifier non-inclusion, i.e. non-registration, since in this case there is no guarantee that the bytecode has been distributed.

If the above is correct, then not constraining the log hash can be very dangerous: a malicious user could deploy a contract and inject a bad log hash via a malicious oracle. Sequencers that do not know the bytecode would then be unable to produce proofs for transactions that include calls to this contract.

@github-project-automation github-project-automation bot moved this to Todo in A3 Oct 2, 2024
@nventuro nventuro added C-aztec.nr Component: Aztec smart contract framework A-security Area: Relates to security. Something is insecure. team-fairies Nico's team labels Oct 2, 2024
@benesjan benesjan self-assigned this Oct 4, 2024
@benesjan
Copy link
Contributor

benesjan commented Oct 4, 2024

This is how verification of log delivery works:

The logs are never hashed in protocol circuits as they are too big. For this reason we only pass log hashes through kernels. The logs are then delivered completely unverified in a Tx object to a sequencer and the sequencer re-hashes them out of circuit and checks that they match hash which was on the output of kernels.

If the hash matches, sequencer includes the tx in a block and submits the block to a contract on L1. The contract then re-hashes the logs and checks that the resulting log hash matches the hash on the output of circuits (this log hash is called txs effects hash as it's used to verify all tx effects).

Given that the bytecode log is not hashed in circuit then an attacker could do the following:

  1. return a log hash corresponding to a gibberish on the output of the oracle,
  2. send that giberrish in the tx object to a sequencer.

The hash of the gibberish would match the hash passed through the circuits so the sequencer would happily include the tx in a block and the L1 contract would then happily hash the gibberish, get a matching hash on the input and the contract would be proclaimed deployed!

I don't fully understand how and where do we commit to the bytecode as I am not that knowledgeable about the deploy flow:

The code hints at this not being an issue because we've committed to the bytecode anyway, but I think it may be missing something.

But it looks like we don't ever check in-circuit that this log_hash matches the bytecode commitment and hence the bytecode is not guaranteed to be delivered! (the attacker could deliver random blob of data instead)

This seems like a potential DoS vector because if a sequencer would receive a tx calling this contract then a sequencer would try to simulate execution of the random bytecode which would inevitably lead to some undefined behavior. The sequencer would not manage to charge a fee for this because he would not manage to prove that the tx reverted (or whatever would happen).

nventuro added a commit that referenced this issue Oct 7, 2024
This is quite minor, I thought it was going to be larger than it is but
then I ran into #8978 and was distracted.

---------

Co-authored-by: Jan Beneš <[email protected]>
@nventuro
Copy link
Contributor Author

If the hash matches, sequencer includes the tx in a block and submits the block to a contract on L1. The contract then re-hashes the logs and checks that the resulting log hash matches the hash on the output of circuits

From conversations with @sirasistant it looks like this entire flow is going to change in favor of a saner flow in which we simply hash the bytecode at registration time. At that point this issue will be solved. He also confirmed that the current approach is indeed unsafe.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-security Area: Relates to security. Something is insecure. C-aztec.nr Component: Aztec smart contract framework team-fairies Nico's team
Projects
Status: Todo
Development

No branches or pull requests

2 participants