Review unconstrained ContractClassRegisterer log hash #8978

nventuro · 2024-10-02T21:11:12Z

The ContractClassRegisterer calls emit_contract_class_unencrypted_log_private, which returns the hash of the bytecode registration log from an oracle. This value is not constrained, because hashing such a large bytecode would result in a circuit so large that Noir is unable to handle it (or rather was unable a couple months ago - this may have changed).

The code hints at this not being an issue because we've committed to the bytecode anyway, but I think it may be missing something.

The point of the registerer is to ensure that a contract's bytecode is available: if a public call is made to a contract, then the sequencer must execute its code which of course requires knowing what it is in the first place. The kernel circuits assume that the sequencer is listening to contract class registration events, and therefore knows the bytecode if the contract was ever registered. The sequencer is only allowed to not make the public call if it can prove nullifier non-inclusion, i.e. non-registration, since in this case there is no guarantee that the bytecode has been distributed.

If the above is correct, then not constraining the log hash can be very dangerous: a malicious user could deploy a contract and inject a bad log hash via a malicious oracle. Sequencers that do not know the bytecode would then be unable to produce proofs for transactions that include calls to this contract.

The text was updated successfully, but these errors were encountered:

benesjan · 2024-10-04T10:42:18Z

This is how verification of log delivery works:

The logs are never hashed in protocol circuits as they are too big. For this reason we only pass log hashes through kernels. The logs are then delivered completely unverified in a Tx object to a sequencer and the sequencer re-hashes them out of circuit and checks that they match hash which was on the output of kernels.

If the hash matches, sequencer includes the tx in a block and submits the block to a contract on L1. The contract then re-hashes the logs and checks that the resulting log hash matches the hash on the output of circuits (this log hash is called txs effects hash as it's used to verify all tx effects).

Given that the bytecode log is not hashed in circuit then an attacker could do the following:

return a log hash corresponding to a gibberish on the output of the oracle,
send that giberrish in the tx object to a sequencer.

The hash of the gibberish would match the hash passed through the circuits so the sequencer would happily include the tx in a block and the L1 contract would then happily hash the gibberish, get a matching hash on the input and the contract would be proclaimed deployed!

I don't fully understand how and where do we commit to the bytecode as I am not that knowledgeable about the deploy flow:

The code hints at this not being an issue because we've committed to the bytecode anyway, but I think it may be missing something.

But it looks like we don't ever check in-circuit that this log_hash matches the bytecode commitment and hence the bytecode is not guaranteed to be delivered! (the attacker could deliver random blob of data instead)

This seems like a potential DoS vector because if a sequencer would receive a tx calling this contract then a sequencer would try to simulate execution of the random bytecode which would inevitably lead to some undefined behavior. The sequencer would not manage to charge a fee for this because he would not manage to prove that the tx reverted (or whatever would happen).

This is quite minor, I thought it was going to be larger than it is but then I ran into #8978 and was distracted. --------- Co-authored-by: Jan Beneš <[email protected]>

nventuro · 2024-10-10T10:06:45Z

If the hash matches, sequencer includes the tx in a block and submits the block to a contract on L1. The contract then re-hashes the logs and checks that the resulting log hash matches the hash on the output of circuits

From conversations with @sirasistant it looks like this entire flow is going to change in favor of a saner flow in which we simply hash the bytecode at registration time. At that point this issue will be solved. He also confirmed that the current approach is indeed unsafe.

github-project-automation bot moved this to Todo in A3 Oct 2, 2024

github-project-automation bot added this to A3 Oct 2, 2024

nventuro mentioned this issue Oct 2, 2024

feat(Event_Logs): Remove unencrypted emit in private #7161

Closed

nventuro added C-aztec.nr Component: Aztec smart contract framework A-security Area: Relates to security. Something is insecure. team-fairies Nico's team labels Oct 2, 2024

This was referenced Oct 2, 2024

chore: add comments to log oracles #8981

Merged

Epic: Logs Refactor #8954

Open

benesjan self-assigned this Oct 4, 2024

nventuro added a commit that referenced this issue Oct 7, 2024

chore: add comments to log oracles (#8981)

8efa7ac

This is quite minor, I thought it was going to be larger than it is but then I ran into #8978 and was distracted. --------- Co-authored-by: Jan Beneš <[email protected]>

nventuro mentioned this issue Nov 4, 2024

Epic: Securing contract classes #9476

Open

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Review unconstrained ContractClassRegisterer log hash #8978

Review unconstrained ContractClassRegisterer log hash #8978

nventuro commented Oct 2, 2024 •

edited

Loading

benesjan commented Oct 4, 2024

nventuro commented Oct 10, 2024

Review unconstrained ContractClassRegisterer log hash #8978

Review unconstrained ContractClassRegisterer log hash #8978

Comments

nventuro commented Oct 2, 2024 • edited Loading

benesjan commented Oct 4, 2024

nventuro commented Oct 10, 2024

nventuro commented Oct 2, 2024 •

edited

Loading