Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Truncate sha-256 instead of splitting output into two Fr fields #2019

Closed
jeanmon opened this issue Sep 5, 2023 · 1 comment · Fixed by #5160
Closed

Truncate sha-256 instead of splitting output into two Fr fields #2019

jeanmon opened this issue Sep 5, 2023 · 1 comment · Fixed by #5160
Assignees
Labels
C-protocol-circuits Component: Protocol circuits (kernel & rollup) T-refactor Type: this code needs refactoring

Comments

@jeanmon
Copy link
Contributor

jeanmon commented Sep 5, 2023

  • Let the new design be vetted by another pair of eyes (e.g.: Patrick)
  • Let us consider whether Keccak or Blake is a potential alternative. (We do need the hash function to be 'Ethereum compatible' though)
@jeanmon jeanmon added the C-protocol-circuits Component: Protocol circuits (kernel & rollup) label Sep 5, 2023
@jeanmon jeanmon added this to A3 Sep 5, 2023
@github-project-automation github-project-automation bot moved this to Todo in A3 Sep 5, 2023
@iAmMichaelConnor iAmMichaelConnor added the T-refactor Type: this code needs refactoring label Sep 6, 2023
@dan-aztec
Copy link
Contributor

dan-aztec commented Jan 4, 2024

is it OK to use the stdlib hash_to_field, which is based on blake2s, or do we want to keep sha256?

Will start with just truncating sha256. I see also that calldata hash and L1toL2Messages hash can also be reduced to a single Field element in addition to Encrypted/Unencrypted Logs hash

@LeilaWang LeilaWang assigned Thunkar and unassigned LeilaWang Jan 16, 2024
MirandaWood added a commit that referenced this issue Mar 21, 2024
Will close #2019

This PR converts SHA hashing inside noir circuits from outputting 2
128-bit fields to outputting 1 248-bit field. To fit inside the field,
we truncate one byte.

---
### Noir Changes

The constant `NUM_FIELDS_PER_SHA256` is now 1, so any hardcoded test
values and function returns have been changed to use an array of one.
I've kept it as an array rather than a single `Fr` to minimise changes
across the repo and ensure if we want to revert `NUM_FIELDS_PER_SHA256`
in future, it won't be so painful. However, we can also just use a
single `Fr` if that's preferable.

`TX_EFFECTS_HASH_LOG_FIELDS`

Methods:

- `field_from_bytes_32_trunc`: Converts a 32 byte array to a 31 byte
field element (useful for comparisons with new `sha256_to_field`), tests
in `types/src/utils/field.nr`.
- `sha256_to_field`: Uses the same method as the previous version to
convert the sha result (BE) bytes array to field, but leaves out the
final byte.
- `accumulate_sha256`: Used almost exclusively for enc/unenc logs
hashing - takes in 2 31 byte field elements, assumed to be outputs of a
previous sha hash, pads to 32 bytes and hashes them with
`sha256_to_field` as a 64 byte array. Note that as before, other
circuits that use sha (like tx effects hash and messages hash) do not
use this method and instead create a flat byte array, then call
`sha256_to_field`.

---
### L1 Contract Changes

To match the Noir method, the `sha256ToField` function now truncates a
byte and prepends a blank byte. Not prepending the blank byte means
changing many struct fields from `bytes32` to `bytes31`. This (IIRC) is
the same gas cost and creates more awkward encoding, so I kept the
length with a blank byte. This also changes the slither file, as I
removed some of the old encoding which flagged with new encoding...
which also flags.

~Only the 'leaves' used in computing the `txsHash` in `TxsDecoder` and
logs hashes have been changed to 31 bytes to match the Noir SHA
accumulation (since we are repeating hashes of hashes).~

~The TS code (see below) does pack the Header struct with 31 bytes per
SHA, so we must shift the decoding in HeaderLib` by 3 bytes.~

As of 21.3, there have been a lot of changes in master to the way the
txs effect hash (formerly calldata hash/txs hash) is calculated. Plus,
now we actually recalculate the in/outHash (i.e. the root of the sha
tree of messages) in the contract, so I have reverted to using 32 bytes
everywhere with a prepended blank byte.

---
### TS Changes

All `.hash()` methods which are also computed in the circuit have been
changed to match the Noir code. In most places this just means
truncating a byte with `.subarray(0, 31)` on the buffer.
~The `ContentCommitment` serialise/deserialise methods have been
modified, as keeping `NUM_BYTES_PER_SHA256 = 32` caused a lot of issues
in the background. Changing it to 31 to match Noir does mean slightly
different encoding, but many fewer changes across the repo (and
hopefully less confusion).~
As of 21.3, due to changes in master, it's now cleaner to keep
`NUM_BYTES_PER_SHA256 = 32` and be sure to truncate and pad all SHA
hashes which touch the Noir circuits.
Since I've kept the hash output as an array of one in Noir, there are
many tuples of one in ts (for the above reasoning) - this can be changed
if preferable.

Methods:

- `toTruncField`: Mirrors Noir's `field_from_bytes_32_trunc` to convert
to a field element - used in place of old method `to2Fields` (tested in
`free_funcs.test.ts`).
- `fromTruncField`: Converts the above back to a 31 byte buffer (tested
as above).

---
@github-project-automation github-project-automation bot moved this from Todo to Done in A3 Mar 21, 2024
sklppy88 pushed a commit that referenced this issue Mar 22, 2024
Will close #2019

This PR converts SHA hashing inside noir circuits from outputting 2
128-bit fields to outputting 1 248-bit field. To fit inside the field,
we truncate one byte.

---
### Noir Changes

The constant `NUM_FIELDS_PER_SHA256` is now 1, so any hardcoded test
values and function returns have been changed to use an array of one.
I've kept it as an array rather than a single `Fr` to minimise changes
across the repo and ensure if we want to revert `NUM_FIELDS_PER_SHA256`
in future, it won't be so painful. However, we can also just use a
single `Fr` if that's preferable.

`TX_EFFECTS_HASH_LOG_FIELDS`

Methods:

- `field_from_bytes_32_trunc`: Converts a 32 byte array to a 31 byte
field element (useful for comparisons with new `sha256_to_field`), tests
in `types/src/utils/field.nr`.
- `sha256_to_field`: Uses the same method as the previous version to
convert the sha result (BE) bytes array to field, but leaves out the
final byte.
- `accumulate_sha256`: Used almost exclusively for enc/unenc logs
hashing - takes in 2 31 byte field elements, assumed to be outputs of a
previous sha hash, pads to 32 bytes and hashes them with
`sha256_to_field` as a 64 byte array. Note that as before, other
circuits that use sha (like tx effects hash and messages hash) do not
use this method and instead create a flat byte array, then call
`sha256_to_field`.

---
### L1 Contract Changes

To match the Noir method, the `sha256ToField` function now truncates a
byte and prepends a blank byte. Not prepending the blank byte means
changing many struct fields from `bytes32` to `bytes31`. This (IIRC) is
the same gas cost and creates more awkward encoding, so I kept the
length with a blank byte. This also changes the slither file, as I
removed some of the old encoding which flagged with new encoding...
which also flags.

~Only the 'leaves' used in computing the `txsHash` in `TxsDecoder` and
logs hashes have been changed to 31 bytes to match the Noir SHA
accumulation (since we are repeating hashes of hashes).~

~The TS code (see below) does pack the Header struct with 31 bytes per
SHA, so we must shift the decoding in HeaderLib` by 3 bytes.~

As of 21.3, there have been a lot of changes in master to the way the
txs effect hash (formerly calldata hash/txs hash) is calculated. Plus,
now we actually recalculate the in/outHash (i.e. the root of the sha
tree of messages) in the contract, so I have reverted to using 32 bytes
everywhere with a prepended blank byte.

---
### TS Changes

All `.hash()` methods which are also computed in the circuit have been
changed to match the Noir code. In most places this just means
truncating a byte with `.subarray(0, 31)` on the buffer.
~The `ContentCommitment` serialise/deserialise methods have been
modified, as keeping `NUM_BYTES_PER_SHA256 = 32` caused a lot of issues
in the background. Changing it to 31 to match Noir does mean slightly
different encoding, but many fewer changes across the repo (and
hopefully less confusion).~
As of 21.3, due to changes in master, it's now cleaner to keep
`NUM_BYTES_PER_SHA256 = 32` and be sure to truncate and pad all SHA
hashes which touch the Noir circuits.
Since I've kept the hash output as an array of one in Noir, there are
many tuples of one in ts (for the above reasoning) - this can be changed
if preferable.

Methods:

- `toTruncField`: Mirrors Noir's `field_from_bytes_32_trunc` to convert
to a field element - used in place of old method `to2Fields` (tested in
`free_funcs.test.ts`).
- `fromTruncField`: Converts the above back to a 31 byte buffer (tested
as above).

---
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-protocol-circuits Component: Protocol circuits (kernel & rollup) T-refactor Type: this code needs refactoring
Projects
Archived in project
5 participants