Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

array_to_bounded_vec is dangerous #9854

Closed
nventuro opened this issue Nov 8, 2024 · 11 comments · Fixed by #10556
Closed

array_to_bounded_vec is dangerous #9854

nventuro opened this issue Nov 8, 2024 · 11 comments · Fixed by #10556
Labels
A-security Area: Relates to security. Something is insecure. C-protocol-circuits Component: Protocol circuits (kernel & rollup) team-turing Leila's team

Comments

@nventuro
Copy link
Contributor

nventuro commented Nov 8, 2024

This is utils::arrays::array_to_bounded_vec, which is used throughout the kernel data builders:

pub fn array_to_bounded_vec<T, let N: u32>(array: [T; N]) -> BoundedVec<T, N>
where
    T: Empty + Eq,
{
    let mut len = 0;
    for elem in array {
        if !is_empty(elem) {
            len += 1;
        }
    }

    BoundedVec { storage: array, len }
}

Note how array is not mutated at all: the implementation assumes that all empty values are located at the end of the array, and attempts to only count the non-empty ones. This is extremely dangerous for two reasons:

a) if array were to have even a single empty value with a lower index than any non-empty value, the non-empty value would be lost, and BoundedVec's storage would be corrupt
b) if T::empty() returns true for any value that is not std::mem::zeroed(), BoundedVec's storage will again be corrupt. This is because BoundedVec assumes that unused storage elements are cleared, and blindly compares all of storage in its eq implementation. An empty value that is not zeroed would cause for the resulting BoundedVec to not be equal to one that is created by inserting all of its elements into a new one, since the 'empty' storage would not be the same.

Additionally, array_to_bounded_vec does not convey this handling of empty values, and seems to imply the function would behave the same as BoundedVec::from_array, which it does not.

I stumbled upon this when clearing warnings: this function emits some as BoundedVec's storage and len properties are private as they are not meant to be directly accessed (due to issues like the ones described here).

@nventuro nventuro added A-security Area: Relates to security. Something is insecure. C-protocol-circuits Component: Protocol circuits (kernel & rollup) team-turing Leila's team labels Nov 8, 2024
@nventuro
Copy link
Contributor Author

nventuro commented Nov 8, 2024

I also noticed problematic usage in NullifierReadRequestHintsBuilder::new:

pending_read_hints: BoundedVec {
    storage: [
        PendingReadHint::nada(MAX_NULLIFIER_READ_REQUESTS_PER_TX); NUM_PENDING_HINTS
    ],
    len: 0,
},

where storage is not zeroed out, but the vec's length is zero. I don't understand why the above isn't simply doing ::new(), but it does result in a vec of length 0 that is not equal to an equivalent vector of length 0.

@iAmMichaelConnor
Copy link
Contributor

Sounds like it just needs a rename? I think it's intentionally designed with the assumption that the data will be tightly packed to the "left". Agree it's not clear at the moment.
There probably needs to be an additional function with the more intuitive functionality you desire.
There's also filter_array_to_bounded_vec (although that has a bug which is fixed here, but never merged: https://github.com/AztecProtocol/aztec-packages/pull/7909/files#diff-886300a06439a6bfbaafebbf4c87aede0d596e687adc8034db18e5c01c06ffb7)

@sirasistant
Copy link
Collaborator

I also noticed problematic usage in NullifierReadRequestHintsBuilder::new:

I think this is because an empty (all zeroes) PendingReadHint is valid, I don't know if anything would break if changed to new() though

@TomAFrench
Copy link
Member

Hey, is there any progress on this? The protocol circuits are making pretty heavy use of reading and writing to BoundedVec.storage currently which is blocking us from enforcing visibility rules.

@sirasistant
Copy link
Collaborator

sirasistant commented Dec 3, 2024

Could we have new_unchecked with storage and len in the same spirit as get_unchecked and set_unchecked? that would ease the migration

@jfecher
Copy link
Contributor

jfecher commented Dec 3, 2024

Could we have new_unchecked with storage and len in the same spirit as get_unchecked and set_unchecked? that would ease the migration

Do they have to be unchecked? Would a checked from_parts constructor suffice?

fn from_parts(array: [T; MaxLen], len: u32) -> BoundedVec<T, MaxLen> {
    assert(len <= MaxLen);
    BoundedVec { storage: array, len }
}

@sirasistant
Copy link
Collaborator

Unchecked in the sense that we don't iterate over the array again to verify that all items after len are zeroed. I think checking against the capacity is fine!

@nventuro
Copy link
Contributor Author

nventuro commented Dec 3, 2024

That from_parts would potentially leave initialized storage elements past len, making the vec not equal a vec with identical len and content (since eq blindly compares storage).

@jfecher
Copy link
Contributor

jfecher commented Dec 3, 2024

making the vec not equal a vec with identical len and content (since eq blindly compares storage).

Right, forgot about that change. So there's a tradeoff here. I could change the API so that we no longer check the entire self.storage on eq. We'd no longer need to zero an element in pop either. However, this means that eq would have to only check until self.len which would be a witness compared to the capacity. So eq code goes from

self.storage[0] == other.storage[0]
  & self.storage[1] == other.storage[1]
  ...
  & self.storage[MaxLen-1] == other.storage[MaxLen-1]

to

if 0 < self.len {
    self.storage[0] == other.storage[0]
} &
if 1 < self.len {
    self.storage[1] == other.storage[1]
} &
...
if MaxLen - 1 < self.len {
    self.storage[MaxLen-1] == other.storage[MaxLen-1]
}

@TomAFrench
Copy link
Member

That's going to make any equality checks on boundedvecs very expensive unless the length is known at compile time.

@jfecher
Copy link
Contributor

jfecher commented Dec 3, 2024

Yes, that's the tradeoff.
I think at the end of the day, in most code eq is more common than from_array and pop would be so we should keep the current version.

@LeilaWang LeilaWang linked a pull request Dec 10, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-security Area: Relates to security. Something is insecure. C-protocol-circuits Component: Protocol circuits (kernel & rollup) team-turing Leila's team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants