Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optionally Implement Zeroize on exported types #309

Merged
merged 3 commits into from
Jul 16, 2023

Conversation

elichai
Copy link
Contributor

@elichai elichai commented Jun 6, 2023

Implements the zeroize::Zeroize on exported types, this is guarded by a zeroize feature.
This doesn't zeroize on drop, it just let's the user derive themselves / zeroize when needed

Cargo.toml Outdated
@@ -88,6 +88,7 @@ constant_time_eq = "0.2.4"
rayon = { version = "1.2.1", optional = true }
cfg-if = "1.0.0"
digest = { version = "0.10.1", features = [ "mac" ], optional = true }
zeroize = { version = "1", features = ["zeroize_derive"], optional = true }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need default-features = false here, so that this doesn't pull in alloc? And then we should also make our own std feature above pull in zeroize/std (just like it's currently doing with digest/std). Does that sound right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will enable this also when zeroize is not enabled (that's actually what happens now with digest)
if we want to do this (which I'm not sure we want, why should you enable std on a crate if you don't use anything it adds?) we need the weak dependencies feature which was stabilized in 1.60 (rust-lang/cargo#10269) this will allow us to do std = [zeroize?/std] which will only enable the feature of the dep if the dep is actually enabled

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah that's a good point. The std features of those crates are independent of what we're deriving, and the caller can depend on that feature themselves if they need it. Do you think default-features = false still makes sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'll add that

src/lib.rs Outdated
self.cv_stack.capacity(),
)
};
uninit_cv_stack.zeroize();
Copy link
Member

@oconnor663 oconnor663 Jun 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if either Zeroize or ArrayVec would consider supporting the other directly. They're both pretty popular crates, and this seems like the sort of sharp edge that a lot of folks will run into.

In any case, I think we could do this safely by just appending a bunch of all-zero CVs to the array until .is_full(), and then zeroizing a mutable slice over the whole ArrayVec. Annoyingly, that doesn't seem to work directly on the slice type, because Zeroize over slices depends on DefaultIsZeroes, and arrays of ints don't implement DefaultIsZeroes (no idea why). But we can work around that by casting the slice into an array reference, like this:

while !self.cv_stack.is_full() {
    self.cv_stack.push([0; 32]);
}
let cv_stack_array: &mut [CVBytes; MAX_DEPTH + 1] =
    (&mut self.cv_stack[..]).try_into().unwrap();
cv_stack_array.zeroize();
self.cv_stack.clear();

What do you think?

Whatever we decide to do here, we should definitely add a test to exercise it :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I can make a PR to ArrayVec to see if they'll land it.
  2. General arrays don't implement DefaultIsZeroes because they don't implement Default (They implement up to [T; 32] ).
  3. I like your no unsafe solution, I did not know you can try_into() a reference to a slice into a reference to an array (I thought only owned values are possible) :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooo it looks like number 1 might work out, that's awesome.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah! I think I'll leave this PR as is for a few days to wait for arrayvec as there's no rush here

@elichai
Copy link
Contributor Author

elichai commented Jul 9, 2023

@oconnor663 Replaced the unsafe with the implementation from bluss/arrayvec#240 and added a test that checks everything is actually zero (the cv_stack is truncated to 0 length, I could add a check that all it's content is also 0 but that will require some unsafe and assumes the underlying memory is not uninitialized(which it shouldn't be))
Also removed the digest/std from the std feature and added no-default-features to zeroize

@oconnor663
Copy link
Member

This is looking really good. I'm also working on a big refactoring that'll probably get rid of the platform members that have to be skipped here, and that'll clean things up even more when it lands.

but that will require some unsafe and assumes the underlying memory is not uninitialized

I think the robust way to do this is to use std::mem::zeroed or similar to create the object. I'm totally fine with unsafe code in tests if you'd like to do this. We could even cast the whole thing to [u8; mem::size_of::<Hasher>()] afterwards and assert each byte :)

@elichai
Copy link
Contributor Author

elichai commented Jul 10, 2023

I think the robust way to do this is to use std::mem::zeroed or similar to create the object. I'm totally fine with unsafe code in tests if you'd like to do this. We could even cast the whole thing to [u8; mem::size_of::<Hasher>()] afterwards and assert each byte :)

I'm concerned this might be UB if Hasher contains padding bytes, even if we wrote into them (which Zeroize doesn't promise to zero out padding) they might not be preserved

But checking the content of cv_stack should be fine as all its buffer had zeros written into them

@oconnor663
Copy link
Member

oconnor663 commented Jul 15, 2023

Hmm that's an interesting point. (I'm definitely going down a rabit hole here but I do love this rabbit hole :-D) The assignment of the return value of mem::zeroed isn't guaranteed to be semantically a memcpy of the whole thing; it could be a field-by-field copy instead. We can reason that Hasher isn't going to contain any padding bytes, so in all likelihood this is a distinction without a difference, but it's not a #[repr(C)] struct so we're not supposed to do that. Any even if we use MaybeUninit to create a place that definitely starts as all-zeros (is that even true? under the covers it's an assignment of a union...), we might copy uninitialized bytes into it if assignment is a memcpy (the opposite case). Maybe there isn't a sound way to do this, without the language either specifying the exact semantics of assignment (undesirable from an optimization perspective probably) or providing some sort of mem::freeze API.

@elichai
Copy link
Contributor Author

elichai commented Jul 16, 2023

Hmm that's an interesting point. (I'm definitely going down a rabit hole here but I do love this rabbit hole :-D) The assignment of the return value of mem::zeroed isn't guaranteed to be semantically a memcpy of the whole thing; it could be a field-by-field copy instead. We can reason that Hasher isn't going to contain any padding bytes, so in all likelihood this is a distinction without a difference, but it's not a #[repr(C)] struct so we're not supposed to do that. Any even if we use MaybeUninit to create a place that definitely starts as all-zeros (is that even true? under the covers it's an assignment of a union...), we might copy uninitialized bytes into it if assignment is a memcpy (the opposite case). Maybe there isn't a sound way to do this, without the language either specifying the exact semantics of assignment (undesirable from an optimization perspective probably) or providing some sort of mem::freeze API.

Yeah I don't see how this can be done soundly, even with mem::freeze we'd need to distinguish padding(might be non-zero) bytes from non-padding(has to be zero) bytes. unless we manually make sure there are no padding bytes on any architecture and then we can transmute and check

@oconnor663 oconnor663 merged commit e302cdf into BLAKE3-team:master Jul 16, 2023
@oconnor663
Copy link
Member

Ok theory aside, this looks great. Landed. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants