Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: We add and track crypto digests to orders and objects #117

Merged
merged 15 commits into from
Jan 6, 2022

Conversation

gdanezis
Copy link
Collaborator

@gdanezis gdanezis commented Jan 4, 2022

We want to the data we store and serve to be 'self-certifying' meaning that:
(1) a client that holds an object or order, can validate its full history all the way down to genesis;
(2) it is possible to validate parts of the history, such as a few object -> order -> object transactions, within the history;

To do this we augment our structures:

  • An order has a digest (sha3_265) over all its contents; and contains including the full reference of objects (incl object digest) input into its execution.
  • An object has a digest (sha3_265) over its full contents; and contains the transaction digest that created or last mutated it.

As a result we have a hash-DAG of the latest objects pointing to transactions, pointing to older objects, all the way back to the genesis hash.

Note: this plan is slightly different from what we documented in the overleaf design document. There we considered adding the transaction digest in the object reference, rather than including it in the object, and adding the object digest to the reference. This does not provide a full chain of evidence, hence the change.

For subsequent PR

Copy link
Collaborator

@sblackshear sblackshear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love to see this!

fastx_programmability/adapter/src/adapter.rs Outdated Show resolved Hide resolved
fastx_programmability/adapter/src/adapter.rs Outdated Show resolved Hide resolved
fastx_programmability/adapter/src/genesis.rs Outdated Show resolved Hide resolved
fastx_types/src/base_types.rs Outdated Show resolved Hide resolved
let hash = Sha3_256::digest(hash_arg.as_slice());

let mut hasher = Sha3_256::default();
// TODO: hasher.update("OBJECT_ID_DERIVE::");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I'd be in favor of having at minimum :

  • a Digest trait,
  • two implementations for (Transaction|Object)Digest, each with a domain separator.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is strangely positioned, it refers to the TODO in #58.

fastx_types/src/base_types.rs Outdated Show resolved Hide resolved
fastx_types/src/base_types.rs Outdated Show resolved Hide resolved
fastx_types/src/base_types.rs Outdated Show resolved Hide resolved
@gdanezis gdanezis marked this pull request as draft January 5, 2022 12:14
@gdanezis gdanezis marked this pull request as ready for review January 5, 2022 13:52
Copy link
Contributor

@huitseeker huitseeker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice and necessary! This looks good, but I have a few Qs.

Is there any particular reason why we are using SHA3? It has poor performance1, but for now I'd just like to be aware if anything led to that choice?

Footnotes

  1. it has ~81% of the throughput of SHA2-256, 44% the throughput of Blake2s, and 6% of that of BLAKE3. I'm tempted to switch to BLAKE3 as soon as we have domain separation thrown in (and not just to see @kchalkias 's face when we do).

let hash = Sha3_256::digest(hash_arg.as_slice());

let mut hasher = Sha3_256::default();
// TODO: hasher.update("OBJECT_ID_DERIVE::");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is strangely positioned, it refers to the TODO in #58.

Comment on lines 320 to 321
pub trait Signable<HashedMessageWriter> {
fn write(&self, to_be_hashed_message: &mut HashedMessageWriter);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. renaming:
    I don't quite understand how the HashedMessageWriter is a better name than Hasher or how to_be_hashed is a more intuitive variable name. The way I understand it is the intent is to instantiate this Hasher type with something that extends std::io::Write, whether that's an actual serializer of an instance of digest::Digest. I admit that it's arcane to see how hashers defined there are std::io::Write but they are.
    Am I missing something?

  2. extension:
    It makes a great deal of sense to me to reuse this for a Hashable trait, returning the digest instead of the pub fn digest + manual call to sha3_hashon the two impls above. This could, later, obviate the difficulties in [fastx crypto] audit Object ID derivation #58 with or without BCS.

The trait could look like

pub trait Hashable {

    type Hasher: digest::Digest;
    // here you tell how to give  correct fields to a writer for hashing, which may or may not equal how you serialize
    fn write_into<W: ?Sized + std::io::Write,>(&self, &mut W) -> &mut W;
    
    fn digest(&self) -> Self::Hasher::Output // This output concrete type is a GenericArray<u8, N> for some N, there's a way to constrain it to [u8; 32]
    {
        let mut hasher = Hasher::new();
        write_into(self, hasher).finalize() // add a few constraints and an `.into()` for a [u8;32]
    }
}

The Q / ask is: do we want to tackle this next?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On 1. Hasher and write were poor names, because they clash with the std::hash::Hasher and std::io::Write which are related enough to be confused with, but not actually the traits used (we redefine them). Hence why I went for very different names.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the names, to W for the generic writer, since this is a Writer indeed. I will leave part (2) of your comment for a separate PR, as I do not entirely understand it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll show in a PR for #58

@gdanezis
Copy link
Collaborator Author

gdanezis commented Jan 6, 2022

Nice and necessary! This looks good, but I have a few Qs.

Is there any particular reason why we are using SHA3? It has poor performance1, but for now I'd just like to be aware if anything led to that choice?

Footnotes

  1. it has ~81% of the throughput of SHA2-256, 44% the throughput of Blake2s, and 6% of that of BLAKE3. I'm tempted to switch to BLAKE3 as soon as we have domain separation thrown in (and not just to see @kchalkias 's face when we do). leftwards_arrow_with_hook

@sblackshear used it first, so I keep using it. It is secure so I have no concerns right now. We can change it down the line if we pick something else, but all the contract addresses (constants right now) will also change.

Copy link
Contributor

@huitseeker huitseeker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM! Thanks!

@gdanezis gdanezis merged commit 86c3b48 into main Jan 6, 2022
@gdanezis gdanezis deleted the add-digests branch January 6, 2022 13:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants