CIP-0100 | Add test vector file #782

Merged
merged 3 commits into from
Apr 21, 2024

Ryun1
Collaborator

@Ryun1 Ryun1 commented Mar 13, 2024

Add test vector file and re-make provided example

@Ryun1 Ryun1 added the Correction Fixing minor issue or typo label Mar 13, 2024
@Ryun1 Ryun1 changed the title CIP-100 | Re-canonicalize examples CIP-100 | Re-canonicalize example Mar 13, 2024
@Quantumplation
Contributor

So, according to the spec, I believe my original values to be correct, but incomplete.

There are two different hashes to compute:

  • the hash of the body field that gets signed by the participants
  • the hash of the whole document, including the signatures and the hash algorithm, that gets published in the anchor field.

The example provided in the document and via the example files is the former, and my document is missing examples for the latter.

The chain of reasoning here is:

  • the signatories can't sign the hash of the whole document, because of chicken-and-egg: we don't have the signatures needed to produce the whole document to produce the final hash
  • it's not particularly important for the authors to sign off on which authors are going to sign the document (i.e. in order for me to sign, I don't need to know that you are also going to sign); similarly, it's not super important that, as an author, I sign off on what hash algorithm is going to be used to publish it on-chain.
  • So, rather than trying to cherry pick which fields to exclude (i.e. the signatures), we instead define a single field to include, i.e. the body
  • It is also not sufficient to publish only the hash of the body on-chain, because it is important that the on-chain commitment capture that, at the time of submission, the document had all of the valid signatures, and that they weren't added after the fact.

So I think the correct thing to do here is to either re-evaluate this decision, or make this distinction clearer and add the test vectors / example hash for the canonicalization of the full document, alongside the hash of the body field.
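To make the distinction concrete, here is a minimal Python sketch of the two hashes. It is illustrative only: `serialize` is a placeholder built on sorted-key JSON, not the canonicalization defined by the spec, and the document layout (the `authors` entries and their `signature` field) is a hypothetical stand-in. What it shows is that a digest over `body` alone stays stable as signatures are added, while a digest over the whole document changes.

```python
import hashlib
import json


def blake2b_256(data: bytes) -> str:
    """Return the blake2b-256 digest of `data` as a hex string."""
    return hashlib.blake2b(data, digest_size=32).hexdigest()


def serialize(obj) -> bytes:
    # Placeholder serialization (sorted-key JSON plus a trailing newline).
    # This is NOT the canonicalization from the spec; it only gives us
    # deterministic bytes to hash for the purpose of the illustration.
    return (json.dumps(obj, sort_keys=True) + "\n").encode("utf-8")


def body_hash(document: dict) -> str:
    """Hash of the body field only: what the authors sign."""
    return blake2b_256(serialize(document["body"]))


def document_hash(document: dict) -> str:
    """Hash of the whole document: what would be published in the anchor."""
    return blake2b_256(serialize(document))


# Hypothetical document layout, for illustration only.
unsigned = {
    "hashAlgorithm": "blake2b-256",
    "authors": [],
    "body": {"comment": "This is a test vector for CIP-100"},
}
signed = {**unsigned, "authors": [{"name": "example author", "signature": "abcd"}]}

print("body hash unchanged:", body_hash(unsigned) == body_hash(signed))
print("document hash changed:", document_hash(unsigned) != document_hash(signed))
```

Because the body hash does not depend on the signatures, authors can sign it before the final document exists, which sidesteps the chicken-and-egg problem described above.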

@Quantumplation
Contributor

In particular, this section describes the process used:

Canonicalize the whole document according to this specification.

  • Identify the node-ID of the body node
  • Filter the canonicalized document to include the body node, and all its descendants
  • Ensure the file ends in a newline
  • Hash the resulting file with blake2b-256

The tool you used is indeed the one I used as well, and I believe it follows the canonicalization spec.
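The filtering and hashing steps quoted above can be sketched in Python roughly as follows. This is a hedged illustration, not the reference implementation: it assumes the input is already canonical N-Quads text with one statement per line, that blank-node labels look like `_:c14n2`, and that the statement linking the root to the body is kept in the output. The function names `body_subset` and `hash_body` are hypothetical.

```python
import hashlib

# Predicate that identifies the body node in the canonical N-Quads.
BODY_PREDICATE = (
    "<https://github.com/cardano-foundation/CIPs/blob/master/"
    "CIP-0100/README.md#body>"
)


def body_subset(canonical_nquads: str) -> str:
    """Keep the body statement plus every statement reachable from the body node."""
    statements = [ln for ln in canonical_nquads.splitlines() if ln.strip()]
    # Each statement is "subject predicate rest-of-line".
    parsed = [ln.split(" ", 2) for ln in statements]

    links = [rest for _, pred, rest in parsed if pred == BODY_PREDICATE]
    if len(links) != 1:
        raise ValueError("expected exactly one body statement")
    body_id = links[0].split(" ", 1)[0]  # node-ID of the body node

    # Walk blank-node objects to collect the body node and its descendants.
    reachable = {body_id}
    frontier = [body_id]
    while frontier:
        node = frontier.pop()
        for subj, _, rest in parsed:
            obj = rest.split(" ", 1)[0]
            if subj == node and obj.startswith("_:") and obj not in reachable:
                reachable.add(obj)
                frontier.append(obj)

    kept = [
        ln
        for ln, (subj, pred, _) in zip(statements, parsed)
        if subj in reachable or pred == BODY_PREDICATE
    ]
    # Joining with "\n" and appending one more ensures a trailing newline.
    return "\n".join(kept) + "\n"


def hash_body(canonical_nquads: str) -> str:
    """blake2b-256 of the filtered body subset, as a hex string."""
    data = body_subset(canonical_nquads).encode("utf-8")
    return hashlib.blake2b(data, digest_size=32).hexdigest()
```

Note that the node identifiers are taken from the canonical form of the whole document and are not renumbered after filtering.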

@Ryun1
Collaborator Author

Ryun1 commented Mar 13, 2024

Thank you for the explanation @Quantumplation

make this distinction clearer and add the test vectors / example hash for the canonicalization of the full document, alongside the hash of the body field.

I will do this within this PR.
I have made a similar effort for CIP-108 already here.

@kderme
Contributor

kderme commented Mar 13, 2024

I don't fully understand how the example.canonical is generated. It seems to have fewer fields than the canonical form and more than the canonical body. Is it properly defined, and if not, should we delete the file?

@Quantumplation
Contributor

I don't fully understand how the example.canonical is generated. It seems to have fewer fields than the canonical form and more than the canonical body. Is it properly defined, and if not, should we delete the file?

It is the canonicalized version that is used for signing, see the comment above and in particular the highlighted section from the spec. As far as I can tell, it is correctly generated.

@kderme
Contributor

kderme commented Mar 13, 2024

the highlighted section from the spec

I thought this generates example.body.canonical which differs from example.canonical.
The author name is not part of the body, so it shouldn't be used for signing, right?

@Quantumplation
Contributor

Ah ok, I guess I did include both examples, my bad. I'm trying to figure out where example.canonical became incorrect, because I for sure wouldn't have stripped it out by hand or anything...

Regardless, this PR makes a bit more sense. Let me try to reproduce independently to confirm.

@Quantumplation
Contributor

Damn, yea, it looks like it was inaccurate as of my commit here:

93999e0#diff-57ee612f4824a274191cfea80c0ae926f7e128c42a2b6d8ba5187065d63f28b9

Not sure what happened, because I remember being very meticulous about generating it exactly because it would be used as a reference 🤔

@Ryun1
Collaborator Author

Ryun1 commented Mar 13, 2024

@Quantumplation @kderme

I have added a test vector file with an explanation of how to recreate it, as well as the missing intermediate file, via a3c46f8.
Please let me know what you think.

When I have time I think I will redo the example with a proper signature as the abcd is getting to me 😆.

@Ryun1 Ryun1 changed the title CIP-100 | Re-canonicalize example CIP-100 | Add test vector file Mar 13, 2024
@Quantumplation
Contributor

When I have time I think I will redo the example with a proper signature as the abcd is getting to me 😆.

If you'd like, I can do that this weekend with my actual keys, since I'm the one listed in the authors :)

@Ryun1
Collaborator Author

Ryun1 commented Mar 14, 2024

If you'd like, I can do that this weekend with my actual keys, since I'm the one listed in the authors :)

That would be great, thank you

@kderme
Contributor

kderme commented Mar 14, 2024

Possibly there is a bigger issue of circular dependencies when trying to create signatures. I've opened a different issue #783

@rphair rphair changed the title CIP-100 | Add test vector file CIP-0100 | Add test vector file Mar 14, 2024
@rphair rphair added Update Adds content or significantly reworks an existing proposal and removed Correction Fixing minor issue or typo labels Mar 14, 2024
@rphair
Collaborator

rphair commented Mar 14, 2024

@Ryun1 @Quantumplation @kderme I've classified this as an Update rather than just a Correction because of the significant content that is being added as well as fixing whatever inaccuracies may have existed originally.

@Crypto2099
Collaborator

Crypto2099 commented Apr 1, 2024

Just a quick note here for @Ryun1 and @Quantumplation: I noticed that the canonical file is not using "permalinks" that reference a specific pull request, which means that the canonicalization may fail and break hashing if the context of a field ever changes between generation, publication, and later validation...

@Quantumplation
Contributor

@Crypto2099 the URLs in the context should never change. The URLs are more unique identifiers than anything, and aren't used by computers to pull in any data. They're only meant to uniquely identify and disambiguate between fields. They're URIs for a human's sake, as a convenience, so they can go read about what those fields mean if they want. Even if GitHub disappeared, and the spec was hosted elsewhere, the URLs should stay the same so that computer consumers continue to know the fields mean the same thing.

Thus, it's important to have a stable URL that points at the "latest" version of the spec, in case it's updated with any clarifications / corrections, but also for the spec not to change in substance. Any substantial changes to the meaning of these fields should be a new CIP, which would result in new URLs, to disambiguate.

@Quantumplation
Contributor

Quantumplation commented Apr 2, 2024

@Ryun1 I think you should remove example.body.json; this could be confusing, as it implies you create the canonical format through an intermediate json document, but that's not correct (at least according to the current spec). The path is json -> canonical form -> body subset rather than json -> body subset -> canonical form.

For example, it changes all the node identifiers when you do it as you've described in the test vectors. Here's what I get when I canonicalize via the process described in the CIP:

_:c14n2 <https://github.com/cardano-foundation/CIPs/blob/master/CIP-0100/README.md#comment> "This is a test vector for CIP-100"@en-us .
_:c14n2 <https://github.com/cardano-foundation/CIPs/blob/master/CIP-0100/README.md#externalUpdates> _:c14n4 .
_:c14n2 <https://github.com/cardano-foundation/CIPs/blob/master/CIP-0100/README.md#references> _:c14n3 .
_:c14n3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://github.com/cardano-foundation/CIPs/blob/master/CIP-0100/README.md#OtherReference> .
_:c14n3 <https://github.com/cardano-foundation/CIPs/blob/master/CIP-0100/README.md#reference-label> "CIP-100"@en-us .
_:c14n3 <https://github.com/cardano-foundation/CIPs/blob/master/CIP-0100/README.md#reference-uri> "https://github.com/cardano-foundation/CIPs/blob/master/CIP-0100/README.md"@en-us .
_:c14n4 <https://github.com/cardano-foundation/CIPs/blob/master/CIP-0100/README.md#update-title> "Blog"@en-us .
_:c14n4 <https://github.com/cardano-foundation/CIPs/blob/master/CIP-0100/README.md#update-uri> "https://314pool.com"@en-us .
_:c14n5 <https://github.com/cardano-foundation/CIPs/blob/master/CIP-0100/README.md#body> _:c14n2 .

Now, I'm open to changing it to the scheme you described, but it is a change to the spec, so that's probably worth a wider discussion.

@Quantumplation
Contributor

In fact, looking into this more, it looks like the canonicalization algorithm is particularly bad for this purpose, in that it depends on the actual content stored in the fields, not just the fields themselves. This is frustrating, as most canonicalization algorithms are content agnostic (see, for example, the CBOR canonicalization, which just specifies map keys are in alphabetical order, use of definite vs indefinite containers, etc.)

So we will need to revise the signing process regardless.

I would suggest something like the following then:

  • Start from the json document that everyone wishes to sign
  • Remove from this document any top-level field that is not @context or body
  • NOTE: Extensions to the spec MUST emphasize that anything outside of body is NOT covered by the authors' signatures. Alternatively, extensions to the spec MUST NOT add fields outside of body.
  • Compute the canonical form of this document
  • Hash it
  • Sign it
  • Insert the signatures in the appropriate places in the original document

This is equivalent to what is described on this branch, which to me is a bit confusingly worded; i.e. it talks about "adding in" the body, and makes reference to cip-0100.common.jsonld... but other documents may have a different context, that's the whole point.
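The suggested process could be sketched in Python along these lines. This is only a rough illustration under stated assumptions: `canonicalize` and `sign` are caller-supplied callables (hypothetical parameters, standing in for a real JSON-LD canonicalization tool and a real signing key), blake2b-256 is assumed as the hash, and the final step of inserting signatures into the original document is left out because its exact location is defined by the spec rather than by this sketch.

```python
import copy
import hashlib
from typing import Callable

# Only these top-level fields are covered by the authors' signatures.
SIGNED_FIELDS = ("@context", "body")


def signable_subset(document: dict) -> dict:
    """Drop every top-level field that is not @context or body."""
    return {
        key: copy.deepcopy(value)
        for key, value in document.items()
        if key in SIGNED_FIELDS
    }


def digest_to_sign(document: dict, canonicalize: Callable[[dict], str]) -> bytes:
    """Canonicalize the reduced document, then hash it with blake2b-256.

    `canonicalize` is an assumed callable that returns the canonical form
    of a JSON-LD document as text; no particular library is implied.
    """
    canonical = canonicalize(signable_subset(document))
    return hashlib.blake2b(canonical.encode("utf-8"), digest_size=32).digest()


def sign_document(
    document: dict,
    canonicalize: Callable[[dict], str],
    sign: Callable[[bytes], bytes],
) -> bytes:
    """Return the signature over the digest; `sign` is an assumed callable."""
    return sign(digest_to_sign(document, canonicalize))
```

Since the reduced document never contains the authors or the hash algorithm, adding or changing those fields later does not alter the digest that was signed.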

@Ryun1
Collaborator Author

Ryun1 commented Apr 3, 2024

it looks like the canonicalization algorithm is particularly bad for this purpose

So we will need to revise the signing process regardless.

ah this is unfortunate

it talks about "adding in" the body, and makes reference to cip-0100.common.jsonld

I will amend

@Ryun1
Collaborator Author

Ryun1 commented Apr 3, 2024

@Quantumplation

I have refactored, and fixed wording as pointed out in #782 comment.

Changes

  • Emphasised that the intermediate files are not necessary
  • Renamed from using .canonical to the correct .nq
  • Added in keys to produce an author witness
  • Improved wording

I think this is everything that is needed.

@rphair rphair added the Category: Metadata Proposals belonging to the 'Metadata' category. label Apr 4, 2024
Collaborator

@rphair rphair left a comment

As far as I can tell (and trusting @Ryun1 the rest of the way), the last round of feedback has been incorporated. That said, @Quantumplation, I would personally prefer to wait on merging this until you can also confirm.

@rphair rphair requested a review from Crypto2099 April 4, 2024 19:37
@Ryun1
Collaborator Author

Ryun1 commented Apr 12, 2024

@Quantumplation

We will put this on the CIP editors agenda for Tuesday 18th.

cc @rphair

@rphair rphair added the State: Last Check Review favourable with disputes resolved; staged for merging. label Apr 16, 2024
Collaborator

@Crypto2099 Crypto2099 left a comment

Everything looks in order, good to go

@rphair rphair removed the State: Last Check Review favourable with disputes resolved; staged for merging. label Apr 20, 2024
@Ryun1 Ryun1 merged commit 85cab04 into cardano-foundation:master Apr 21, 2024
Labels
Category: Metadata Proposals belonging to the 'Metadata' category. Update Adds content or significantly reworks an existing proposal
5 participants