CIP 100 | Provide directions on how to create signatures for the body without circular dependencies. #783
Comments
Agreed this isn't clear; I will improve it in my PR #782. But I don't think there is a dependency; |
The canonicalization algorithm depends in some way on sorting the generated N-Quads and also on how many of them there are. Without knowing at least the context of the whole text, it's not possible to use placeholders, because it's impossible to know which keys will sort before or after the body. I must say my understanding of the algorithm is too limited to reach safe conclusions. |
You know all of the keys ahead of time, so you should be able to generate all of the N-Quads. You just don't know the values for those keys. |
So each author is expected to know the total number of authors and other information to predict the correct node-id of the body? It's a bit counter-intuitive that data that live outside the body actually affect the normal form of the body and as a result the signature. |
If you have an alternative suggestion, this is one of the things we struggled with, there's not really a good solution; just taking the |
Sorry for the late response. My suggestion would be to sign the body content as is, or to follow the much simpler JSON canonicalisation algorithm. The JSON-LD canonicalisation algorithm has no protection against collision attacks, so even after fixing the circular logic problem, it's always possible for 2 different JSON-LD files to have the same canonical form. For example, all 3 texts below have the same canonical form, but they're clearly not what the initial signers wanted (note the injected comments). This calls into question the whole mechanism of signature and anchor hash validation and breaks the "humanly readable" property. |
IMO JSON-LD could be an optional extension for metadata creators that want to follow the RDF/semantic web, or for tools that want to do RDF-style queries. Its usage should not be forced, especially for the validation algorithm. |
Usage isn't forced; there's no way the ledger can enforce that. This is a metadata standard, but actual users are free to do whatever they want. |
Right the ledger cannot enforce it, but the CIP-100 specification requires it. |
Yes; to be compliant with the CIP you have to comply with the CIP. But nothing requires that people be compliant with the CIP. It just makes it easier for tooling developers to index the metadata. If the CIP was just "any json document", it wouldn't provide any value to those building tools. On the other end of the spectrum, if it was incredibly prescriptive, it'd lead to lots of churn and argumentation over what should go into the CIP. So it tries to strike a balance for an incredibly nascent and exploratory ecosystem. |
I agree with most parts of cip-100, except the validation algorithms. CIP-1694 defines
CIP-100, instead of a well-established hash function with good properties, uses the composition of RDF canonicalisation plus blake2b-256. Not only is this not collision resistant, it also allows malicious content to be constructed with the same hash, as shown at #783 (comment). Imho using a simpler hash directly, like blake2b-256, would increase its adoption. |
@kderme I'm totally fine to change the standard; I think the idea behind the canonicalization was to make it easier to validate; for example, some languages don't preserve the order of keys, newlines, etc. The whole canonicalization debate on the ledger cbor, where you have to carry around the original bytes, has been a nightmare, so we were trying to avoid that. And since this metadata isn't driving any logic (i.e. it's ultimately just for human consumption and display, not powering any ledger decisions etc.) the collision resistance wasn't deemed to be a big deal at the time. That being said, perhaps it is simpler to just say "the raw bytes you receive over the wire are what get hashed, end of story". I'd be totally fine with that change. @Ryun1 @scarmuega @KtorZ Any particular thoughts? |
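To make the "raw bytes are what get hashed" option concrete, here is a minimal sketch using blake2b-256 from Python's standard library. The function name `anchor_hash` and the two sample documents are hypothetical and only for illustration; this is not the CIP's reference implementation.

```python
import hashlib


def anchor_hash(raw: bytes) -> str:
    """Blake2b-256 of the bytes exactly as received: no parsing, no canonicalisation."""
    return hashlib.blake2b(raw, digest_size=32).hexdigest()


# Hypothetical payloads: the same JSON value serialised with a different key order.
doc_a = b'{"body": {"title": "x"}, "authors": []}'
doc_b = b'{"authors": [], "body": {"title": "x"}}'

print(anchor_hash(doc_a))
# The hashes differ, because only the bytes matter, not the parsed structure.
print(anchor_hash(doc_a) == anchor_hash(doc_b))
```

The trade-off is the one described above: a tool must keep the original bytes around, since re-serialising the parsed document can change the hash.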
Also, that's interesting; CIP-1694 shouldn't be defining the content type of the metadata (i.e. it should just say a URL to the metadata payload). Since it can't enforce anything anyway, and we want to leave it flexible to other formats. |
I agree with that.
This has the extra benefit that tools can validate and serve the data without having to parse them.
Since with the above design parsing is not necessary for the full metadata, this won't be an issue. However, for the "body" part, which needs to be parsed and hashed separately, it can be an issue, so a canonicalisation algorithm may be a good idea. Imo https://www.rfc-editor.org/rfc/rfc8785 is a better and more established candidate. |
I'd be happy to prepare a pr for CIP-100 with the above. |
@kderme would #835 close this issue? To help this along I'm including it on the CIP agenda tonight (https://hackmd.io/@cip-editors/91), since we're pressed for a decision on that PR and should have related stakeholders present. |
p.s. to #783 (comment) - @kderme now that #835 has been merged, if the remaining issue(s) look any different than you originally described in your OP then please update here so we can stay properly focused. cc @Crypto2099 @disassembler |
IMO a remaining piece for this ticket is replacing URDNA2015 with a simpler canonicalisation algorithm, such as https://www.rfc-editor.org/rfc/rfc8785 or https://wiki.laptop.org/go/Canonical_JSON, for the canonicalisation of the body, which is signed by the authors (this doesn't apply to CIP-119, only to CIP-108 and possibly future extensions of CIP-100). For the body signing a canonicalisation algorithm is necessary; however, I firmly believe that rfc8785 or Canonical_JSON are much better and simpler candidates. In practice they canonicalise a JSON structure by removing whitespace and ordering the keys. They suffer much less from collisions. Speaking as the maintainer of db-sync, we have a full implementation for these CIPs except for the signature validation, and it would be a big pain to support URDNA2015. We probably won't do any signature validation if it remains as is. I'd be happy to open a PR with these changes in the next few days. |
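As a rough illustration of "removing whitespace and ordering the keys", the sketch below canonicalises only the body and hashes it with blake2b-256. It is a simplification and not a conforming RFC 8785 implementation (number serialisation and the exact key-ordering rules of the RFC are not handled). The helper names and the sample field values are assumptions made for the example.

```python
import hashlib
import json


def canonical_json(value) -> bytes:
    # Simplified canonical form: sorted keys, no insignificant whitespace, UTF-8.
    # Assumption: this approximates RFC 8785 for plain string/object/array data only.
    text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def body_hash(document: dict) -> str:
    # Only the "body" value is canonicalised and hashed; nothing outside it is read.
    return hashlib.blake2b(canonical_json(document["body"]), digest_size=32).hexdigest()


# Hypothetical documents: same body, different key order, with and without authors.
unsigned = {"authors": [], "body": {"title": "Example", "abstract": "Text"}}
signed = {
    "body": {"abstract": "Text", "title": "Example"},
    "authors": [{"name": "hypothetical author", "witness": {"signature": "placeholder"}}],
}

print(canonical_json(unsigned["body"]))
print(body_hash(unsigned) == body_hash(signed))
```

Because the hash input is derived from the body alone, adding or changing author signatures does not alter what the authors signed, which sidesteps the circular dependency raised in this issue.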
I am linking this post here from the Intersect Discord Server, because we stumbled upon it while testing Koios/SPO-Scripts: cardano-signer 1.17.0 now also supports canonicalised hashing of the @context+body content, for further signing by the document authors. |
Short update: cardano-signer 1.19.0 can also directly sign JSON-LD governance metadata files. |
It seems that many tools already support the author canonicalisation as described in CIP-100, and given the improvements made to it, I think this can be closed. |
CIP 100 provides an algorithm to find and hash the body of governance metadata
However, this assumes that in order to hash the body (last step), you first need to canonicalize the whole document (first step). This creates a circular dependency, since the whole document already contains the signatures.