Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First Version #6

Merged
merged 81 commits into from
Apr 6, 2023
Merged

First Version #6

merged 81 commits into from
Apr 6, 2023

Conversation

expede
Copy link
Collaborator

@expede expede commented Dec 1, 2022

@cla-bot cla-bot bot added the cla-signed label Dec 1, 2022
README.md Show resolved Hide resolved
@expede expede marked this pull request as ready for review December 2, 2022 07:08
@expede expede requested a review from oed as a code owner December 2, 2022 07:08
@expede
Copy link
Collaborator Author

expede commented Dec 2, 2022

This went through way more iterations than I expected going in. I should have known better: cryptography is super subtle!

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
@oed
Copy link
Collaborator

oed commented Dec 2, 2022

Let me try to throw a quick few examples on this:

SIWE

Let's start by assuming we have landed on an IPLD format (e.g. CACAO / ucan-ipld) for SIWE.

Signature params:

  • keyPrefix - this would be a secp256k1 key, 0xe7
  • hashPrefix - this needs to be keccak256, 0x1b
  • digestLength - 256 bits, 32 bytes, ?? double check this
  • contentEnc - we would need to register a new eip4361-eip191 codec

The eip4361-eip191 codec takes an IPLD object and tries to convert it to a SIWE string prefixed using eip191.

DagJWS

OK, so there is already infrastructure built to produce DagJWS compliant signatures. Here's how we could make them compatible with Varsig

Signature params:

Plain JWS

Maybe this is easier?

Signature params:

Note that the protected header is automatically generated. This means that we can't sign a JWT with this alg because it needs to include a "typ": "JWT" field in the header.

README.md Outdated
Comment on lines 309 to 310
es256k-varsig-header = 0xe7 ; secp256k1 multicodec prefix
sig-bytes = 64(OCTET)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need to specify the hash alg here?
For secp256k1 specifically, JOSE uses sha256, while ethereum wallets produce signatures with keccak256.

The same might be needed for other ECDSA algs as well?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other ECDSAs do make sense; we can add the hash parameter here. Do Ethereum wallets use e.g. EIP-191, or do they just sign the data directly?

Which opens the question about how we want to express things like EIP-191. Here's two options:

Per the 191 spec, I think that we need to support all 3 varints under EIP-191... otherwise we can break them out into their own codes.

eip-191-varsig-header = 0xd191 ; EIP-191
eip-191-version = unsigned-varint
sig-bytes = 64(OCTET)

We could also nest the Ethereum message prefix from the signature type:

eip-191-signature = eip-191-header nested-varsig
eip-191-varsig-header = 0xd191 ; EIP-191
eip-191-version = unsigned-varint
nested-varsig = varsig

Thoughts?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with the first one, but can change it if needed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also updated it to 0xe191. In multicodec we had used 0xd191 we wanted to namespace all signatures into 0xd_.

Copy link
Collaborator

@oed oed Mar 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually raises a different question for me. If I have an IPLD struct, what do I hash?

Suppose I encounter an object with the following schema,

type Data struct {
  field1 Bytes
  field2 String
  sig Varsig
}

and the varsig is varint(0x34) varint(0xed) sig-bytes, e.g. an ED25519 signature, I know I should use sha256, but over what?

Copy link
Collaborator Author

@expede expede Mar 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a call with @Gozala (about some other stuff) and we chatted about the above a bit.

Irakli's Preferred Version

Basically a linear array of nested codecs

;; Irakli's Preference
varsig = multibase-prefix %x34 varsig-header varsig-body

;; Separate spec
content-info = 0xFF encoding-header encoding-body

Here, the signature bytes would be nested inside the content-encoding payload. For example:

rsa-varsig = rsa-varsig-header rsa-hash-algorithm signature-byte-length content-info

;; For example, using brackets to show grouping
0x34 (0x1205 0x12 0x0100 (0xFF 0x71 signature-bytes))
     ^                   ^
     |                   |
     |          nested-content-encoding    
     |
  rsa-sig

The reason for this style is that you can endlessly nest these, and each segment acts as its own multiformat with its own number of segments. The downside is that if you want to update a segmant, you need to change some deeply nested structure.

Wrapped

Here's basically the opposite version which works by adding info as needed at the top level (which could lead to a lot of varsig types but keeps the "core" signature info together with the bytes)

0x34 (0xFF 0x71 (0x1205 0x12 0x0100 signature-bytes))
     ^          ^
     |          |
     |       rsa-sig
     |
 wrapped-content-encoding

Tree Structured

Arguably, we know that at minimum we want the encoding, so including at the top level rather than pushing it into every variation:

varsig = multibase-prefix %x34 encoding-info varsig-header varsig-body sig-bytes
0x34 (0x71) (0x1205 0x12 0x0100) (signature-bytes)
     ^      ^                    ^
     |      |                    |
     |   rsa-sig            raw-rsa-signature
     |
 wrapped-content-encoding

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My argument had been that sig-bytes should use their own multiformat to describe how to derive bytes from the source data. If not specified it is raw bytes.

How do I go from the IPLD object to raw bytes if it's not specified? I think storing the encoding inside of the sig-bytes seems fine, but the examples in the spec needs to demonstrate this if so.

@expede Thanks for the examples. I'm kind of fine either way as long as we specify the encoding.

I would also like to know if you see any specific use case that doesn't require encoding @Gozala ?

Finally a question around the content encoding itself. Assume we have the following object IPLD object:

type Foo struct {
  bar String
  sig Varsig
}

My assumption would be that we remove the sig field (or whatever field contained the varisg, then do the encoding? Is this other peoples assumption as well, or is there a different approach?

Copy link
Collaborator Author

@expede expede Mar 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this other peoples assumption as well, or is there a different approach?

This is also my assumpition, and makes my life easier in a few places (though it's not something that's a hard requirement). It's not fully clear how you would sign over just the bar string (but not the key) — I would assume that this is by using the 0x55 raw data encoding, or introduce a "forget about the keys just the values stringjoined" encoding prefix. Alternately pushing this decision into the schema layer frees up this choice (like the extended IPLD subtyping example earlier in this thread).

If this default behaviour sounds preferable to you two, I'll add a section about this to the spec.

Copy link
Collaborator Author

@expede expede Mar 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@expede Thanks for the examples. I'm kind of fine either way as long as we specify the encoding.

Okay let's go with the one I labelled "Irakli's preference" then, since neither of us have super string feelings against 👍

...updating...

Copy link
Collaborator Author

@expede expede Mar 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To keep things moving, I went ahead and updated the spec with encoding-info. It was very awkawrd to expresss when encoding-info included the signature bytes, so I broke out the signature into its own segment — it seems to have worked out 👍

I embedded the encoding table right in this spec; we can always break this out at a later date. I also changed EIP-191 to be a secp256k1 signature over data encoded as EIP-191. @oed it's been a while since I worked with Ethereum signatures, can you confirm that I haven't missed any nuances and that this appraoch makes sense to you?

README.md Outdated
Comment on lines 247 to 261
encoding-info
= %x55 ; Raw bytes multicodec prefix
/ %x70 ; DAG-PB multicodec prefix
/ %x71 ; DAG-CBOR multicodec prefix
/ %x0129 ; DAG-JSON multicodec prefix
/ %x6A77 ; JWT
/ %xE191 eip-191-body encoding-info ; EIP-191

eip-191-body
= %x00 eth-validator-address
/ %x01
/ %x45 message-byte-length ; versions & payloads

eth-validator-address = 20(OCTET)
message-byte-length = unsigned-varint
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should probably point you directly at it @oed 😉

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should include the ethereum address that made the signature here. Will be redundant in many cases. The only version of eip191 used in practice is 0x45 so I would suggest that's what we use.

I think the encoding info for a eip191 msg should look like this:
encoding-info = varint(0xe191) varint(codec)
where codec could be dag-cbor, dag-json, or something custom, etc.

Copy link
Collaborator Author

@expede expede Apr 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only version of eip191 used in practice is 0x45 so I would suggest that's what we use.

Any objection to keeping the version in there, just dropping the other two?

encoding-info
  = %x70   ; DAG-PB multicodec prefix
  ;; ...
  / %xE191 %x45 message-byte-length encoding-info ; EIP-191 version 0x45
;           ^
;           |
;      Keeping this

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, you're absolutley right that we probably won't see 0x00, and 0x01 is also known as EIP-712, so it could have its own row. Okay that makes sense, I'll remove the %x45 👍

@expede expede mentioned this pull request Mar 30, 2023
Copy link
Collaborator

@oed oed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I think this seems great! Added some nitpicks.

README.md Outdated

``` abnf
encoding-info
= %x55 ; Raw bytes multicodec prefix
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raw doens't make sense to me here because afaik there's no way to go from an IPLD object to Raw. By definition raw doesn't have any encode/decode functions.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right that it doesn't have an encode/decode function. I would assume that we should have some ability to sign over a bare sequence of bytes outside of an object, no?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess e.g. DAG-CBOR does have raw bytes, so that may be sufficient. Okay, removing!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually scratch that. Even CBOR has a length header on binary. This inclusion makes it impossible express signatrues that are not diretcly in IPLD and need to interoperate with other systems.

How would you feel if 0x55 was a special case for when there was a single sibling, which unwraps it and passes it as raw bytes to the signature?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would be useful, but I'm not sure RAW is the right way to express this? I don't feel super strongly though.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay sure, we can pick another tag. How about 0x5F? It's _ in ASCII, and there's no namespace collision the multicodec table.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure!

README.md Outdated
/ %x71 ; DAG-CBOR multicodec prefix
/ %x0129 ; DAG-JSON multicodec prefix
/ %x6A77 ; JWT
/ %xE191 eip-191-body encoding-info ; EIP-191
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, eip191 only works over a string. It would need to be paired with an IPLD codec, e.g. dag-cbor, to get the byte string to sign over first. For CACAO specifically we would want to define a new IPLD codec that takes a CACAO object and outputs an ascii SIWE string.

Copy link
Collaborator Author

@expede expede Apr 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was my intention with the recursive encoding-info. For example:

;    Ed25519     256-bytes
;      |            |       
;      v            v              
%x34 (%xed (%xE191 %x0100 %x71) <sig-bytes>)
;            ^             ^
;            |             |
;         EIP-191       DAG-CBOR

...but if down the road there another encoding-info that has more fields that is nestable under EIP-191, then it could also be multi-segment.

Does that make sense?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only works over a string

Also to clarify: you mean a bytestring, not an ASCII/UTF8 string, correct?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Technically the personal_sign eth-rpc method accepts both utf8 and hex encoded byte strings. But ofc it's encoded as a bytestring before it gets prepended and hashed.

README.md Outdated
Comment on lines 247 to 261
encoding-info
= %x55 ; Raw bytes multicodec prefix
/ %x70 ; DAG-PB multicodec prefix
/ %x71 ; DAG-CBOR multicodec prefix
/ %x0129 ; DAG-JSON multicodec prefix
/ %x6A77 ; JWT
/ %xE191 eip-191-body encoding-info ; EIP-191

eip-191-body
= %x00 eth-validator-address
/ %x01
/ %x45 message-byte-length ; versions & payloads

eth-validator-address = 20(OCTET)
message-byte-length = unsigned-varint
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should include the ethereum address that made the signature here. Will be redundant in many cases. The only version of eip191 used in practice is 0x45 so I would suggest that's what we use.

I think the encoding info for a eip191 msg should look like this:
encoding-info = varint(0xe191) varint(codec)
where codec could be dag-cbor, dag-json, or something custom, etc.

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
/ %x70 ; DAG-PB multicodec prefix
/ %x71 ; DAG-CBOR multicodec prefix
/ %x0129 ; DAG-JSON multicodec prefix
/ %x6A77 ; JWT
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side note: a learning from writing the CACAOv3 spec is that a pure JWT codec wouldn't be enough for UCANs.

Both of these doesn't really make sense for a JWT only codec:

  1. DIDs need to be converted to binary
  2. The v field needs to be set to ucv in the JWT header

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm interesting; can you tell me more? Is that to normalize the fields across CACAO types, or so that you know which fields to hoist from a flat structure into the header, or something else?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so that you know which fields to hoist from a flat structure into the header

This primarily.

As for the DIDs, in UCAN they are strings, for CACAO they need to be multidid encoded.

Copy link
Collaborator Author

@expede expede Apr 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that makes sense. Thinking about this, we're liekly going to need a bunch of these for a bunch of cases. IIRC a CACAO itself is encoding agnostic by design. How would you feel about something like this?

encoding-info
  = %x55
  / %x70
  / %x71
  / %x0129 
  / %x6A77
  / %xCAC0 cacao-encoding ; NEW!

cacao-encoding
  = %x3C ; ReCap
  / %x51 ; SIWE
  / %xCA ; UCAN

Alternately we could put UCAN etc right in the table, but that may cause confusion between (e.g.) CACAO and ucan-ipld

Aside

I also vaguely remember discussing there being a difference between the CACAO version and the version of the token that it maps to, but I don't see that in the latest version of ChainAgnostic/CAIPs#196 — maybe not required, but I thought I'd check.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to add CACAO specific encoding-info to this spec. These can be defined in the CACAOv3 CAIP only I think?

Old versions of CACAO doesn't use varsig, so doesn't really make sense to specify codes for them.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, okay. Let's say for example that you cited where UCAN has ucv in the header. ucv is not a standard JWT field, but we do need to sign over the JWT-serialized format. Would the JWT serializer know that a CACAO v should be serialized as ucv in the header (I mean, we can also move this field to the claims/body, but it is basically header info since it describes the shape of the body).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes exactly, the JWT serializer would need to know this. Moving the field I think could be a good option unless we want to specify a custom UCAN codec.

README.md Outdated

``` abnf
encoding-info
= %x55 ; Raw bytes multicodec prefix
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would be useful, but I'm not sure RAW is the right way to express this? I don't feel super strongly though.

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
Co-authored-by: Joel Thorstensson <[email protected]>
Copy link
Collaborator

@oed oed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm!

Final nitpick but not a blocker.

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
Copy link

@QuinnWilton QuinnWilton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Two nitpicks, but no blockers :)

@bumblefudge
Copy link
Collaborator

Should I override this pending request?
image

@expede expede merged commit 27dbb9f into main Apr 6, 2023
@expede expede deleted the first-draft branch April 6, 2023 16:46
@expede
Copy link
Collaborator Author

expede commented Apr 6, 2023

@bumblefudge evidently I also have the power 💪

We had chatted with Irakli about the final changes in some length, so I'm sure it's fine (in order to keep stuff moving forward). If not, we can cut a 0.2 😉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants