-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
new transactions #178
new transactions #178
Conversation
There is a hacky way use same public key to generate both Cosmos and Ethereum address, not sure about cometbft |
pkg/crypto/ed25519.go
Outdated
package crypto | ||
|
||
import ( | ||
oasisEd25519 "github.com/oasisprotocol/curve25519-voi/primitives/ed25519" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not 'crypto/ed25519'?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to implement the signing algorithm used by CometBFT (they use this under the hood), in order to use validator keys.
We can definitely change the implementation of these, was just what I was toying around with.
pkg/crypto/signature.go
Outdated
type Signature struct { | ||
Signature []byte `json:"signature_bytes"` | ||
Message []byte `json:"message"` | ||
Signature []byte `json:"signature_bytes"` | ||
Sender PublicKey |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this struct a bit awkward, namely the inclusion of both the message and the pubkey itself.
pkg/transactions.SignedMessage
contains this structure and at a glance appears to contain a copy of the message, although in interface form.
How is this Signature
type contsructed and passed in practice? Same for SignedMessage
. Will open this branch in an IDE to see if it's clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess if you were to write out a doc string for this exported type, how would it read in terms of "this combines" and "so that"/"which facilitates" and "should be used by" etc.?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see how the naming and semantics of this are confusing.
The "Message" contained here is actually a hash of the message contained within the transaction. Re-looking at this after stepping away, I agree that the "Message" field could either be renamed hash, or just dropped entirely (since it really is not needed).
The "Signature.Signature" also seems a bit weird too.
I can create a few basic constructors; the creation of them should be pretty easy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looping back on this; I agree that the message should not be contained in the Signature. What about the Sender?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably shouldn't have it here either IMO. For a signature, the norm is to define it as just the signature data by itself (either a byte slice or an r/s bigint pair). Then it has a verify method that accepts a hash/message and a pubkey (otherwise there's a pure function taking all three).
We also need the SignatureType
field (or we define separate concrete types with their own implementations of a Signature interface), but the other stuff is not strictly necessary and more likely to confuse.
You have:
type PublicKey interface {
Bytes() []byte
Type() KeyType
Verify(sign *Signature) error
Address() Address
}
so we can do
func (s *Signature) Verify() error {
return s.Sender.Verify(s)
}
I think you're mocking cometbft here with the PubKey interface, but I'm just not sure it's a great example to follow. Here's what cometbft's does in it's secp256k1.PubKey.VerifyMessage: https://github.com/cometbft/cometbft/blob/68c10d2ad0bb1015a7fb3e1283c50fb2e933fb9a/crypto/secp256k1/secp256k1.go#L190-L216. They literally re-parse the signature and do sig.Verify(...)
. It all feels circular.
It would be great to only have one Verify method we need to use. That is (*Signature).Verify
should just work with the pubkey bytes and the signed data.
The other thing is KeyType
vs SignatureType
. It feels really weird and error prone that these aren't the same. We have two separate secp256k1 signature types, but that's really referring to variations in the DER encoding we're forced to deal with, not the bytes of the r/s components of the successfully-parsed signature, the interpretation of which isn't subject to any variation across implementations. By the time we construct our new Signature
struct instance, we should be shed of the encoding variations, in a canonical format where there's no distinction between signature type and pubkey type.
pkg/types/schema.go
Outdated
|
||
type DropSchema struct { | ||
Owner string | ||
Name string | ||
} | ||
|
||
func (s *DropSchema) Bytes() ([]byte, error) { | ||
return rlp.Encode(s) | ||
} | ||
|
||
func (s *DropSchema) FromBytes(b []byte) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a reasonable pattern for defining a "payload". In my local workspace for the validator module, I have this:
// ExecuteActionPayload is a struct that represents the action execution
type ExecuteActionPayload struct {
Action string `json:"action"`
DBID string `json:"dbid"`
Params []map[string]any `json:"params"`
}
func (e *ExecuteActionPayload) Bytes() ([]byte, error) {
return json.Marshal(e)
}
var _ encoding.BinaryUnmarshaler = (*ExecuteActionPayload)(nil)
var _ encoding.BinaryMarshaler = (*ExecuteActionPayload)(nil)
func (e *ExecuteActionPayload) UnmarshalBinary(b []byte) error {
return json.Unmarshal(b, e)
}
func (e *ExecuteActionPayload) MarshalBinary() ([]byte, error) {
return e.Bytes()
}
In this case I've just chosen the encoding.BinaryUnmarshaler
/encoding.BinaryMarshaler
because it is defined and recognized around the standard library, but both Bytes and FromBytes are common. BTW, the semantics of io.Reader and io.Writer interfaces are totally confusing so I'm delighted you didn't go that way.
EDIT: updated the example code above, pasted wrongish thing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So in that case I've obviously continued to use JSON under the hood, but obviously the point is its a good idea to pick a generic interface and run with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be a stupid question, but are these lines just meant to ensure that it fulfills the interface?:
var _ encoding.BinaryUnmarshaler = (*ExecuteActionPayload)(nil)
var _ encoding.BinaryMarshaler = (*ExecuteActionPayload)(nil)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yah, it's mostly declarative. Expresses to the reader what you're trying to do, and makes a compile error if you mess with the impls.
type Secp256k1Address struct { | ||
address common.Address |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what we do for our account addresses, but the address format is somewhat arbitrary. This might encourage us to forget about how users of other chains might interact with Kwil. I think we've talked about this, but I wanted to make a note.
Ideally it would be all pubkey, with a Kwil-defined address format for other uses like queries and accounts store, but I sense we cannot do that easily, or it completely breaks assumptions about how the users and sdks expect it to work. (They do not identify purely by pubkey, mainly for practical reasons surrounding wallet interactions, if I understand right.).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, not only do they not identify by pubkey, but they actually expect to be identified by their specific address.
So if some user from Ethereum uses a key, then they will expect that the caller on Kwil (which is exposed to the user in databases, via @caller
) is the same as what they have in their Ethereum wallet.
I made the changes we discussed this morning: Cleaned Up SignatureI'll blame the weird signature structure on the lack of sleep. It's actually exactly what it was before (minus the actual signature algorithms). Separated Transactions and Signed MessagesGavin brought up that we should separate these, and I agree. The one thing to note here is that, within No More TypesThe types got moved into the transactions package. This was suggested by @jchappelow. |
pkg/transactions/transaction.go
Outdated
// TransactionBody is the body of a transaction that gets included in the signature | ||
type TransactionBody struct { | ||
// Payload are the raw bytes of the payload data | ||
Payload rlp.SerializedData |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we put dbid out of the payload?
Reasons are:
- easy to identify which db this TX for (easy for explorer?) you don't need to look into the payload to know
From
andTo
- db engine only executes payload in different DB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so, because DBID is Dataset Module specific. The Validator Module should not need to know or care about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the confusion with "From" and "To" (when comparing with Ethereum) is that in Ethereum, a smart contract is an account, just like a wallet. We don't have that same functionality.
9ade0b3
to
3091e50
Compare
I made the changes discussed above. There are still two outstanding issues:
|
// hash of the transaction that is signed. it is kept here as a cache | ||
hash []byte |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had same thought. Just have to be cognizant of concurrent access. Slapping on a mutex might be tempting but it's usually best to first understand the access patterns and document the restrictions on the method (not thread safe), unless absolutely necessary.
In the following comment on the Lines 12 to 15 in 2f99564
to what is it referring? I can do some sleuthing, but where's |
I think you may be seeing an old version. The package that contains the transactions named in this commit is |
83b16a4
to
6612784
Compare
Kudos, SonarCloud Quality Gate passed! 0 Bugs No Coverage information |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can start to work on those changes to slowly integrate our code
* added a new theoretical structure for transactions, signatures, and types * added drop schema payload * made changes discussed this morning * modified repo to reflect new transaction structure * updated CLI to use stubbed out methods. Should now be building * removed old packages
* added a new theoretical structure for transactions, signatures, and types * added drop schema payload * made changes discussed this morning * modified repo to reflect new transaction structure * updated CLI to use stubbed out methods. Should now be building * removed old packages
* added a new theoretical structure for transactions, signatures, and types * added drop schema payload * made changes discussed this morning * modified repo to reflect new transaction structure * updated CLI to use stubbed out methods. Should now be building * removed old packages
* added a new theoretical structure for transactions, signatures, and types * added drop schema payload * made changes discussed this morning * modified repo to reflect new transaction structure * updated CLI to use stubbed out methods. Should now be building * removed old packages
* added a new theoretical structure for transactions, signatures, and types * added drop schema payload * made changes discussed this morning * modified repo to reflect new transaction structure * updated CLI to use stubbed out methods. Should now be building * removed old packages
* added a new theoretical structure for transactions, signatures, and types * added drop schema payload * made changes discussed this morning * modified repo to reflect new transaction structure * updated CLI to use stubbed out methods. Should now be building * removed old packages
This branch contains a basic restructuring of transactions, signatures, serialization, and payloads. I'd love feedback on this.
This contains quite a bit of duplication with existing code. I did not want to break the build, but I tried to segment it so it would make sense.
Transactions
The new transactions structure takes into account the weird situation we ran into with signed message vs transactions (the old transactions structure was not super flexible). This new structure allows us to contain really any sort of message. It also is designed to not be coupled to the message signer (moved to
Signature
).Furthermore, it also removes the needs for a distinctly defined
PayloadType
(as found inpkg/tx
), in favor of type switching. I believe this is better, but am not sure.This can be found in
pkg/transactions
.Signatures
The new signatures are similar to the old ones, except they should be more flexible for supporting a lot of different signatures from different keys. Essentially, the sender, message, signature, and signature type are all contained within the
Signature
. The message would actually just be the hash of the transaction payload. When it comes time to verify it, the verifier would hash the payload to make sure it matches the data contained inSignature
, and if successful, simply verifies the signature. There is an example of this in the newtransactions
package.I have stubbed out the structure for private keys, public keys, and addresses for different key pairs. I think we could actually just use one address type (instead of secp256k1 and ed25519 having their own), but kept it separate for now.
As a team, it seems we are sort of stuck in between a rock and a hard place:
Because of this, I have left it very general. It would be quite easy to add new signature algorithms in the future while maintaining backwards compatibility using this structure.
This can be found in
pkg/crypto
.Serialization
As previously discussed, we have some major issues with json marshaling. Even forgetting the inefficiencies, the nondeterminism could really be a massive footgun that we don't run into until this thing is running. Furthermore, it's quite hard to provide backwards compatibility with that.
As a workaround, I have used Ethereum's RLP for serialization. There is an extra type appended to the front of all serialized messages, so that if we want to switch in the future / support more types of serialization (such as our own), we can do that with backwards compatibility. Backwards compatibility is my biggest concern here, hence this decision.
There are certainly some tradeoffs with RLP, but there were some strong things going for it:
The drawbacks:
any
. The canonical way of handling this is using strings. Because of SQLite's typing system, this actually is not an issue, but still is not ideal.I have already addressed these drawbacks for our specific case, as addressed in the next section.
This can be found in
pkg/serialize/rlp
, however if we choose to adopt it, it will move intopkg/serialize
.Payloads
The payloads issue is another recurring one. They were in
entity
(we discussed why this was bad), and until now have lived inserialize
(which is absolutely not where they belong). The following was my knee-jerk solution, and the potential downsides are pretty obvious.I have moved the payloads into a
types
package. @jchappelow, I'd love to get your thoughts ontypes
packages; I see them all the time, but most software gurus say they are a bad.This package only holds our payloads (maybe we can rename it to
payloads
), and keeps them separated there as a means of attempting to define a stable public API in one place (they can't be contained in protobuf because they get passed serialized). If we choose to adopt this, we should strictly limit this package to only our public API types, as the drawbacks of types packages with regards to coupling are easy to see.This can be found in
pkg/types
Conclusion
All of the above is implemented in this branch, with the exception of actual signature + verification algorithms. Since signatures are not Kwil specific (which is exactly the problem), I have sent up the bat signal to a few different devs who may be able to help us get it sorted out given the short time frame.
I definitely want to have everyone on board with this; I've made a ton of major suggestions recently, so if anyone thinks something in here is a bad idea, please feel empowered to speak up.
I tried to tie up the loose ends from the rest of the problems in these suggestions, so that hopefully this is the last massive shift we have to do. In addition to the "signatures" problem that we discussed earlier, this also tries to address:
I really tried to thread the needle of these concerns, our timeline, and downstream effects this may have on tooling.
Let me know what you all think!