Engine refactor #353
Conversation
This includes a potential way of handling atomic commits. I need to work on testing it, but in case anyone is curious about what we were discussing in Slack, I have something here that I think works. It can be found in … Ignore a lot of the other stuff; there is obviously still a lot to do w.r.t. deleting stuff, tying it all together, and testing.
Force-pushed from 230b253 to d2a0dfb
At this point things are building. Acceptance tests are still not passing; I sort of know why, but will spend some more time tomorrow checking it out. Regardless, it is likely ready for review, as anything that changes will be a minor bug fix. There are three main areas that got affected. Obviously the refactor was aimed at the engine, but since the engine really spans the whole system, almost everything got touched.
deleteTestDir()
defer deleteTestDir()
Are these two both intentional?
yeah. If the temp dir is not empty, it will empty it before beginning the test
However we can probably use an in-memory connection here
Force-pushed from 775074e to 0472f04
Force-pushed from d15dff0 to 62674f2
Connection pools are the tricky pieces, I think. Some thoughts about that. Will look through everything else; can give more concrete suggestions if needed.
internal/sql/registry/registry.go (outdated)
// readerMu is the mutex for readers.
// readers will call it and immediately return.
// `Commit` will call it and return when it is finished.
// This is to prevent readers from being opened while a commit is in progress.
Is it OK that this allows commit to begin with existing readers running?
I'm trying to work out the immediate-unlock goals.
Can we document the behavior of the Registry methods that we are seeking?
Apparently: Callers of Query and Get should wait if a Commit is in progress.
But a Commit can start if reads are in progress?
Are we really just trying to make it easier for Commit to acquire the write lock? The RLock docs state:
// RLock locks rw for reading.
//
// It should not be used for recursive read locking; a blocked Lock
// call excludes new readers from acquiring the lock. See the
// documentation on the RWMutex type.
In particular, the "excludes new readers" part. Does that achieve the goal? (i.e., `RLock` and `defer RUnlock`.)
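For discussion's sake, a minimal sketch of the `RLock` / `defer RUnlock` approach being suggested, assuming a simplified `Registry` (the method signatures and bodies are placeholders, not the actual implementation): readers hold the read lock for the duration of the read, and `Commit` takes the write lock, so in-flight readers finish first and new readers are excluded until the commit completes.

```go
package registry

import (
	"context"
	"sync"
)

// Registry is a simplified stand-in for the real registry type.
type Registry struct {
	readerMu sync.RWMutex
	// ... datasets, pools, etc.
}

// Query holds the read lock for the whole read rather than taking it
// and immediately releasing it. A blocked Commit (Lock) then excludes
// new readers, per the RWMutex docs.
func (r *Registry) Query(ctx context.Context, dbid, stmt string) error {
	r.readerMu.RLock()
	defer r.readerMu.RUnlock()
	// ... perform the read against the dataset ...
	return nil
}

// Commit takes the write lock: it waits for in-flight readers to
// finish and blocks new readers until the commit completes.
func (r *Registry) Commit(ctx context.Context, idempotencyKey []byte) error {
	r.readerMu.Lock()
	defer r.readerMu.Unlock()
	// ... atomically persist the session ...
	return nil
}
```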
A few more minor things in a pending review I had going. Doing more, but wanted to submit this before you got going again today.
internal/sql/sqlite/connection.go (outdated)
c.inUse = false
c.conn.SetInterrupt(nil)

results.addCloser(deferFunc)
// if c.isReadonly() {
// 	return prepared.Finalize()
It doesn't look like the `c.inUse` write is safe; the `Result.Mutex` isn't the same as the `Connection`'s. Also, can we remove this commented code?
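For illustration, a sketch of one way the write could be made safe, assuming the `Connection` guards its own state with its own mutex; the field and method names here are illustrative, not the actual ones.

```go
package sqlite

import "sync"

// Connection is a simplified stand-in; the real type wraps an
// underlying SQLite connection and interrupt handling.
type Connection struct {
	mu    sync.Mutex
	inUse bool
	// conn, interrupt channel, etc.
}

// release marks the connection as no longer in use under the
// connection's own mutex, rather than relying on a Result's lock.
func (c *Connection) release() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.inUse = false
	// c.conn.SetInterrupt(nil) would also happen here.
}
```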
internal/sql/sqlite/kv.go (outdated)
}
defer res.Finish()

var value []byte
Can remove this and the `ok` decl, and do `value, ok :=` below.
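For illustration, a minimal sketch of that suggestion; the `lookup` helper is hypothetical, standing in for whatever call in kv.go produces the value and the ok flag.

```go
package main

import "fmt"

// lookup stands in for the call in kv.go that yields (value, ok).
func lookup() ([]byte, bool) { return []byte("v"), true }

func main() {
	// Instead of pre-declaring `var value []byte` (and a separate
	// `ok` variable), declare both at the point of use:
	value, ok := lookup()
	fmt.Println(value, ok)
}
```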
internal/sql/pools/querier.go (outdated)
defer writer.Return()

return writer.Savepoint()
Is it OK that the returned `sql.Savepoint` internally has a connection that has already been "returned"? It's not gonna be closed, but it's not gonna be exclusive anymore either.
Have had my eyes on 90% of it now. Will circle back after pool/registry is updated and I've done some tests.
internal/accounts/accounts.go (outdated)
nonceBytes := make([]byte, 8)
binary.LittleEndian.PutUint64(nonceBytes, uint64(s.Nonce))

bts = append(bts, nonceBytes...)
tiny nit but could do this I think:
nonceBytes := make([]byte, 8)
binary.LittleEndian.PutUint64(nonceBytes, uint64(s.Nonce))
bts = append(bts, nonceBytes...)
could become:
bts = binary.LittleEndian.AppendUint64(bts, uint64(s.Nonce))
defer func() {
	if err != nil {
		err2 := g.datastore.Delete(ctx, schema.DBID())
Seems like we should only `defer` this if `Create` succeeds below.
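A hypothetical sketch of the reordering being suggested; `Datastore`, `GlobalContext`, and `createDataset` are stand-in names, with only `g.datastore.Delete` and `schema.DBID()` taken from the snippet above, and the `Create` call placed on the datastore purely for illustration.

```go
package engine

import (
	"context"
	"errors"
)

// Datastore and Schema are simplified stand-ins for the real types.
type Datastore interface {
	Create(ctx context.Context, dbid string) error
	Delete(ctx context.Context, dbid string) error
}

type Schema struct{ Name string }

func (s *Schema) DBID() string { return s.Name }

type GlobalContext struct{ datastore Datastore }

// createDataset registers the cleanup only after Create succeeds, so a
// failed Create does not trigger a spurious Delete of a dataset that
// was never created.
func (g *GlobalContext) createDataset(ctx context.Context, schema *Schema) (err error) {
	if err = g.datastore.Create(ctx, schema.DBID()); err != nil {
		return err
	}
	// Only now is there something to roll back on a later failure.
	defer func() {
		if err != nil {
			if err2 := g.datastore.Delete(ctx, schema.DBID()); err2 != nil {
				err = errors.Join(err, err2)
			}
		}
	}()
	// ... remaining initialization that might still fail ...
	return nil
}
```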
internal/sql/common.go (outdated)
@@ -40,16 +38,68 @@ type Savepoint interface {

 type Session interface {
 	Delete() error
-	GenerateChangeset() (Changeset, error)
+	GenerateChangeset(ctx context.Context) (Changeset, error)
type Session interface {
	Delete() error
	ChangesetID(ctx context.Context) ([]byte, error)
}
Looking at the uses for `Session`, I don't see a reason for a separate `Changeset` interface and the two-step `GenerateChangeset()` -> `ID()` process rather than a single `ChangesetID()` method. It just needs to be documented that it should only be called once per session (like `Delete`).
err = u.engine.DeleteDataset(ctx, dbid, tx.Sender)
if err != nil {
	return resp(price), fmt.Errorf("failed to drop dataset: %w", err)
Something we should come back to later is this oddity of returning some meaningful value when error is not nil. The accepted contract with Go functions is that if err is non-nil, the other returns are meaningless and should not be referenced for any purpose. There are probably some exceptions here and there, but this is generally best practice. The likely solution is to add `Code uint32; Msg string` fields to `ExecutionResponse`, and only return a non-nil error before any spending, if at all. Otherwise we'd return an `ErrTxExec`-type error that can provide the spent gas and code.
Probably should wait until we have gas, and progressive gas consumption that depends on the nature of the engine's actions.
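For illustration only, a rough sketch of what that could look like; `ExecutionResponse` and `ErrTxExec` are named in the comment above, but the exact fields and shape here are assumptions.

```go
package txsvc

import "fmt"

// ExecutionResponse reports the outcome of executing a transaction.
// Code and Msg would carry the result even when execution fails,
// instead of overloading the error return.
type ExecutionResponse struct {
	Fee  int64 // hypothetical: gas/fee actually spent
	Code uint32
	Msg  string
}

// ErrTxExec is an error type that still exposes the spent gas and a
// result code, for cases where an error must be returned after
// spending has already happened.
type ErrTxExec struct {
	GasUsed int64
	Code    uint32
	Err     error
}

func (e *ErrTxExec) Error() string {
	return fmt.Sprintf("tx execution failed (code %d, gas %d): %v", e.Code, e.GasUsed, e.Err)
}

func (e *ErrTxExec) Unwrap() error { return e.Err }
```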
internal/sql/filesystem/fs.go (outdated)
// ForEachFile calls fn for each file in path.
// It passes the file's name (without the path) to fn.
func (f *FS) ForEachFile(path string, fn func(string) error) error {
	return filepath.Walk(path, func(path string, info os.FileInfo, err error) error {
Just a nit, but if we do this for on-demand list datasets, we should probably use `filepath.WalkDir` since it is supposed to be more efficient.
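A minimal sketch of what `ForEachFile` could look like with `filepath.WalkDir`, keeping the documented behavior of passing only the file's base name to `fn`; the `FS` type is re-declared here only so the sketch stands alone.

```go
package filesystem

import (
	"io/fs"
	"path/filepath"
)

// FS is assumed from fs.go; declared here only so the sketch compiles.
type FS struct{}

// ForEachFile calls fn for each regular file under path, passing the
// file's name (without the path) to fn. WalkDir avoids the per-entry
// os.Lstat that filepath.Walk performs, so it is generally cheaper.
func (f *FS) ForEachFile(path string, fn func(string) error) error {
	return filepath.WalkDir(path, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			return nil
		}
		return fn(d.Name())
	})
}
```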
We now only read from the disk during startup and commits (where it needs to be atomic). Calls from the user now read from memory.
@jchappelow I have addressed all of the requested changes. Overall, things stayed more or less the same, but there were a few different changes that made things way easier:
Just spotted a few minor issues; there are only a couple of packages where I actually followed the logic, mostly just general stuff.
internal/abci/abci.go (outdated)
@@ -563,6 +572,8 @@ func (a *AbciApp) EndBlock(e abciTypes.RequestEndBlock) abciTypes.ResponseEndBlo
 		panic(fmt.Sprintf("failed to finalize validator updates: %v", err))
 	}
+
+	a.blockHeight = e.Height
Seems not necessary?
It is. It needs the block height in `Commit` as an idempotency key.
You've set this in BeginBlock, so you'll get this field unchanged here, right?
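To illustrate the point of this exchange (this is not the actual kwil-db code; the types and methods below are hypothetical): the height recorded in `BeginBlock` is what `Commit` uses as its idempotency key, so re-assigning it in `EndBlock` only matters if the two heights could ever differ.

```go
package abci

import "encoding/binary"

// Committer is a stand-in for whatever persists state atomically.
type Committer interface {
	// Commit is keyed by the block height so that replaying the same
	// block after a crash is a no-op.
	Commit(idempotencyKey []byte) error
}

// app is a simplified, hypothetical stand-in for AbciApp.
type app struct {
	blockHeight int64
	committer   Committer
}

// BeginBlock records the height once per block.
func (a *app) BeginBlock(height int64) { a.blockHeight = height }

// Commit derives its idempotency key from the height recorded in
// BeginBlock; re-setting it again in EndBlock would be redundant.
func (a *app) Commit() error {
	key := binary.BigEndian.AppendUint64(nil, uint64(a.blockHeight))
	return a.committer.Commit(key)
}
```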
Force-pushed from c2f3776 to 0cba679
@@ -51,3 +62,62 @@ func (f *defaultFilesystem) Remove(path string) error {
 func (f *defaultFilesystem) Rename(oldpath string, newpath string) error {
 	return os.Rename(oldpath, newpath)
 }
+
+// the below snippet is from MinIO's https://github.com/minio/minio/blob/38f35463b7fe07fbbe64bb9150d497a755c6206e/cmd/xl-storage.go#L127
Hm, I don't think we need this exact snippet, since we have more control over the `pathName`, and I assume we won't run on Windows.
// Signer returns the public key of the sender of the transaction.
func (s *ScopeContext) Signer() []byte {
	return s.execution.Caller
I don't see `Signer` used anywhere, and I'm uncertain whether it should return `s.execution.Signer` instead of `Caller`, which is the serialized `types.User`.
This will get shaken out once we merge in the new account identifier changes.
Signer and Caller will be the same thing. Right now, public keys own schemas, so we need to identify signers as public keys specifically, while `@caller` has the additional metadata (key type, address type, etc.). These will become the same thing once we make the account identifier changes.
But to clarify the upcoming identifier changes as they pertain to the engine: the `executionContext.Sender` is removed and caller will be the new "identifier" (address or pubkey) rather than a serialization of this `types.User` struct?
Is there still any need for public key access in Kuneiform, or is everything just reduced to one `@caller` and no more `address()` or `@caller_address`?
Essentially, signer is currently used for deploying databases, dropping databases, and the `owner` modifier, while `@caller` is the serialized user type.
> But to clarify the upcoming identifier changes as they pertain to the engine: the executionContext.Sender is removed and caller will be the new "identifier" (address or pubkey) rather than a serialization of this types.User struct? Is there still any need for public key access in Kuneiform, or is everything just reduced to one @caller and no more address() or @caller_address?

Correct. There will be no more need for the public key in Kuneiform; it will all just be the user identifier.
LGTM! 🥳
Would you like to write a message for the squashed commit? Normally I'd take it upon myself to glean one from the PR description and/or comments plus the messages of the commits being squashed, but this one has evolved a lot. Even just a couple of sentences would be enough IMO, but I thought you might want to tweak what's in this comment after the most recent changes: #353 (comment)
Yep, will do that now.
…features and improvements. Among these are:
- Proper handling of transactionality for database deployments and drops
- Simpler generation of app hashes
- Simpler code for determining action logic, making future improvements easier
- Replacing the 2pc protocol with a much simpler system based on idempotency
- Better concurrency support for databases
- Better execution lifetime handling for long running queries (to prevent writer starvation)
- An overall simplification of the entire Kwil system, reducing on net ~5000 lines of code
- And other improvements
Force-pushed from e3d75e5 to 37808e3
…features and improvements. (#353) Among these are:
- Proper handling of transactionality for database deployments and drops
- Simpler generation of app hashes
- Simpler code for determining action logic, making future improvements easier
- Replacing the 2pc protocol with a much simpler system based on idempotency
- Better concurrency support for databases
- Better execution lifetime handling for long running queries (to prevent writer starvation)
- An overall simplification of the entire Kwil system, reducing on net ~5000 lines of code
- And other improvements
This is still very much WIP, but I wanted to push this up in case anyone was interested. Don't be intimidated by the current amount of "new code"; I have copied a lot of stuff to be able to modify it freely or make the imports sensible.
Things like the SQL analyzer, the dataset DB (which is a SQL db that understands schema metadata), and types are all directly copied over, and mostly unchanged.
What has been done so far
2pc (internal/engine2/commit)
So far, the bulk of the work has been in solving the issue of database deployments and drops not conforming to the 2pc protocol. I've had four different attempts at this, but I think the current one is in the right direction.
Very naively, I have identified which operations in Kwil are "2-phase-commit-able". Right now, this is only deploy, drop, and execute DML; however, this will expand to include migrations. This layer is responsible for making these operations conform to Kwil's 2pc protocol.
The main impetus for creating this as a separate layer is that it allows us to build anything else we want within the engine without having to worry about the 2pc protocol, which is nice.
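As a rough, hypothetical illustration of the idea (none of the names below come from the actual package), the layer can be thought of as a small set of operations that each know how to participate in the two phases:

```go
package commit

import "context"

// TwoPhase is a hypothetical view of an operation the commit layer
// knows how to drive through Kwil's 2pc protocol. Deploy, drop, and
// DML execution would each provide an implementation.
type TwoPhase interface {
	// Prepare stages the operation and returns a deterministic digest
	// that contributes to the commit ID / app hash.
	Prepare(ctx context.Context) ([]byte, error)
	// Commit makes the staged changes durable.
	Commit(ctx context.Context) error
	// Cancel discards the staged changes if the session is aborted.
	Cancel(ctx context.Context) error
}

// Session collects the 2pc-able operations for one block so the rest
// of the engine never has to think about the protocol itself.
type Session struct {
	ops []TwoPhase
}

// Register adds an operation to the current session.
func (s *Session) Register(op TwoPhase) { s.ops = append(s.ops, op) }
```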
There are still two things I do not like about this package:
1. Databases cannot be deployed and dropped in the same session.
There isn't really a good reason for this, except that there are a lot of edge cases that make it hard to implement. For the sake of time, I figured this is an OK tradeoff for an initial unreleased version; however, I would like to get it fixed before we actually use this.
2. The code is pretty complex
While I really like the interfaces used by this package, the code itself is fairly complex. In particular, it feels weird that each operation has completely different code depending on its phase.
execution (internal/engine2/execution)
This is still much more of a WIP than the commit layer, but it is at least a sensible first attempt at executing logic. It is sort of similar to the internal/engine/execution package, except it is a LOT easier to add new functionality.
There are still a lot of unanswered questions regarding where action parsing should take place, where we should handle argument evaluation, and where and how we should handle "modifiers" (I have some thoughts on this that I am testing out).
A few things that this does differently than its predecessor:
1. Namespaces
We sort of have a concept of "namespaces" evolving within Kwil. Right now, it is only possible to create a namespace with an extension. However, we have talked quite a bit about having other databases be callable if they are on the same node. Furthermore, we have been talking a bit about how we can give the user control of some sort of token functionality (essentially, allowing them to modify the account store).
While we aren't implementing these features here, it is quite easy to see how we could create an account store "namespace" (or any other type, for that matter) that could be included as part of Kuneiform's "standard library", and simply make that accessible here (a rough sketch of this idea follows this list).
There are still some nuances I need to work out for determining which namespaces are and are not accessible, but the general idea is there.
2. DDL
While I am still going back and forth on whether this should be done this way, this package also handles database deployments and drops. It treats them like any other operation. I did this as more of an experiment, with database migrations in mind; if we do want to allow some sort of programmability / logic for migrations, we might want to handle that here. Conceptually, those are the same as DDL statements, and so I threw those in here.
I'm leaning towards taking these out, but haven't decided yet.
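As referenced in the Namespaces item above, here is a rough, hypothetical sketch of how an account-store (or extension) namespace could be registered and called; none of these names come from the actual package.

```go
package execution

import "context"

// Namespace is a hypothetical callable unit: an extension, another
// dataset on the same node, or a built-in like an account store.
type Namespace interface {
	// Call invokes a named method with arguments and returns its results.
	Call(ctx context.Context, method string, args []any) ([]any, error)
}

// dispatcher holds the namespaces reachable from a procedure call.
type dispatcher struct {
	namespaces map[string]Namespace
}

// RegisterNamespace makes a namespace (e.g. an account-store "stdlib"
// namespace) callable by name from the execution layer.
func (d *dispatcher) RegisterNamespace(name string, ns Namespace) {
	if d.namespaces == nil {
		d.namespaces = make(map[string]Namespace)
	}
	d.namespaces[name] = ns
}
```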
Things that still need to be done
Higher Level Engine Package
This does not yet include a higher-level package that ties all of these together for a consumer. I think a mistake I made with the old engine was that I started high level and progressed downwards. This created a really nonsensical structure, so I tried to reverse that here.
internal/sql
Our SQLite clients are really a mess and should be cleaned up. There should really only be one client, and there are some changes that should be made to allow for better testing, as well as a more sensible interface.
I have copied the old package to allow me to hack around it for testing new things, but the internal/sql2 package currently does not mean anything and has yet to be totally overhauled.
Conclusion
This is still very much an active construction site, and thus there is a lot of work to do, but I think I am on the right track. The hardest problem (2pc) has a decent solution, and the second hardest problem (execution) has a plausible framework for how it can be solved.
Overall, I'm confident that this will be a much better structure with which we can delineate responsibilities, allowing us to more easily add features to our engine in the future. I am also confident that it will be quite easy to plug the end result of this into our current system with minimal friction.