[JUJU-1950] Use the new lease store in the lease manager #15002
moving to Dqlite-backed leases.
store constructor. The FSM and sundries are no longer required and are removed along with all Raft concerns from the manifold declarations.
manifold tests for new inputs.
no longer used, and has been removed.
retrieve the controller DB, which is used by the lease store.
in-memory SQLite. This is used by the lease store tests, which are also fixed for the corrected lease namespaces.
that it runs on all controllers.
// IsErrRetryable returns true if the given error might be
// transient and the interaction can be safely retried.
func IsErrRetryable(err error) bool {
Lifted this straight from LXD.
This really feels like it should be a part of the dqlite Go library.
What do you think @MathieuBordere?
I agree that go-dqlite is probably better suited to determine this; tracking it here.
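The idea discussed in this thread, classifying an error as transient and retrying the interaction, can be sketched roughly as follows. This is a minimal illustration and not the code from this PR or from LXD: the fragment list, `isRetryableMessage`, `isErrRetryable` and `withRetry` are hypothetical names, and the matching is done on the error string only because typed Dqlite error codes are not assumed to be available.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

// retryableFragments is an illustrative (assumed, not exhaustive) list of
// error-message fragments that commonly indicate a transient failure.
var retryableFragments = []string{
	"database is locked",
	"bad connection",
}

// isRetryableMessage reports whether msg contains a known transient fragment.
func isRetryableMessage(msg string) bool {
	for _, fragment := range retryableFragments {
		if strings.Contains(msg, fragment) {
			return true
		}
	}
	return false
}

// isErrRetryable reports whether err looks transient, judged by its message.
func isErrRetryable(err error) bool {
	return err != nil && isRetryableMessage(err.Error())
}

// withRetry calls fn up to attempts times, sleeping for delay between
// attempts, and stops early on success or on a non-retryable error.
func withRetry(attempts int, delay time.Duration, fn func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil || !isErrRetryable(err) {
			return err
		}
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return err
}

func main() {
	calls := 0
	err := withRetry(3, time.Millisecond, func() error {
		calls++
		if calls < 3 {
			// Simulate a transient failure on the first two attempts.
			return errors.New("database is locked")
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

String matching is brittle, which is presumably why the reviewers would prefer go-dqlite to expose this decision itself.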
for the purposes of the removed Raft lease client. The "dropped" error is removed, as we no longer emit it anywhere.
6bf541d to 6b2e8f0
`ControllerDDL` should be a string, as you can create multiple tables in one exec. You never want to have a partial controller, with partially applied data.
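A rough sketch of what this comment suggests: hold the schema as one string and apply it with a single `Exec` inside a transaction, rolling back on any failure. It assumes the driver accepts several statements in one `Exec` call, as the comment states. The `txn` interface, `applyDDL`, `fakeTx` and the column types in `controllerDDL` are hypothetical; only the table and column names are taken from the query shown further down in this review.

```go
package main

import (
	"database/sql"
	"errors"
	"fmt"
)

// controllerDDL is a hypothetical stand-in for the real controller schema.
const controllerDDL = `
CREATE TABLE lease (uuid TEXT PRIMARY KEY, expiry DATETIME);
CREATE TABLE lease_pin (uuid TEXT PRIMARY KEY, lease_uuid TEXT);`

// txn is the subset of *sql.Tx that applyDDL needs (assumed name).
type txn interface {
	Exec(query string, args ...any) (sql.Result, error)
	Commit() error
	Rollback() error
}

// applyDDL runs the whole schema in one Exec and commits only if it succeeds,
// so a failure never leaves a partially applied schema behind.
func applyDDL(tx txn, ddl string) error {
	if _, err := tx.Exec(ddl); err != nil {
		if rbErr := tx.Rollback(); rbErr != nil {
			return fmt.Errorf("%v (rollback failed: %v)", err, rbErr)
		}
		return err
	}
	return tx.Commit()
}

// fakeTx records what happened so the sketch runs without a database driver.
type fakeTx struct {
	fail       bool
	committed  bool
	rolledBack bool
}

func (f *fakeTx) Exec(string, ...any) (sql.Result, error) {
	if f.fail {
		return nil, errors.New("simulated DDL failure")
	}
	return nil, nil
}

func (f *fakeTx) Commit() error   { f.committed = true; return nil }
func (f *fakeTx) Rollback() error { f.rolledBack = true; return nil }

func main() {
	ok, bad := &fakeTx{}, &fakeTx{fail: true}
	fmt.Println("ok:", applyDDL(ok, controllerDDL), "committed:", ok.committed)
	fmt.Println("bad:", applyDDL(bad, controllerDDL), "rolled back:", bad.rolledBack)
}
```

With a real database, `*sql.Tx` from `db.Begin()` satisfies the `txn` interface directly.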
tx, err := s.DB.Begin()
c.Assert(err, jc.ErrorIsNil)

for _, stmt := range schema.ControllerDDL() {
I would love to use the migration `Apply` here; I don't think it's expensive to run, and it would exercise that code path more.
	return nil, errors.Trace(err)
}

db, err := dbGetter.GetDB("controller")
nit: constant "controller"
q := `
DELETE FROM lease WHERE uuid in (
    SELECT l.uuid
    FROM   lease l LEFT JOIN lease_pin p ON l.uuid = p.lease_uuid
    WHERE  p.uuid IS NULL
    AND    l.expiry < datetime('now')
)`[1:]
This feels like the wrong location to have SQL. I'm not sure where we would want to move this or even call this from. For now, whilst we're moving to the new DB, I think it's fine to leave this here until we see more emergent patterns.
the error string. This is intended as a temporary measure while we work out how to ensure detection with Dqlite codes.
#15177

The following brings the 3.0-dqlite feature branch into the develop branch.

### Changes

This brings in the dqlite database to sit alongside the mongo database. Currently, only leases are implemented in Juju using dqlite; more controller base configuration and data will subsequently be moved over to dqlite once this branch has landed.

#### Leases/Raft

The whole raft implementation has been removed from Juju completely. This includes the following workers:

- raft backstop
- raft clusterer
- raft log
- raft transport
- global clock updater

In addition, the raft API implementation has also been removed. Instead, the lease manager now handles the store (dqlite db) directly, improving readability and reducing complexity.

### Jujud

The `jujud` agent is now built using musl (specifically musl-gcc). This allows `juju` to be built statically, embedding `dqlite` in the same process. There are still some rough edges when building and testing, and when this lands we expect to see some churn to polish any of those issues.

Using `go test` is expected to still work as is; this is a last-minute change so that we can utilize sqlite directly for local tests. If you need to test with dqlite (linux only), then passing `-tags="dqlite"` to builds/tests/installs is required. All CI jobs are required to run with the dqlite tag.

Some notes:

1. `CGO_ENABLED=1` and `CGO_LDFLAGS_ALLOW="(-Wl,-wrap,pthread_create)|(-Wl,-z,now)"` are required if you're using dqlite directly.
2. You are expected to install musl directly on your system if you want to build, using `make musl-install`. This will require sudo.
3. For development purposes we will download dqlite `.a` files from an s3 bucket to facilitate the setup process. The tar file is sha256 summed to ensure no MITM. You can build these locally if you want to bypass s3 using `make dqlite-build-lxd`. This will spin up an lxd container to build. **Do not attempt** to run `make dqlite-build` locally, unless you know what you're doing.
4. To access dqlite from a controller, use `make repl`; this will open up a pseudo repl with which you can explore the database. Use `.open <db name>` and then you can use SQL from there.
5. Cross compilation to other architectures can be done using `GOARCH` and `GOOS` before `make install` or `make build`.

There are probably some things I've forgotten; expect a discourse post soon, which will highlight the development flow.

----

Two conflicts when merging. The resolution was to bring in the secret backends for the manifold tests and the controller config type changed for `DefaultMigrationMinionWaitMax`.

```
CONFLICT (content): Merge conflict in cmd/jujud/agent/machine/manifolds_test.go
CONFLICT (content): Merge conflict in controller/config.go
```

- c141b2e (upstream/3.0-dqlite) Merge pull request #15159 from SimonRichardson/system-install-musl-by-default
- 83656e2 Merge pull request #15156 from SimonRichardson/change-log-ddl
- 125c19d Fix static-analysis pipeline (#15168)
- 5abfa24 Merge pull request #15140 from SimonRichardson/allow-testing-on-mac
- 1dc60f6 (3.0-dqlite) Merge pull request #15153 from SimonRichardson/content-addressable-deps
- 5a1cd24 Merge pull request #15150 from jack-w-shaw/JUJU-2615_symlink_sudo
- 4502d63 Merge pull request #15148 from SimonRichardson/better-install-method
- 88941dd Merge pull request #15134 from SimonRichardson/bootstrap-dqlite-agent-tests
- 2551ffc Merge pull request #15130 from SimonRichardson/build-jujud-snap
- 0180a53 (origin/3.0-dqlite, manadart/3.0-dqlite) Merge pull request #15123 from SimonRichardson/fix-manifold-lease-expiry-tests
- fdf9cc7 Merge pull request #15115 from SimonRichardson/remove-jujud-main-test-file
- bf58843 Merge pull request #15113 from SimonRichardson/remove-api-raftlease-api-client
- f9419c0 Merge pull request #15112 from SimonRichardson/fix-initializing-state-twice
- 334d557 Merge pull request #15108 from SimonRichardson/github-action-go-build
- 2ee6e1a Merge pull request #15107 from SimonRichardson/cross-building-jujud
- 5a93305 Merge pull request #15087 from SimonRichardson/ensure-placement-of-file
- da95dc0 Merge pull request #15086 from SimonRichardson/more-sudo-changes
- 7b86376 Merge pull request #15085 from SimonRichardson/sudo-apt-get
- c4d4eb6 Merge pull request #15057 from SimonRichardson/dqlite-local-build
- 0ac79b3 Merge pull request #15061 from manadart/develop-into-3.0-dqlite
- adc20f7 Merge pull request #15043 from SimonRichardson/allow-overriding-arch-machine
- 8c02f22 Merge pull request #15048 from SimonRichardson/static-analysis-fix
- 4547c06 Merge pull request #15050 from manadart/dqlite-address-option
- d51b324 Merge pull request #15049 from manadart/dqlite-bootstrap-options
- 3801b78 Merge pull request #15047 from manadart/develop-into-3.0-dqlite
- 22d5247 Merge pull request #15037 from SimonRichardson/standardise-dqlite-build
- 25640a2 Merge pull request #15036 from SimonRichardson/remove-batch-fsm-controller-config
- dfa4cb1 Merge pull request #15041 from manadart/dqlite-fix-mock
- caf9481 Merge pull request #15034 from manadart/develop-into-3.0-dqlite
- c91985d Merge pull request #15035 from SimonRichardson/remove-typed-lease-error
- 42d17be Merge pull request #15009 from SimonRichardson/allow-repl-via-juju-ssh
- d798238 Merge pull request #15002 from manadart/dqlite-use-lease-store
- e4f0d39 Merge pull request #14918 from manadart/3.0-dqlite-lease-store
- 8315fb7 Merge pull request #14986 from manadart/dqlite-build-from-tags
- a73b947 Merge pull request #14927 from manadart/3.0-dqlite-lease-store-interface
- 1657a1d Merge pull request #14910 from manadart/3.0-dqlite-db-supply
- 27b23f3 Merge pull request #14909 from manadart/3.0-into-3.0-dqlite
- 6adff35 Merge pull request #14756 from manadart/develop-into-3.0-dqlite
- 37d81ff Merge pull request #14717 from manadart/develop-into-3.0-dqlite
- fe2edb8 Merge pull request #14671 from manadart/3.0-simplify-dbaccessor
- 1a09836 Merge pull request #14604 from manadart/3.0-bootstrap-controller-db
- 5ad011e Merge pull request #14652 from manadart/develop-into-3.0-dqlite
- 1c3d250 Merge pull request #14591 from manadart/develop-into-3.0-dqlite
- 229cd3e Merge pull request #14578 from manadart/3.0-dqlite-simplify
- 9d715ba Merge pull request #14565 from manadart/develop-into-3.0-dqlite
- 92ffd88 Merge pull request #14466 from manadart/develop-into-3.0-dqlite
- 57f67ce Merge pull request #14336 from manadart/develop-into-3.0-dqlite
- 648d354 Merge pull request #14364 from manadart/update-musl
- 198621d Merge pull request #14241 from manadart/develop-into-3.0-dqlite
- 0360db6 Merge pull request #14153 from manadart/develop-into-3.0-dqlite
- 17950b2 Merge pull request #14053 from manadart/develop-into-3.0-dqlite
- 9452026 Merge pull request #14016 from manadart/develop-into-3.0-dqlite
- 741baca Merge pull request #13963 from manadart/develop-into-3.0-dqlite
- 5449603 Merge pull request #13969 from manadart/dqlite-manifolds
- 7b612a0 Merge pull request #13944 from SimonRichardson/dqlite-develop
Under #14918 we added a new implementation of the lease store indirection, backed by a relational database.
Here we change the dependency graph so that the `db-accessor` worker becomes a dependency of the lease manager, and is used to create the new store for lease state.

A new `db-expiry` worker is added that periodically deletes leases that have passed their expiry time. It replaces the old global clock updater worker that was used to "tick" the clock inside the Raft FSM, triggering expired lease deletion.

Testing concerns are aided by a new base suite that provides an in-memory SQLite database primed with the controller schema.

Almost all Raft concerns are deleted now that they are no longer required. Some logic remains in `core/raftlease`, and is kept as a temporary reference for metrics while we decide what we need to capture with the new method. Many of the prior metrics actually informed the performance of the pub/sub-based lease client, which is now gone.

Scale testing will inform refinements around retry strategies in the future.
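For readers unfamiliar with the pattern, the expiry worker described above can be pictured as a ticker loop that periodically runs the `DELETE` statement quoted in the review thread. The following is only a sketch under assumptions, not the worker added by this PR: `execer`, `expireLoop`, `fakeDB`, `demo` and the intervals are hypothetical, and a fake stands in for the database so the example runs without a driver.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
	"time"
)

// expireQuery deletes expired leases that are not pinned; it mirrors the
// statement quoted in the review discussion.
const expireQuery = `
DELETE FROM lease WHERE uuid in (
    SELECT l.uuid
    FROM   lease l LEFT JOIN lease_pin p ON l.uuid = p.lease_uuid
    WHERE  p.uuid IS NULL
    AND    l.expiry < datetime('now')
)`

// execer is the subset of *sql.DB used by the loop (assumed name).
type execer interface {
	ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
}

// expireLoop runs expireQuery every interval until ctx is cancelled or a
// query fails. It returns ctx.Err() on cancellation.
func expireLoop(ctx context.Context, db execer, interval time.Duration) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			if _, err := db.ExecContext(ctx, expireQuery); err != nil {
				return fmt.Errorf("expiring leases: %w", err)
			}
		}
	}
}

// fakeDB counts executions in place of a real database.
type fakeDB struct{ calls int }

func (f *fakeDB) ExecContext(context.Context, string, ...any) (sql.Result, error) {
	f.calls++
	return nil, nil
}

// demo runs the loop briefly and returns how many times the query ran.
func demo() int {
	ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
	defer cancel()
	db := &fakeDB{}
	_ = expireLoop(ctx, db, 10*time.Millisecond)
	return db.calls
}

func main() {
	fmt.Println("expiry ran at least once:", demo() > 0)
}
```

A real `*sql.DB` satisfies `execer` as written. Whether the worker should stop on a failed query or retry it is exactly the kind of retry-strategy question the description defers to scale testing.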
QA steps
juju.worker.lease=TRACE;juju.worker.leaseexpiry=DEBUG
Documentation changes
Probably not official docs changes, but there are some play-books around for what to do when Raft leases are at sea.
Bug reference
N/A