Post restore reset #545

mpalmi · 2023-03-10T15:00:29Z

This PR introduces an interface which acts as a handler for a leaky
abstraction in the structure of underlying log stores. In order to
properly handle post-snapshot-restore cleanup for log stores
generically, we need some awareness of whether the underlying store
permits gaps.

Boltdb allows for gaps in log store indexes, but to handle them it
requires a freelist, which is written on every commit. This is costly,
particularly when the freelist is large. Completely resetting the
LogStore after snapshot restore would grow the freelist and degrade
performance.

The MonotonicLogStore interface is implemented by LogStores that
guarantee sequential/monotonic indexes, like raft-wal. For LogStores
with index holes, like Boltdb, the old behavior is kept.
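
As a rough illustration of the opt-in described above, here is a minimal sketch. The method name `IsMonotonic` and the `walStore`, `boltStore`, and `requiresReset` names are assumptions made for this example and are not taken from the diff.

```go
package main

import "fmt"

// MonotonicLogStore sketches the optional interface described above.
// The method name IsMonotonic is an assumption for illustration.
type MonotonicLogStore interface {
	IsMonotonic() bool
}

// walStore is a hypothetical stand-in for a store like raft-wal,
// which guarantees sequential indexes.
type walStore struct{}

func (walStore) IsMonotonic() bool { return true }

// boltStore is a hypothetical stand-in for a store like Boltdb.
// It does not opt in, so callers keep the old behavior for it.
type boltStore struct{}

// requiresReset reports whether the store opts in to a full reset.
func requiresReset(store interface{}) bool {
	m, ok := store.(MonotonicLogStore)
	return ok && m.IsMonotonic()
}

func main() {
	fmt.Println(requiresReset(walStore{}))  // true
	fmt.Println(requiresReset(boltStore{})) // false
}
```

A store that never mentions the interface is treated exactly as before, which is what keeps the change backwards compatible in this sketch.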

The interface also requires special handling within LogStore wrappers (like LogCache), to ensure that the type assertion is passed to the underlying store.
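
A sketch of why wrappers need special handling: a wrapper type either has the method or it does not, so a plain type assertion on the wrapper cannot reflect what it wraps. One way around this, assumed here, is for the wrapper to implement the method and forward the question. The `cache` type below is a hypothetical stand-in for LogCache, and `IsMonotonic` is an assumed method name.

```go
package main

import "fmt"

// MonotonicLogStore is the assumed optional interface.
type MonotonicLogStore interface {
	IsMonotonic() bool
}

// walStore opts in; boltStore does not. Both are hypothetical.
type walStore struct{}

func (walStore) IsMonotonic() bool { return true }

type boltStore struct{}

// cache is a hypothetical wrapper standing in for LogCache.
type cache struct {
	inner interface{}
}

// IsMonotonic forwards the type assertion to the wrapped store, so the
// wrapper reports monotonic only when the underlying store does.
func (c *cache) IsMonotonic() bool {
	m, ok := c.inner.(MonotonicLogStore)
	return ok && m.IsMonotonic()
}

func main() {
	fmt.Println((&cache{inner: walStore{}}).IsMonotonic())  // true
	fmt.Println((&cache{inner: boltStore{}}).IsMonotonic()) // false
}
```

Under this assumption the boolean return value matters: the wrapper always satisfies the interface, so the result of the call, not the assertion alone, carries the answer.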

We then use the MonotonicLogStore type assertion to delete all
entries from the LogStore after snapshot restore.
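
The post-restore cleanup could look roughly like the sketch below. It uses a hypothetical in-memory store and a reduced `rangeStore` interface; `resetAfterRestore`, `memStore`, and `newStore` are illustrative names, and the actual change in raft.go may be structured differently.

```go
package main

import "fmt"

// MonotonicLogStore is the assumed optional interface.
type MonotonicLogStore interface {
	IsMonotonic() bool
}

// rangeStore is a reduced, hypothetical subset of a log store API.
type rangeStore interface {
	FirstIndex() (uint64, error)
	LastIndex() (uint64, error)
	DeleteRange(min, max uint64) error
}

// memStore is an in-memory store used only for this sketch.
type memStore struct {
	entries   map[uint64]string
	monotonic bool
}

func newStore(monotonic bool) *memStore {
	return &memStore{
		entries:   map[uint64]string{1: "a", 2: "b", 3: "c"},
		monotonic: monotonic,
	}
}

func (s *memStore) IsMonotonic() bool { return s.monotonic }

// FirstIndex returns the lowest stored index, or 0 when empty.
func (s *memStore) FirstIndex() (uint64, error) {
	var first uint64
	for i := range s.entries {
		if first == 0 || i < first {
			first = i
		}
	}
	return first, nil
}

// LastIndex returns the highest stored index, or 0 when empty.
func (s *memStore) LastIndex() (uint64, error) {
	var last uint64
	for i := range s.entries {
		if i > last {
			last = i
		}
	}
	return last, nil
}

// DeleteRange removes all entries with min <= index <= max.
func (s *memStore) DeleteRange(min, max uint64) error {
	for i := min; i <= max; i++ {
		delete(s.entries, i)
	}
	return nil
}

// resetAfterRestore deletes every entry, but only for stores that
// report themselves as monotonic. It returns whether it cleared.
func resetAfterRestore(s rangeStore) (bool, error) {
	m, ok := s.(MonotonicLogStore)
	if !ok || !m.IsMonotonic() {
		return false, nil // old behavior: leave the log alone
	}
	first, err := s.FirstIndex()
	if err != nil {
		return false, err
	}
	last, err := s.LastIndex()
	if err != nil {
		return false, err
	}
	if last == 0 {
		return true, nil // nothing stored
	}
	return true, s.DeleteRange(first, last)
}

func main() {
	wal, bolt := newStore(true), newStore(false)
	if _, err := resetAfterRestore(wal); err != nil {
		panic(err)
	}
	if _, err := resetAfterRestore(bolt); err != nil {
		panic(err)
	}
	fmt.Println(len(wal.entries), len(bolt.entries)) // 0 3
}
```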

log_cache.go

testing.go

raft.go

raft_test.go

snapshot.go

raft_test.go

This commit introduces an interface which acts as a handler for a leaky abstraction in the structure of underlying log stores. In order to properly handle post-snapshot-restore cleanup for log stores generically, we need some awareness of whether the underlying store permits gaps.

Boltdb allows for gaps in log store indices, but to handle them it requires a freelist, which is written on every commit. This is costly, particularly when the freelist is large. By completely resetting the LogStore after snapshot, we grow the size of the freelist, which would result in performance degradation.

The MonotonicLogStore interface is implemented by LogStores with guarantees of sequential/monotonic indices, like raft-wal, but reverts to the old behavior for boltdb. This also requires special handling within LogStore wrappers (like LogCache), to ensure that the type assertion is passed to the underlying store.

This commit makes use of the MonotonicLogStore type assertion to delete all entries from the LogStore after snapshot restore.

banks

@mpalmi Nice. This is also super close, but as it stands it has a huge issue we should fix!

In the last changes we added the log clearing code into the startup restore, which is wrong and will lose committed data!

Let me know if it's not clear why that is or if I've misunderstood something here - restoring the on-disk snapshot is very different and not at all problematic. We are only trying to fix the problem of restoring from an external snapshot which has the effect of making all the current state invalid and useless!

log.go

raft_test.go

api.go

…onotonicLogStore Co-authored-by: Paul Banks <[email protected]>

…t to ensure logs are not deleted.

jmurret

Great work here by all!

raft_test.go

…tic during user restore.

mpalmi

Looks great! Just a couple of questions that are not showstoppers, but one is related to a comment from @banks, so might be worth addressing.

raft_test.go

…. re-use monotonic cluster options in test.

banks

LGTM. Thanks @jmurret.

I had some minor comment suggestions and a test naming suggestion inline, but nothing blocking.

@dhiaayachi do you want to do a final pass at this?

I had this branch running against Consul's restore integration test yesterday so I feel happy we resolved the issues here (in combination with hashicorp/raft-wal#24).

log.go

raft_test.go

dhiaayachi

Great Job All!!
I added a nit but I'm ok with merging this as is.

raft.go

Co-authored-by: Paul Banks <[email protected]>

mpalmi force-pushed the post-restore-reset branch from ec3d577 to 795e781 Compare March 10, 2023 16:04