-
Notifications
You must be signed in to change notification settings - Fork 471
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
db: refactor replayWAL to use flushes to make versionEdits #3955
Conversation
6d9f003
to
52a8137
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all commit messages.
Reviewable status: 0 of 17 files reviewed, 4 unresolved discussions (waiting on @itsbilal)
open.go
line 775 at r1 (raw file):
addFileNum(encodedKey) for i := 1; i < int(b.Count()); i++ {
I didn't understand why the logic handles the first key of the batch individually before we enter the loop. Can this start from i := 0
if we remove the `br.Next()`` up above?
open.go
line 803 at r1 (raw file):
} if objMeta.IsRemote() { readable, err = d.objProvider.OpenForReading(context.TODO(), fileTypeTable, n, objstorage.OpenOptions{MustExist: true})
shouldn't we be able to do remove the explicit objProvider.Lookup
call and just unconditionally call OpenForReading
, letting the objProvider
internally call into the vfs provider as necessary?
open.go
line 879 at r1 (raw file):
return } mem.writerUnref()
why do we need to unref here? won't mem
still be sitting in the flushable queue, so it should maintain a reference until it's been flushed? I'm probably missing something obvious
open.go
line 885 at r1 (raw file):
if !d.opts.ReadOnly { entry.flushForced = true }
nit:
entry.flushForced = !d.opts.ReadOnly
Actually, it looks like it wasn't on the 24.2 branch, so it may be an issue with this change. |
52a8137
to
c82638f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the failure was on this branch and it was a race between compactions deleting away the flushable ingested file during Open(), and the obsolete file scan in Open(). I've fixed it now, PTAL!
Reviewable status: 0 of 18 files reviewed, 3 unresolved discussions (waiting on @jbowens)
open.go
line 775 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
I didn't understand why the logic handles the first key of the batch individually before we enter the loop. Can this start from
i := 0
if we remove the `br.Next()`` up above?
No reason; copy-paste code from before. Removed.
open.go
line 803 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
shouldn't we be able to do remove the explicit
objProvider.Lookup
call and just unconditionally callOpenForReading
, letting theobjProvider
internally call into the vfs provider as necessary?
Removed. Was copy-pasted from what was there before.
open.go
line 879 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
why do we need to unref here? won't
mem
still be sitting in the flushable queue, so it should maintain a reference until it's been flushed? I'm probably missing something obvious
The writer unref is supposed to get zeroed when we're done writing to the memtable, which is the case here. The memtable starts off with one writer ref on itself (see memTable.init), and one reader ref on the flushableEntry wrapping it. We still keep the reader ref around until we flush the memtable. If we don't writerUnref here, we leak the memtable.
Previously, replayWAL was too tightly coupled with the internals of a flush; it ran a flush directly and would produce a versionEdit that would then be accumulated manually (eg. without a BVE) using assumptions true only for memtable flushes. This required relatively intricate interactions with flushable ingests, and meant that flushable ingested sstables would always go into L0. To pave the way for flushable ingests+excises, we refactor out all flush logic from replayWAL, and have replayWAL just replay WALs into the flushable queue. After this, Open() just schedules flushes until we're done. Informs cockroachdb#2676.
c82638f
to
246ad9e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
Reviewed 15 of 17 files at r1, all commit messages.
Reviewable status: 15 of 18 files reviewed, 3 unresolved discussions (waiting on @itsbilal)
obsolete_files.go
line 363 at r2 (raw file):
// already been flushed. for _, fEntry := range d.mu.mem.queue { if f, ok := fEntry.flushable.(*ingestedFlushable); ok {
I didn't follow what necessitated passing the flushable ingests into this function as an argument, or was that a stylistic choice?
Also should we leave a TODO to rework this so that ingested flushable sstables receive a ref when they're replayed (and when ingested) and are unref'd after they've been 'flushed' and are now part of a version.
open.go
line 879 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
The writer unref is supposed to get zeroed when we're done writing to the memtable, which is the case here. The memtable starts off with one writer ref on itself (see memTable.init), and one reader ref on the flushableEntry wrapping it. We still keep the reader ref around until we flush the memtable. If we don't writerUnref here, we leak the memtable.
Ah thanks, I wasn't thinking and missed that this was a writer reference.
testdata/checkpoint
line 21 at r2 (raw file):
sync: db open-dir: db sync: db/MANIFEST-000001
this sync is gone because we don't replay anything so we don't flush anything, whereas previously we'd unconditionally commit a version edit? is that right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
Reviewable status: 15 of 18 files reviewed, 3 unresolved discussions (waiting on @jbowens)
obsolete_files.go
line 363 at r2 (raw file):
Previously, jbowens (Jackson Owens) wrote…
I didn't follow what necessitated passing the flushable ingests into this function as an argument, or was that a stylistic choice?
Also should we leave a TODO to rework this so that ingested flushable sstables receive a ref when they're replayed (and when ingested) and are unref'd after they've been 'flushed' and are now part of a version.
The queue should at this point have no flushable ingests in it, because we flushed it away in the maybeScheduleFlush loop earlier in Open()
. This didn't matter back when flushable ingests wouldn't leave the queue until after we had cleaned away obsolete files, but now we need to save the flushable ingests externally so we don't double-delete their files.
The ref you're mentioning in the TODO does get added. The issue is that our obsolete file scan in this method doesn't look at refs, it just looks at files not in the current version and deletes them. It's why it had to be explicitly taught about flushable ingests in the first place.
testdata/checkpoint
line 21 at r2 (raw file):
Previously, jbowens (Jackson Owens) wrote…
this sync is gone because we don't replay anything so we don't flush anything, whereas previously we'd unconditionally commit a version edit? is that right?
that's right. we still create and sync the manifest once, we just don't do it as second time because we wouldn't do the empty versionEdit.
Previously, replayWAL was too tightly coupled with the internals of a flush; it ran a flush directly and would produce a versionEdit that would then be accumulated manually (eg. without a BVE) using assumptions true only for memtable flushes. This required relatively intricate interactions with flushable ingests, and meant that flushable ingested sstables would always go into L0.
To pave the way for flushable ingests+excises, we refactor out all flush logic from replayWAL, and have replayWAL just replay WALs into the flushable queue. After this, Open() just schedules flushes until we're done.
Informs #2676.