Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open new commitlogs async #1576

Merged
merged 33 commits into from
Apr 29, 2019
Merged

Open new commitlogs async #1576

merged 33 commits into from
Apr 29, 2019

Conversation

richardartoul
Copy link
Contributor

@richardartoul richardartoul commented Apr 19, 2019

The README.md has more details, but the basic idea with this P.R is that everytime we rotated the commitlog file we would see a really big spike in the commit log queue length. One of the primary causes of the queue backing up was that the single-threaded commitlog writer would stop writing completely while it waited for syscalls and I/O to complete creating a new commitlog file.

This P.R modifies the commitlog to always keep two files open at any time (while only writing to one) and when the commitlog needs to be rotated, the secondary file becomes the primary, and the primary files becomes the secondary and is asynchronously reset in preparation for the next rotation.

Before
Screen Shot 2019-04-18 at 10 09 09 PM

After
Screen Shot 2019-04-18 at 10 09 49 PM

@codecov
Copy link

codecov bot commented Apr 19, 2019

Codecov Report

Merging #1576 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@          Coverage Diff           @@
##           master   #1576   +/-   ##
======================================
  Coverage      72%     72%           
======================================
  Files         950     950           
  Lines       78950   78950           
======================================
  Hits        56851   56851           
  Misses      18404   18404           
  Partials     3695    3695
Flag Coverage Δ
#aggregator 82.3% <0%> (ø) ⬆️
#cluster 85.7% <0%> (ø) ⬆️
#collector 63.7% <0%> (ø) ⬆️
#dbnode 80.1% <0%> (ø) ⬆️
#m3em 73.2% <0%> (ø) ⬆️
#m3ninx 73.9% <0%> (ø) ⬆️
#m3nsch 51.1% <0%> (ø) ⬆️
#metrics 17.6% <0%> (ø) ⬆️
#msg 74.7% <0%> (ø) ⬆️
#query 66.9% <0%> (ø) ⬆️
#x 86.8% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 91ed20b...fce7dec. Read the comment docs.

@richardartoul richardartoul marked this pull request as ready for review April 22, 2019 19:22
@richardartoul richardartoul changed the title [WIP] - Open new commitlogs async Open new commitlogs async Apr 22, 2019
src/dbnode/persist/fs/commitlog/README.md Outdated Show resolved Hide resolved
src/dbnode/persist/fs/commitlog/commit_log.go Outdated Show resolved Hide resolved
src/dbnode/persist/fs/commitlog/commit_log.go Outdated Show resolved Hide resolved
src/dbnode/persist/fs/commitlog/commit_log.go Outdated Show resolved Hide resolved
src/dbnode/persist/fs/commitlog/commit_log.go Outdated Show resolved Hide resolved
Copy link
Contributor

@Haijuncao Haijuncao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
some minor comment about additional test.

l.flushState.setLastFlushAt(l.nowFn())
// newOnFlushFn is used to create new flushFns because each one needs to know which
// slice of pendingFlushFns it should modify.
func (l *commitLog) newOnFlushFn(writer *asyncResettableWriter) flushFn {
Copy link
Collaborator

@robskillington robskillington Apr 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, why don't you add the onFlush(err error) method to asyncResettableWriter and then make it call the commit log back.

i.e.

type asyncResettableWriter struct {
  // .. other
  commitLog *commitLog
}

func (w *asyncResettableWriter) onFlush(err error) {
  w.commitLog.onFlush(w, err)
}

// now on commit log, retain how this used to work
func (l *commitLog) onFlush(writer *asyncResettableWriter, err error) {
  // as before
}

Then construct more simple to:

l.writerState.primary.writer = l.newCommitLogWriterFn(&l.writerState.primary.onFlush, l.opts)
l.writerState.secondary.writer = l.newCommitLogWriterFn(&l.writerState.secondary.onFlush, l.opts)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I thought about this and didn't do it because i really hate cyclic datastructures as they can be really confusing. Probably simpler to do it in this case though, I'll make the change

var (
// Determine the persist.CommitLogFile for the not-yet-created secondary file so that the
// ActiveLogs() API returns the correct values even before the asynchronous reset completes.
primaryFile = l.writerState.activeFiles[1]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make slightly more sense to access the "last index" than just 1? i.e.

primaryFile = l.writerState.activeFiles[len(l.writerState.activeFiles)-1]

I know it's longer but seems more easy to read (strangely) since I know it's always getting the last entry.

Not too fussed though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer the existing one because there can only ever be two and the "last" logic feel more confusing to me because it implies there is may be more than two

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing.

for i, writer := range l.writerState.writers {
if writer == nil {
// Can be nil in the case where the background goroutine spawned in openWriters
// encountered an error trying to re-open it.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: If you really want to you could create a forEachWriters(func(writer) { ... }) helper function to make sure all for loops does this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that would be a little cleaner, but I'd rather be explicit and have it crash in places where we don't expect to ever be nil than have it silently do something weird.

@richardartoul richardartoul merged commit 120fe60 into master Apr 29, 2019
@richardartoul richardartoul deleted the ra/async-open-cl branch April 29, 2019 15:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants