Reduce splitstore memory usage during chain walks #6949
Conversation
Opening as draft until I have tested timing/memory effects in my nodes.
Warmup with the on-disk visitor is some 2x slower; it takes about an hour now.
Marking is somewhat slower as well; but on the bright side, there was no watchdog action whatsoever (which used to be a usual occurrence).
Marked as ready for review, as the node burn was successful.
Unified the markset/visitor dichotomy, which opens up interesting possibilities: we can now check+mark atomically. Follow up: parallelize.
Force-pushed from 2aa060a to 6f59511
Rebased on master.
The general logic looks good, but I feel like we can simplify a bit.
return err
})

switch err {
We could always copy the version, then unlock, then do the rest of this function to simplify the locking a bit.
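For illustration, the suggested shape would be something like the following sketch; the type, field, and function names here are invented, not the splitstore's actual code:

```go
package example

import "sync"

// Sketch only: copy the data we need while holding the lock, release it,
// then do the slower work on the local copy.
type store struct {
	mx      sync.Mutex
	version int
}

func (s *store) doWork() error {
	s.mx.Lock()
	version := s.version // copy what we need under the lock
	s.mx.Unlock()

	// the rest of the function operates on the local copy, lock-free
	return process(version)
}

func process(version int) error { return nil } // placeholder
```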
we have to hold the lock while doing the fast checks on the maps.
pend := s.pend
seqno := s.seqno
s.seqno++
s.writing[seqno] = pend
So, with a background worker, we can get rid of this and just have a single "writing" and "pending" set. When the background worker is done flushing the last set, it would check to see if it should flush the next one (e.g., use a channel as a flag).
That should help simplify things a bit.
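A rough sketch of what that could look like, assuming a single pending set and a buffered channel used as a "there is work" flag; all names below are invented for illustration and the flush function is a placeholder:

```go
package example

import (
	"sync"

	cid "github.com/ipfs/go-cid"
)

// Hypothetical sketch of the single-pending-set + background-flusher idea.
type flusher struct {
	mx      sync.Mutex
	pending *cid.Set             // writes accumulate here
	kick    chan struct{}        // buffered(1), signals the worker
	flush   func(*cid.Set) error // whatever actually persists a set
}

func newFlusher(flush func(*cid.Set) error) *flusher {
	f := &flusher{
		pending: cid.NewSet(),
		kick:    make(chan struct{}, 1),
		flush:   flush,
	}
	go f.run()
	return f
}

func (f *flusher) put(c cid.Cid) {
	f.mx.Lock()
	f.pending.Add(c)
	f.mx.Unlock()

	// non-blocking signal; if a flush is already queued this is a no-op
	select {
	case f.kick <- struct{}{}:
	default:
	}
}

func (f *flusher) run() {
	for range f.kick {
		f.mx.Lock()
		writing := f.pending
		f.pending = cid.NewSet()
		f.mx.Unlock()

		_ = f.flush(writing) // error handling elided in this sketch
	}
}
```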
The disadvantage of the background worker is that we can do only one write at a time -- the current approach can do multiple concurrent writes, which is strictly better, I think.
It also loses the timely error return property.
I guess we could have a worker that spawns goroutines to do writes, and somehow bubble up the error to be checked in the next operation, but this gets more complicated than I'd like it to be.
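Purely as an illustration of that alternative (not code from this PR), bubbling the error up to be checked on the next operation could look like this; the persist function and all names are placeholders:

```go
package example

import (
	"sync"

	cid "github.com/ipfs/go-cid"
)

// Sketch: spawn a goroutine per write and remember the first error so
// the next operation can observe it.
type asyncWriter struct {
	mx       sync.Mutex
	writeErr error
}

func (w *asyncWriter) write(set *cid.Set) {
	go func() {
		if err := persist(set); err != nil {
			w.mx.Lock()
			if w.writeErr == nil {
				w.writeErr = err
			}
			w.mx.Unlock()
		}
	}()
}

// checkErr is called at the start of the next operation; it reports (and
// clears) any error from an earlier asynchronous write.
func (w *asyncWriter) checkErr() error {
	w.mx.Lock()
	defer w.mx.Unlock()
	err := w.writeErr
	w.writeErr = nil
	return err
}

func persist(set *cid.Set) error { return nil } // placeholder
```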
I think badger will handle write parallelism under the covers.
Hrm, not sure -- I prefer to be explicitly parallel and also to keep the ability to return an error from some writer so that we can abort.
At the same time, I don't feel terribly strongly about this, and we could use a background worker that spawns writers and bubbles up the error in some way.
Did a number of simplifications:
Pushed the forgotten commit.
I don't feel too strongly either. I'll take a look at the new code.
So, most of this patch looks fine. But the locking is still a bit trickier than I'd like. I can live with everything except taking the lock in one function and dropping it in another. That's really the only blocking change.
Two non-blocking nits.
the walk is BFS, so we can do this!
Also avoid removing the writing set if there was an error while writing.
Force-pushed from cbe19e0 to eb0a62e
Rebased on master for merge, as it was a bit behind.
Walking the chain was using a cid.Set, which can grow quite big (30M+ objects, maybe more), leading to significant transient memory usage.

This adds an ObjectVisitor interface to use instead of a blanket cid.Set. This allows us to use a noop visitor during compaction (as we already have the markset to lean on, which will be on-disk with MarkSetType = "badger") and markset-backed visitors during warmup/check.

We also reset the walked block set on every iteration, to avoid building up a big one with all the blocks; we can do that because the walk is BFS.
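For context, here is a minimal sketch of what such a visitor interface and its in-memory and noop implementations could look like; the actual names and signatures in the splitstore may differ:

```go
package example

import (
	cid "github.com/ipfs/go-cid"
)

// ObjectVisitor decides whether a block should be visited during a chain
// walk; this sketch assumes Visit returns true the first time a cid is seen.
type ObjectVisitor interface {
	Visit(c cid.Cid) (bool, error)
}

// tmpVisitor tracks seen cids in memory; it can be reset (recreated) per
// walk iteration so the set never grows to the size of the whole chain.
type tmpVisitor struct {
	set *cid.Set
}

func newTmpVisitor() *tmpVisitor {
	return &tmpVisitor{set: cid.NewSet()}
}

func (v *tmpVisitor) Visit(c cid.Cid) (bool, error) {
	// cid.Set.Visit adds the cid and reports whether it was newly added
	return v.set.Visit(c), nil
}

// noopVisitor always says "visit"; useful when another structure (e.g. an
// on-disk markset) already deduplicates the work.
type noopVisitor struct{}

func (noopVisitor) Visit(c cid.Cid) (bool, error) {
	return true, nil
}
```

With this shape, warmup and check walks can use a markset-backed (or in-memory) visitor, while compaction can pass the noop visitor and lean on the markset itself for deduplication.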