Reduce splitstore memory usage during chain walks #6949
Conversation
Opening as draft until I have tested timing/memory effects in my nodes.
Warmup with the on-disk visitor is some 2x slower; it takes about an hour now.
Marking is somewhat slower as well; but on the bright side, there was no watchdog action whatsoever (which used to be a usual occurrence).
Marked as ready for review, as the node burn was successful.
Unified the markset/visitor dichotomy, which opens up interesting possibilities: we can now check+mark atomically. Follow up: parallelize.
Force-pushed from 2aa060a to 6f59511
Rebased on master.
The general logic looks good, but I feel like we can simplify a bit.
return err
})

switch err {
We could always copy the version, then unlock, then do the rest of this function to simplify the locking a bit.
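For illustration, the suggested shape would be something like the following sketch; the type, field, and function names here are invented, not the splitstore's actual code:

```go
package example

import "sync"

// Sketch only: copy the data we need while holding the lock, release it,
// then do the slower work on the local copy.
type store struct {
	mx      sync.Mutex
	version int
}

func (s *store) doWork() error {
	s.mx.Lock()
	version := s.version // copy what we need under the lock
	s.mx.Unlock()

	// the rest of the function operates on the local copy, lock-free
	return process(version)
}

func process(version int) error { return nil } // placeholder
```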
we have to hold the lock while doing the fast checks on the maps.
pend := s.pend
seqno := s.seqno
s.seqno++
s.writing[seqno] = pend
So, with a background worker, we can get rid of this and just have a single "writing" and "pending" set. When the background worker is done flushing the last set, it would check to see if it should flush the next one (e.g., use a channel as a flag).
That should help simplify things a bit.
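A rough sketch of what that could look like, assuming a single pending set and a buffered channel used as a "there is work" flag; all names below are invented for illustration and the flush function is a placeholder:

```go
package example

import (
	"sync"

	cid "github.com/ipfs/go-cid"
)

// Hypothetical sketch of the single-pending-set + background-flusher idea.
type flusher struct {
	mx      sync.Mutex
	pending *cid.Set             // writes accumulate here
	kick    chan struct{}        // buffered(1), signals the worker
	flush   func(*cid.Set) error // whatever actually persists a set
}

func newFlusher(flush func(*cid.Set) error) *flusher {
	f := &flusher{
		pending: cid.NewSet(),
		kick:    make(chan struct{}, 1),
		flush:   flush,
	}
	go f.run()
	return f
}

func (f *flusher) put(c cid.Cid) {
	f.mx.Lock()
	f.pending.Add(c)
	f.mx.Unlock()

	// non-blocking signal; if a flush is already queued this is a no-op
	select {
	case f.kick <- struct{}{}:
	default:
	}
}

func (f *flusher) run() {
	for range f.kick {
		f.mx.Lock()
		writing := f.pending
		f.pending = cid.NewSet()
		f.mx.Unlock()

		_ = f.flush(writing) // error handling elided in this sketch
	}
}
```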
The disadvantage of the background worker is that we can do only one write at a time -- the current approach can do multiple concurrent writes, which is strictly better, I think.
It also loses the timely error return property.
I guess we could have a worker that spawns goroutines to do writes, and somehow bubble up the error to be checked in the next operation, but this gets more complicated than I'd like it to be.
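Purely as an illustration of that alternative (not code from this PR), bubbling the error up to be checked on the next operation could look like this; the persist function and all names are placeholders:

```go
package example

import (
	"sync"

	cid "github.com/ipfs/go-cid"
)

// Sketch: spawn a goroutine per write and remember the first error so
// the next operation can observe it.
type asyncWriter struct {
	mx       sync.Mutex
	writeErr error
}

func (w *asyncWriter) write(set *cid.Set) {
	go func() {
		if err := persist(set); err != nil {
			w.mx.Lock()
			if w.writeErr == nil {
				w.writeErr = err
			}
			w.mx.Unlock()
		}
	}()
}

// checkErr is called at the start of the next operation; it reports (and
// clears) any error from an earlier asynchronous write.
func (w *asyncWriter) checkErr() error {
	w.mx.Lock()
	defer w.mx.Unlock()
	err := w.writeErr
	w.writeErr = nil
	return err
}

func persist(set *cid.Set) error { return nil } // placeholder
```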
I think badger will handle write parallelism under the covers.
Hrm, not sure -- I prefer to be explicitly parallel and also to keep the ability to return an error from some writer so that we can abort.
At the same time, I don't feel terribly strongly about this, and we could use a background worker that spawns writers and bubbles up the error in some way.
Did a number of simplifications:
Pushed the forgotten commit.
I don't feel too strongly either. I'll take a look at the new code.
So, most of this patch looks fine. But the locking is still a bit trickier than I'd like. I can live with everything except taking the lock in one function and dropping it in another. That's really the only blocking change.
Two non-blocking nits.
the walk is BFS, so we can do this!
Also avoid removing the writing set if there was an error while writing.
Force-pushed from cbe19e0 to eb0a62e
Rebased on master for merge, as it was a bit behind.
Walking the chain was using a cid.Set, which can grow quite big (30M+ objects, maybe more), leading to significant transient memory usage.

This adds an ObjectVisitor interface to use instead of a blanket cid.Set. This allows us to use a noop visitor during compaction (as we already have the markset to lean on, which will be on-disk with MarkSetType = "badger") and markset-backed visitors during warmup/check.

We also reset the walked block set on every iteration, to avoid building up a big one with all the blocks; we can do that because the walk is BFS.
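For context, here is a minimal sketch of what such a visitor interface and its in-memory and noop implementations could look like; the actual names and signatures in the splitstore may differ:

```go
package example

import (
	cid "github.com/ipfs/go-cid"
)

// ObjectVisitor decides whether a block should be visited during a chain
// walk; this sketch assumes Visit returns true the first time a cid is seen.
type ObjectVisitor interface {
	Visit(c cid.Cid) (bool, error)
}

// tmpVisitor tracks seen cids in memory; it can be reset (recreated) per
// walk iteration so the set never grows to the size of the whole chain.
type tmpVisitor struct {
	set *cid.Set
}

func newTmpVisitor() *tmpVisitor {
	return &tmpVisitor{set: cid.NewSet()}
}

func (v *tmpVisitor) Visit(c cid.Cid) (bool, error) {
	// cid.Set.Visit adds the cid and reports whether it was newly added
	return v.set.Visit(c), nil
}

// noopVisitor always says "visit"; useful when another structure (e.g. an
// on-disk markset) already deduplicates the work.
type noopVisitor struct{}

func (noopVisitor) Visit(c cid.Cid) (bool, error) {
	return true, nil
}
```

With this shape, warmup and check walks can use a markset-backed (or in-memory) visitor, while compaction can pass the noop visitor and lean on the markset itself for deduplication.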