Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Splitstore Enhanchements #6474

Merged
merged 197 commits into from
Jul 13, 2021
Merged
Show file tree
Hide file tree
Changes from 166 commits
Commits
Show all changes
197 commits
Select commit Hold shift + click to select a range
4d3c73f
noop blockstore
vyzo Jun 11, 2021
5cca29d
hook noop blockstore for splitstore in DI
vyzo Jun 11, 2021
04f2e10
kill full splitstore compaction, simplify splitstore configuration
vyzo Jun 14, 2021
e3cbeec
implement chain walking
vyzo Mar 13, 2021
d7ceef1
decrease CompactionThreshold to 3 finalities
vyzo Mar 14, 2021
3a9b7c5
mark from current epoch to boundary epoch when necessary
vyzo Mar 14, 2021
b2b7eb2
metrics: increment misses in View().
raulk Mar 16, 2021
e9f531b
don't open bolt tracking store with NoSync, it might get corrupted
vyzo Mar 19, 2021
7cf75e6
keep genesis-linked state hot
vyzo Mar 19, 2021
bdb97d6
more robust handling of sync gap walks
vyzo Jun 14, 2021
d33a44e
first visit the cid, then short-circuit non dagcbor objects
vyzo Mar 24, 2021
fda291b
fix test
vyzo Jun 14, 2021
fa64814
reduce SyncGapTime to 1 minute
vyzo Jun 14, 2021
41573f1
also walk parent message receipts when including messages in the walk
vyzo Jun 14, 2021
7c814cd
refactor genesis state loading code into its own method
vyzo Jun 16, 2021
997f2c0
keep headers hot when running with a noop splitstore
vyzo Jun 16, 2021
7b02673
don't try to visit genesis parent blocks
vyzo Jun 16, 2021
3fe4261
don't attempt compaction while still syncing
vyzo Jun 16, 2021
9b64485
refactor warmup to trigger at startup and not wait for sync
vyzo Jun 16, 2021
421f05e
save the warm up epoch only if successful in warming up
vyzo Jun 16, 2021
bb17608
track writeEpoch relative to current wall clock time
vyzo Jun 17, 2021
66f1630
fix lint issue
vyzo Jun 17, 2021
933c786
update write epoch in the background every second
vyzo Jun 17, 2021
b789759
augment current epoch by +1
vyzo Jun 17, 2021
c4d95de
coalesce back-to-back compactions
vyzo Jun 17, 2021
a178c1f
fix test
vyzo Jun 17, 2021
a25ac80
reintroduce compaction slack
vyzo Jun 17, 2021
79d2148
fix test
vyzo Jun 17, 2021
a21f559
CompactionThreshold should be 4 finalities
vyzo Jun 17, 2021
30dbe49
adjust compaction range
vyzo Jun 21, 2021
0390285
always do full walks, not only when there is a sync gap
vyzo Jun 21, 2021
fc247e4
add debug log skeleton
vyzo Jun 21, 2021
fce7b8d
flush move log when cold collection is done
vyzo Jun 22, 2021
a53c4e1
implement debug log
vyzo Jun 22, 2021
b187b5c
fix lint
vyzo Jun 22, 2021
375a179
reset counters after flush
vyzo Jun 22, 2021
50ebaf2
don't log read misses before warmup
vyzo Jun 22, 2021
649b7dd
add config option for hot headers
vyzo Jun 23, 2021
cb665d0
fix transactional race during compaction
vyzo Jun 25, 2021
65ccc99
minor tweaks in purge
vyzo Jun 25, 2021
6af3a23
use a map for txn protection mark set
vyzo Jun 25, 2021
31497f4
use internal get during walk to avoid blowing the compaction txn
vyzo Jun 27, 2021
4a71c68
move code around for better readability
vyzo Jun 27, 2021
9fda61a
fix error check for unreachable cids
vyzo Jun 28, 2021
40ff5bf
log put errors in splitstore log
vyzo Jun 28, 2021
7ebef6d
better log message
vyzo Jun 28, 2021
dec61fa
deduplicate stack logs and optionally trace write stacks
vyzo Jun 29, 2021
0b315e9
fix index out of range
vyzo Jun 29, 2021
b2b13bb
fix debug panic
vyzo Jun 29, 2021
57e25ae
use succint timetamp in debug logs
vyzo Jun 29, 2021
7307eb5
cache stack repr computation
vyzo Jun 29, 2021
4bed316
fix broken purge count log
vyzo Jun 30, 2021
e29b64c
check both markset and txn liveset before declaring an object cold
vyzo Jun 30, 2021
7de0771
count txn live objects explicitly for logging
vyzo Jul 1, 2021
09efed5
check for lookback references to block headers in walk
vyzo Jul 1, 2021
40f42db
walk tweaks
vyzo Jul 1, 2021
90dc274
better logging for chain walk
vyzo Jul 1, 2021
f97535d
store the hash in map markset
vyzo Jul 2, 2021
6a3cbea
treat Has as an implicit Write
vyzo Jul 2, 2021
e472cac
add missing return
vyzo Jul 2, 2021
be6cc2c
batch implicit write tracking
vyzo Jul 2, 2021
a29947d
flush implicit writes in all paths in updateWriteEpoch
vyzo Jul 2, 2021
7f473f5
flush implicit writes before starting compaction
vyzo Jul 2, 2021
d0bfe42
flush implicit writes at the right time before starting compaction to…
vyzo Jul 2, 2021
3e8e927
track all writes using async batching, not just implicit ones
vyzo Jul 2, 2021
aeaa59d
move comments about tracking perf issues into a more pertinent place
vyzo Jul 2, 2021
2faa4aa
debug log writes at track so that we get correct stack traces
vyzo Jul 2, 2021
b3ddaa5
fix panic at startup
vyzo Jul 2, 2021
4de0cd9
move write log back to flush so that we don't crawl to a halt
vyzo Jul 2, 2021
9828673
transitively track dags from implicit writes in Has
vyzo Jul 2, 2021
13a6743
add pending write check before tracking the object in Has
vyzo Jul 2, 2021
a98a062
do the dag walk for deep write tracking during flush
vyzo Jul 2, 2021
bd92c23
refactor txn reference tracking, do deep marking of DAGs
vyzo Jul 2, 2021
4071488
first write, then track
vyzo Jul 2, 2021
da00fc6
downgrade a couple of logs to warnings
vyzo Jul 2, 2021
1d41e15
optimize transitive write tracking a bit
vyzo Jul 2, 2021
484dfae
reused cidset across all walks when flushing pending writes
vyzo Jul 2, 2021
9d6bcd7
avoid clown shoes: only walk links for tracking in implicit writes/refs
vyzo Jul 2, 2021
637fbf6
fix faulty if/else logic for implicit txn protection
vyzo Jul 2, 2021
b87295d
bubble up dependent txn ref errors
vyzo Jul 2, 2021
68bc5d2
skip moving cold blocks when running with a noop coldstore
vyzo Jul 2, 2021
e4bb4be
fix some residual purge races
vyzo Jul 3, 2021
5834231
create the transactional protect filter before walking
vyzo Jul 3, 2021
39723bb
use a single map for tracking pending writes, properly track implicits
vyzo Jul 3, 2021
736d6a3
only treat Has as an implicit write within vm.Copy context
vyzo Jul 3, 2021
8157f88
short-circuit marking walks when encountering a block and more effici…
vyzo Jul 3, 2021
9d6cabd
if it's not a dag, it's not a block
vyzo Jul 3, 2021
228a435
rework tracking logic; do it lazily and far more efficiently
vyzo Jul 3, 2021
184d380
remove dead code
vyzo Jul 3, 2021
2b03316
fix log message
vyzo Jul 3, 2021
6f58fdc
remove vm copy context detection hack
vyzo Jul 3, 2021
d79e4da
more accurate stats about mark set updates
vyzo Jul 3, 2021
c5cf8e2
remove unnecessary code
vyzo Jul 3, 2021
642f0e4
deal with memory pressure, don't walk under the boundary
vyzo Jul 4, 2021
00fcf6d
add staging cache to bolt tracking store
vyzo Jul 4, 2021
68a8350
fix bug that turned candidate filtering to dead code
vyzo Jul 4, 2021
d476a3d
BlockstoreIterator trait with implementation for badger
vyzo Jul 4, 2021
1f2b604
RIP tracking store
vyzo Jul 4, 2021
5f7ae1f
update splistore DI constructor
vyzo Jul 4, 2021
6fa2cd2
simplify compaction model
vyzo Jul 4, 2021
36f9364
fix panic from concurrent map writes in txnRefs
vyzo Jul 4, 2021
eafffc1
more efficient trackTxnRefMany
vyzo Jul 4, 2021
08cad30
reuse key buffer in badger ForEachKey
vyzo Jul 4, 2021
0a1d7b3
fix log
vyzo Jul 4, 2021
19d1b1f
deal with partially written objects
vyzo Jul 4, 2021
190cb18
housekeeping
vyzo Jul 4, 2021
95c3aae
fix test
vyzo Jul 4, 2021
8e56fff
walkChain should visit the genesis state root
vyzo Jul 4, 2021
028a5c4
make test do something useful again
vyzo Jul 4, 2021
2c7a89a
short-circuit rescanning on block headers
vyzo Jul 4, 2021
1f02428
fix lint
vyzo Jul 4, 2021
680af8e
use deep object walking for more robust handling of transactional ref…
vyzo Jul 4, 2021
4d286da
fix error message
vyzo Jul 4, 2021
f124389
recursively protect all references
vyzo Jul 4, 2021
13d612f
smarter trackTxnRefMany
vyzo Jul 4, 2021
40c271c
sort cold objects before deleting
vyzo Jul 4, 2021
f33d4e7
simplify transactional protection logic
vyzo Jul 4, 2021
94efae4
reduce length of critical section
vyzo Jul 4, 2021
b08e0b7
fix lint
vyzo Jul 4, 2021
db53859
reduce CompactionThreshold to 5 finalities
vyzo Jul 4, 2021
1726eb9
deal with incomplete objects that need to be marked and protected
vyzo Jul 5, 2021
3597192
remove the sleeps and busy loop more times when waiting for missing o…
vyzo Jul 5, 2021
4c41f52
add warning for missing objects for marking for debug purposes
vyzo Jul 5, 2021
c81ae5f
add some comments about the missing business and anothre log
vyzo Jul 5, 2021
839f7bd
only occur check for DAGs
vyzo Jul 5, 2021
2ea2abc
short-circuit fil commitments
vyzo Jul 5, 2021
918a7ec
a bit more fil commitment short-circuiting
vyzo Jul 5, 2021
3ec834b
improve logs and error messages
vyzo Jul 5, 2021
d7709de
reduce memory pressure from marksets when the size is decreased
vyzo Jul 5, 2021
d8b8d75
readd minute delay before trying for missing objects
vyzo Jul 5, 2021
0b7153b
use internal version of has for occurs checks
vyzo Jul 5, 2021
59936ef
fix log
vyzo Jul 5, 2021
fa195be
get rid of ugly missing reference handling code
vyzo Jul 5, 2021
59639a0
reinstate some better code for handling missing references.
vyzo Jul 5, 2021
5a099b7
more commentary on the missing refs situation
vyzo Jul 5, 2021
af8cf71
handle all missing refs together
vyzo Jul 5, 2021
73d0799
dont needlessly wait 1 min in first retry for missing refs
vyzo Jul 5, 2021
3477d26
unify the two marksets
vyzo Jul 5, 2021
e859942
code cleanup: refactor txn state code into their own functions
vyzo Jul 5, 2021
bd436ab
make endTxnProtect idempotent
vyzo Jul 5, 2021
51ab891
quiet linter
vyzo Jul 5, 2021
2cbd3fa
make sure to nil everything in txnEndProtect
vyzo Jul 5, 2021
c6ad8fd
use walkObjectRaw for computing object weights
vyzo Jul 5, 2021
525a2c7
use hashes as keys in weight map to avoid duplicate work
vyzo Jul 5, 2021
0659235
cache cid strings in sort
vyzo Jul 6, 2021
bf7aeb3
optimize sort a tad
vyzo Jul 6, 2021
55a9e0c
short-circuit block headers on sort weight computation
vyzo Jul 6, 2021
169ab26
really optimize computing object weights
vyzo Jul 6, 2021
c4ae3e0
minor tweak
vyzo Jul 6, 2021
dc8139a
add some comments for debug only code
vyzo Jul 6, 2021
5c51450
remove unused GetGenesis method from ChainAccessor interface
vyzo Jul 6, 2021
fdff1be
move map markset implementation to its own file
vyzo Jul 6, 2021
c1c2586
improve comments
vyzo Jul 6, 2021
f2f4af6
clean up: simplify debug log, get rid of ugly debug log
vyzo Jul 6, 2021
0e2af11
prepare the transaction before launching the compaction goroutine
vyzo Jul 6, 2021
90da622
transactional protect incoming tipsets
vyzo Jul 6, 2021
05dbbe9
rename som Txn methods for better readability
vyzo Jul 7, 2021
6cc2112
remove the curTs state variable; we don't need it
vyzo Jul 7, 2021
83c30dc
protect assignment of warmup epoch with the mutex
vyzo Jul 7, 2021
9dbb2e0
don't leak tracking errors through the API
vyzo Jul 7, 2021
451ddf5
RIP bbolt-backed markset
vyzo Jul 7, 2021
ec586a8
remove bbolt dependency from go.mod
vyzo Jul 7, 2021
aec2ba2
nil map/bf on markset close
vyzo Jul 7, 2021
c6421f8
don't nil the mark sets on close, it's dangerous.
vyzo Jul 7, 2021
fee50b1
check the closing state on each batch during the purge.
vyzo Jul 7, 2021
4f80836
fix lint
vyzo Jul 7, 2021
f5c45bd
check the closing state variable often
vyzo Jul 8, 2021
48f13a4
intelligently close marksets and signal errors in concurrent operations
vyzo Jul 8, 2021
e6eacbd
use RW mutexes in marksets
vyzo Jul 8, 2021
9aa4f3b
add README for documentation
vyzo Jul 8, 2021
5cf1e09
README: add instructions for how to enable
vyzo Jul 8, 2021
00d7772
move check for closure in walkChain
vyzo Jul 8, 2021
fa30ac8
fix typo
vyzo Jul 8, 2021
c053784
fix typo
vyzo Jul 8, 2021
60dd97c
fix potential deadlock in View
vyzo Jul 8, 2021
b661112
add environment variables to turn on the debug log without recompiling
vyzo Jul 8, 2021
abdf4a1
explicitly switch marksets for concurrent marking
vyzo Jul 9, 2021
909f703
make badger Close-safe
vyzo Jul 9, 2021
de5e21b
correctly handle identity cids
vyzo Jul 9, 2021
4f89d26
kill isOldBlockHeader; it's dangerous.
vyzo Jul 9, 2021
565faff
fix test
vyzo Jul 9, 2021
acc4c37
properly handle protecting long-running views
vyzo Jul 9, 2021
da0feb3
dont mark references inline; instad rely on the main compaction threa…
vyzo Jul 9, 2021
095d742
make view protection optimistic again, as there is a race window
vyzo Jul 9, 2021
18161fe
remove unused lookback constructs
vyzo Jul 9, 2021
c0a1cff
rename noopstore to discardstore
vyzo Jul 9, 2021
b9a5ea8
update wording around discard store
vyzo Jul 9, 2021
4129038
fix test
vyzo Jul 9, 2021
f5ae10e
refactor debug log code to eliminate duplication
vyzo Jul 9, 2021
870a47f
handle id cids in internal versions of view/get
vyzo Jul 9, 2021
0c5e336
address review comments
vyzo Jul 10, 2021
df9670c
fix lint
vyzo Jul 10, 2021
759594d
always return the waitgroup in protectView
vyzo Jul 13, 2021
60212c8
put a mutex around HeadChange
vyzo Jul 13, 2021
04abd19
nit: remove useless goto
Stebalien Jul 13, 2021
257423e
fix view waiting issues with the WaitGroup
vyzo Jul 13, 2021
af39952
finetune view waiting
vyzo Jul 13, 2021
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
50 changes: 50 additions & 0 deletions blockstore/badger/blockstore.go
Original file line number Diff line number Diff line change
Expand Up @@ -97,6 +97,7 @@ type Blockstore struct {

var _ blockstore.Blockstore = (*Blockstore)(nil)
var _ blockstore.Viewer = (*Blockstore)(nil)
var _ blockstore.BlockstoreIterator = (*Blockstore)(nil)
var _ io.Closer = (*Blockstore)(nil)

// Open creates a new badger-backed blockstore, with the supplied options.
Expand Down Expand Up @@ -442,6 +443,55 @@ func (b *Blockstore) AllKeysChan(ctx context.Context) (<-chan cid.Cid, error) {
return ch, nil
}

// Implementation of BlockstoreIterator interface
func (b *Blockstore) ForEachKey(f func(cid.Cid) error) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note to self: it's OK to have end-to-end transactionality here because Badger implements concurrent transactions through MVCC and does not stop the world while this iterator is open AFAIK. (even if it were STW, it was OK in AllKeysChan because that method was never used in a production codepath; however, ForEachKey is used in a production codepath, so it wouldn't be).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it doesn't seem to block anything from happening.

if atomic.LoadInt64(&b.state) != stateOpen {
vyzo marked this conversation as resolved.
Show resolved Hide resolved
return ErrBlockstoreClosed
}

txn := b.DB.NewTransaction(false)
defer txn.Discard()

opts := badger.IteratorOptions{PrefetchSize: 100}
if b.prefixing {
opts.Prefix = b.prefix
}

iter := txn.NewIterator(opts)
defer iter.Close()

var buf []byte
for iter.Rewind(); iter.Valid(); iter.Next() {
if atomic.LoadInt64(&b.state) != stateOpen {
return ErrBlockstoreClosed
}

k := iter.Item().Key()
if b.prefixing {
k = k[b.prefixLen:]
}

klen := base32.RawStdEncoding.DecodedLen(len(k))
if klen > len(buf) {
buf = make([]byte, klen)
}

n, err := base32.RawStdEncoding.Decode(buf, k)
if err != nil {
return err
}

c := cid.NewCidV1(cid.Raw, buf[:n])

err = f(c)
if err != nil {
return err
}
}

return nil
}

// HashOnRead implements Blockstore.HashOnRead. It is not supported by this
// blockstore.
func (b *Blockstore) HashOnRead(_ bool) {
Expand Down
5 changes: 5 additions & 0 deletions blockstore/blockstore.go
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,11 @@ type BatchDeleter interface {
DeleteMany(cids []cid.Cid) error
}

// BlockstoreIterator is a trait for efficient iteration
type BlockstoreIterator interface {
ForEachKey(func(cid.Cid) error) error
vyzo marked this conversation as resolved.
Show resolved Hide resolved
}

// WrapIDStore wraps the underlying blockstore in an "identity" blockstore.
// The ID store filters out all puts for blocks with CIDs using the "identity"
// hash function. It also extracts inlined blocks from CIDs using the identity
Expand Down
66 changes: 66 additions & 0 deletions blockstore/noop.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
package blockstore

import (
"context"
"io"

blocks "github.com/ipfs/go-block-format"
cid "github.com/ipfs/go-cid"
)

var _ Blockstore = (*noopstore)(nil)

type noopstore struct {
vyzo marked this conversation as resolved.
Show resolved Hide resolved
bs Blockstore
}

func NewNoopStore(bs Blockstore) Blockstore {
vyzo marked this conversation as resolved.
Show resolved Hide resolved
return &noopstore{bs: bs}
}

func (b *noopstore) Has(cid cid.Cid) (bool, error) {
return b.bs.Has(cid)
}

func (b *noopstore) HashOnRead(hor bool) {
b.bs.HashOnRead(hor)
}

func (b *noopstore) Get(cid cid.Cid) (blocks.Block, error) {
return b.bs.Get(cid)
}

func (b *noopstore) GetSize(cid cid.Cid) (int, error) {
return b.bs.GetSize(cid)
}

func (b *noopstore) View(cid cid.Cid, f func([]byte) error) error {
return b.bs.View(cid, f)
}

func (b *noopstore) Put(blk blocks.Block) error {
return nil
}

func (b *noopstore) PutMany(blks []blocks.Block) error {
return nil
}

func (b *noopstore) DeleteBlock(cid cid.Cid) error {
return nil
}

func (b *noopstore) DeleteMany(cids []cid.Cid) error {
return nil
}

func (b *noopstore) AllKeysChan(ctx context.Context) (<-chan cid.Cid, error) {
return b.bs.AllKeysChan(ctx)
}

func (b *noopstore) Close() error {
if c, ok := b.bs.(io.Closer); ok {
return c.Close()
}
return nil
}
Loading