fix: don't lock in-memory shares storage for database ops #1778
base: stage
Conversation
force-pushed from 47b9cfb to dd28280
@moshe-blox I'm looking at ways to move the interaction with the database into a separate goroutine.
One pattern I've seen for this is a single background writer goroutine (a sketch follows below).
It does complicate things, but it doesn't really change the underlying assumption this PR introduces, which is that consistency between in-memory data and DB data becomes eventual (meaning, you can call …)
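A minimal sketch of that pattern, assuming nothing about the PR's actual code (Share, saveToDB, and asyncShareWriter are all hypothetical stand-ins for the repo's types and its Badger write):

package shares

import "log"

// Share is a hypothetical stand-in for the repo's share type.
type Share struct{ PubKey string }

// saveToDB is a hypothetical stand-in for the slow Badger write.
func saveToDB(shares []*Share) error { return nil }

// asyncShareWriter hands DB writes off to a single background
// goroutine, so callers only ever pay for the in-memory update.
type asyncShareWriter struct {
	pending chan []*Share
}

func newAsyncShareWriter() *asyncShareWriter {
	w := &asyncShareWriter{pending: make(chan []*Share, 1024)}
	go w.run()
	return w
}

func (w *asyncShareWriter) run() {
	for shares := range w.pending {
		// The caller already updated the in-memory map, so a failure
		// here can only be logged or retried: DB consistency becomes
		// eventual, which is the tradeoff discussed in this thread.
		if err := saveToDB(shares); err != nil {
			log.Printf("async share save failed: %v", err)
		}
	}
}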
@anatolie-ssv oh that's right, during event sync we can't ignore DB errors, I forgot!

@iurii-ssv I think going for a pure solution might be too complex: what does it mean if a database save fails? Should we undo the changes we've made to the map?

Maybe we should instead target the specific issue we've witnessed causing a slowdown in the real world: updating thousands of share metadata entries completely locks the storage for many seconds at a time. I'm not aware of other points of pressure around shares than that (so far!)

WDYT?
Yeah, that's what I was getting at. If the performance of the fully blocking implementation is sufficient, there is no need to go for the harder solution.
Yup, the simpler the better. I was just describing a potential solution (with its tradeoffs) in case we need to go down this path sometime in the future, because @anatolie-ssv mentioned he might want to consider it.
@iurii-ssv I've been discussing with @anatolie-ssv, and indeed we're throwing away my original proposal and instead expanding on his initial implementation with a few improvements (such as promoting the DB mutex to being a parent lock of the in-memory mutex), which he's still playing around with to confirm it makes sense. Hopefully @anatolie-ssv can fill you in on the details or present it once it's ready :)
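For context, a rough sketch of what "promoting the DB mutex to a parent lock" could look like (a hypothetical shape, not the final code; persist stands in for the Badger write): dbmu is always acquired before mu and never the other way around, so the two locks cannot deadlock, and the map is only updated after the DB write succeeds, which sidesteps the undo question raised above.

package shares

import "sync"

type Share struct{ PubKey string }

// persist is a hypothetical stand-in for the Badger write.
func persist(shares []*Share) error { return nil }

type sharesStorage struct {
	dbmu   sync.Mutex   // parent lock: guards the DB, acquired first
	mu     sync.RWMutex // child lock: guards the in-memory map
	shares map[string]*Share
}

func (s *sharesStorage) Save(shares ...*Share) error {
	// Lock ordering is always dbmu -> mu, never the reverse.
	s.dbmu.Lock()
	defer s.dbmu.Unlock()

	// DB first: if the save fails, the map was never touched,
	// so there is nothing to undo.
	if err := persist(shares); err != nil {
		return err
	}

	// Readers could use the map during the slow write above;
	// they are blocked only for this brief update.
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, sh := range shares {
		s.shares[sh.PubKey] = sh
	}
	return nil
}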
force-pushed from 5ad0a20 to 688454c
force-pushed from 161c1f3 to ef25bf0
LGTM, but it seems we lock dbmu for longer than necessary (I assume its only goal is to manage access to sharesStorage.db).
registry/storage/shares.go
Outdated
dbmu sync.Mutex // parent lock for in-memory mutex
mu   sync.RWMutex
Perhaps more detailed descriptions wouldn't hurt (because it's not quite clear what "parent" really means here):
// dbMtx is used to synchronize access to db
dbMtx sync.Mutex
// sMtx is used to synchronize access to shares
sMtx sync.RWMutex
I'd suggest renaming these to something like storageMtx and memoryMtx, because that explains the fundamental difference between the two.
force-pushed from d53f013 to 3257ec3
if err := s.validatorStore.handleSharesUpdated(updateShares...); err != nil {
	return err
}

if err := s.validatorStore.handleSharesAdded(addShares...); err != nil {
	return err
}
Would be nice to wrap the errors here (they come from different code branches, which might be hard to debug without wrapping):

if err := s.validatorStore.handleSharesUpdated(updateShares...); err != nil {
	return fmt.Errorf("handleSharesUpdated: %w", err)
}
if err := s.validatorStore.handleSharesAdded(addShares...); err != nil {
	return fmt.Errorf("handleSharesAdded: %w", err)
}
@@ -210,48 +210,57 @@ func (s *sharesStorage) Save(rw basedb.ReadWriter, shares ...*types.SSVShare) error {
	}
}

	s.mu.Lock()
	defer s.mu.Unlock()
	s.logger.Debug("save validators to shares storage", zap.Int("count", len(shares)))
nit: "save validators to shares storage" -> "save validator shares to shares storage"
force-pushed from 3257ec3 to 2be0c5a
The approach taken here is to minimize the time spent with a locked shares map. By separately locking the Badger instance, we allow access to the in-memory shares while the "heavy lifting" disk access is being executed.
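As a simplified illustration of that point (hypothetical names, not the PR's exact code): reads take only the in-memory RWMutex, so lookups keep working even while a multi-second Badger write holds the DB mutex.

package shares

import "sync"

type Share struct{ PubKey string }

type sharesStorage struct {
	dbmu   sync.Mutex   // guards the Badger instance
	mu     sync.RWMutex // guards the in-memory map
	shares map[string]*Share
}

// Get never touches dbmu: a reader is blocked only by the brief
// map updates, never by disk I/O happening under dbmu.
func (s *sharesStorage) Get(pubKey string) (*Share, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	sh, ok := s.shares[pubKey]
	return sh, ok
}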