dkg: sync privkeylock before exiting #2257
… deletion
Add a `Done()` method that blocks until `Run()` returns. `Run()` is meant to be executed in a goroutine, and since we can't control the scheduler, that goroutine might be scheduled *after* the `main` one exits, in which case the `ctx.Done()` code path is never executed. Callers must cancel the `Run()` context, then call `Done()` to block until `Run()` finishes.
Also modifies the DKG run test to check that all privkey locks are deleted.
Codecov Report
Patch coverage:
Additional details and impacted files
@@ Coverage Diff @@
## main #2257 +/- ##
==========================================
+ Coverage 53.83% 53.86% +0.03%
==========================================
Files 188 188
Lines 25513 25524 +11
==========================================
+ Hits 13734 13749 +15
Misses 10092 10092
+ Partials 1687 1683 -4
Co-authored-by: Abhishek Kumar <[email protected]>
@gsora hear me out on this:
// Service is a private key locking service.
type Service struct {
...
quit chan struct{} // New
}
func (s *Service) Done() {
close(s.quit)
}
case <-ctx.Done():
cleanup = true
case <-h.quit:
cleanup = true
It essentially does the same thing as the waitgroup but the […] Wdyt?
That is not correct: since the deletion happens in a different goroutine, you cannot be sure that it will be scheduled. I just implemented your solution as a test, and while it does work more often than just relying on […] The idea here is having the […]
I did implement a channel-based solution in a call with @dB2510: it worked fine. I fell back on […] A thing we could do to make it more obvious is spawning the background goroutine inside […]
Actually, thinking out loud, maybe this fix shouldn't be in […] Implementing a different solution as we speak.
…dn't have dkg has the concurrency problem, so make it responsible for the fix as well.
Co-authored-by: Abhishek Kumar <[email protected]>
LGTM! I think this is a simpler approach 👍
}()

// Stop it on exit.
defer lockSvc.Close()
nit: move this line before the `go func`, after we initialise `lockSvc`.
go func(ctx context.Context) {
	if err := lockSvc.Run(ctx); err != nil {
		log.Error(ctx, "Error locking private key file", err)
{
@corverroos why do we need a separate block for this?
	}
}(ctx)

// Start it async
Suggested change:
- // Start it async
+ // Start private key lock service async.
Block `dkg` exit until the goroutine enclosing `privkeylock.Run()` exits, ensuring that the private key lock file is always deleted.
`Run()` is supposed to be executed in a goroutine, and since we can't control the scheduler it might be scheduled after the `main` one exits, so the `ctx.Done()` code path is sometimes never executed.
category: bug
ticket: #2258