This repository has been archived by the owner on Nov 6, 2020. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Better logging when backfilling ancient blocks fail #10796
Merged
dvdplm
merged 22 commits into
master
from
dp/chore/debug-synching-UnlinkedAncientBlockChain
Jul 1, 2019
Merged
Changes from all commits
Commits
Show all changes
22 commits
Select commit
Hold shift + click to select a range
088cac8
Better logging when backfilling ancient blocks fail
dvdplm 3c13109
`finalize()` doesn't need Engine
dvdplm 020acf1
More logs
dvdplm 690b805
Clarify that the percentage may be misleading
dvdplm c516d1d
Remove replace_client_db() and replace with a straight call to restor…
dvdplm 9f6d5d4
Include the parent_hash in UnlinkedAncientBlockChain errors
dvdplm 5e2bfb9
Add a new RestorationStatus varian: Finalizing (as it can take a looo…
dvdplm 62c65aa
Add missing cases for new variant
dvdplm 947d1e4
typos
dvdplm d9ab3b9
Typo and derive Debug
dvdplm 72c58b5
Do not attempt to salvage existing blocks unless they form a complete…
dvdplm f027d4b
Fix test
dvdplm 2f5a1c6
Revert "Fix test"
dvdplm e019c8e
Fix test again
dvdplm 6bf1ad3
Update comment
dvdplm 25233d1
Merge branch 'master' into dp/chore/debug-synching-UnlinkedAncientBlo…
dvdplm 12fd856
Be careful about locks
dvdplm f577c8f
fix test failure
dvdplm 658f7ef
Merge branch 'master' into dp/chore/debug-synching-UnlinkedAncientBlo…
dvdplm 323edb7
Do not defer returning an error when the chain is broken
dvdplm fc220b0
Review feedback
dvdplm 9e17d61
no hex formatting for Option
dvdplm File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should actually always deadlock - because we already have
self.restoration.lock()
acquired in line 720. parking_lot locks are not re-entrant.Instead of calling
abort_restore()
we should rather use*restoration = None
here. and maybe add a trace line if you care about it.Alternatively we could have
abort_restore
take the lockedrestoration
. IMHO if we do that for some methods already we should do that for all of them.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, for some reason we do not. It's pretty rare, I've seen it twice.
That's what
abort_restore()
does though: here. Why is it better to do it here?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid locking twice?
As I said the second ption should be
abort_restore_with_restoration(locked_res: &mut Option<Restoration>)
so we can keep the logic there, but just pass the existing lock.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the reason I removed
*restoration = None
from here and replaced it with a call toabort_restore()
was to collect all shutdown actions in one place. Not thatabort_restore()
is doing very much atm, but I still think it's good to have a single point of abortion.is locking twice really a concern though: this is an error handler and I don't think we care much about performance?
The
restoration
is a member of the struct here, so I'm not sure what the point of passing it as a param tbh. Now you got me confused!There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since
parking_lot::Mutex
is not re-entrant locking it twice in the same thread can (or does) lead to a deadlock. So the call like that:is guaranteed to deadlock, I thought always, but I suspect it might be just random.
In the same file I suppose we already had deadlock issues, so the
with_restoration
pattern emerged, where instead of locking resources locally in a particular function you get passed a (mutable) reference to the locked resource:I propose to use exactly the same patter for
abort_restore
- I'm totally fine grouping all the shutdown actions there, but to prevent potential double-locking and potential deadlocks we can pass the locked resource, so the first example becomes:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thank you for explaining.
Is this an option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to docs:
So I'm not sure either why it didn't always deadlock. The proposed fix looks good.