ChainDB: add blocks asynchronously #1709
Closed
See `lengthTBQueueDefault` and `MonadSTMTxExtended` for more information.
mrBliss
commented
Feb 27, 2020
ouroboros-consensus/src/Ouroboros/Consensus/Storage/ChainDB/API.hs
ouroboros-consensus/src/Ouroboros/Consensus/Storage/ChainDB/Impl/Types.hs
mrBliss force-pushed the mrBliss/chaindb-async-addblock branch from 8f9e6d7 to 6996372 (February 27, 2020 13:17)
mrBliss
commented
Feb 27, 2020
ouroboros-consensus/src/Ouroboros/Consensus/Storage/ChainDB/Impl/ChainSel.hs
ouroboros-consensus/src/Ouroboros/Consensus/Storage/ChainDB/Impl/Args.hs
Fixes #1463.

Instead of adding blocks synchronously, callers now put them into a queue, after which `addBlockAsync` returns an `AddBlockPromise` that can be used to wait until the block has been processed. A background thread reads the blocks from the queue and adds them synchronously to the ChainDB. The queue is limited in size; when it is full, callers of `addBlockAsync` might still have to wait.

With this asynchronous approach, threads adding blocks asynchronously can be killed without worries: the background thread processing the blocks synchronously won't be killed. Only when the whole ChainDB shuts down will that background thread get killed. But since there will be no more in-memory state at that point, it can't get out of sync with the file system state. On the next startup, a correct in-memory state will be reconstructed from the file system state.

By letting the BlockFetchClient add blocks asynchronously, we also get a 20-40% bulk chain sync speed-up in some microbenchmarks.
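To make the queue-and-promise pattern concrete, here is a minimal sketch using a bounded `TBQueue` from the `stm` package. It is not the ChainDB implementation: the types `Queue` and `BlockToAdd`, the functions `newQueue` and `startAddBlockRunner`, and the shape of `AddBlockPromise` (a single STM action returning a `Bool`) are all assumptions made for illustration; only the names `addBlockAsync` and `AddBlockPromise` come from the description above.

```haskell
import Control.Concurrent (ThreadId, forkIO)
import Control.Concurrent.STM
import Control.Monad (forever)
import Numeric.Natural (Natural)

-- Hypothetical stand-in for the real promise type: an STM action that
-- blocks until the background thread has processed the block.
newtype AddBlockPromise = AddBlockPromise
  { blockProcessed :: STM Bool }

-- A queued block together with the variable used to signal completion.
data BlockToAdd blk = BlockToAdd
  { blockToAdd   :: blk
  , varProcessed :: TMVar Bool
  }

-- Bounded queue of blocks waiting to be added (hypothetical wrapper).
newtype Queue blk = Queue (TBQueue (BlockToAdd blk))

newQueue :: Natural -> IO (Queue blk)
newQueue size = Queue <$> newTBQueueIO size

-- Enqueue a block and return immediately with a promise. The write
-- retries while the queue is full, so callers may still have to wait.
addBlockAsync :: Queue blk -> blk -> IO AddBlockPromise
addBlockAsync (Queue q) blk = do
  var <- newEmptyTMVarIO
  atomically $ writeTBQueue q (BlockToAdd blk var)
  pure (AddBlockPromise (readTMVar var))

-- Background thread: take blocks off the queue one at a time, add each
-- synchronously with the supplied action, then fulfil its promise.
startAddBlockRunner :: Queue blk -> (blk -> IO Bool) -> IO ThreadId
startAddBlockRunner (Queue q) addSync = forkIO . forever $ do
  BlockToAdd blk var <- atomically (readTBQueue q)
  ok <- addSync blk
  atomically (putTMVar var ok)
```

A caller that needs the result runs `atomically (blockProcessed promise)`; a caller that does not can drop the promise. Killing the calling thread does not interrupt the runner, which is the property the description relies on.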
edsko
approved these changes
Feb 27, 2020
Reviewed over hangout (made comments off-line as GitHub was having difficulties).
bors r+
iohk-bors bot
added a commit
that referenced
this pull request
Feb 27, 2020
1709: ChainDB: add blocks asynchronously r=edsko a=mrBliss

Fixes #1463.

Instead of adding blocks synchronously, they are now put into a queue, after which `addBlockAsync` returns an `AddBlockResult`, which can be used to wait until the block has been processed. A background thread will read the blocks from the queue and add them synchronously to the ChainDB. The queue is limited in size; when it is full, callers of `addBlockAsync` might still have to wait.

With this asynchronous approach, threads adding blocks asynchronously can be killed without worries: the background thread processing the blocks synchronously won't be killed. Only when the whole ChainDB shuts down will that background thread get killed. But since there will be no more in-memory state, it can't get out of sync with the file system state. On the next startup, a correct in-memory state will be reconstructed from the file system state.

By letting the BlockFetchClient add blocks asynchronously, we also get a 20-40% bulk chain sync speed-up in some microbenchmarks.

Co-authored-by: Thomas Winant <[email protected]>
This PR actually got merged, but because of GitHub's recent reliability problems, this PR is not aware of that.
mrBliss
added a commit
that referenced
this pull request
Apr 29, 2020
Previously, we knew the current slot and were able to tell that a block was from the future by comparing the block's slot against the current slot. For such blocks we would schedule a chain selection at the block's slot, which would be performed by a background thread.

Now, we no longer know the current slot. Instead, we validate candidate chains and use the resulting ledgers to call `CheckInFuture`, which returns the headers in the candidate fragment that are from the future. We truncate these headers from the fragment, record that they're from the future (`cdbFutureBlocks`), and repeat chain selection without them. Headers that are too far into the future, i.e., exceeding the max clock skew, are recorded as invalid blocks (with `InFutureExceedsClockSkew` as the `InvalidBlockReason`).

For each new block we receive, we perform chain selection for all future blocks before performing chain selection for the new block.

* Split off `CandidateSuffix` into a separate module and use it throughout chain selection instead of only partially. A `CandidateSuffix` is the number of headers to roll back the current chain + a fragment containing the new headers to add, like a diff w.r.t. the current chain. Previously, we converted such a `CandidateSuffix` to a `ChainAndLedger`, i.e., a fragment starting from the immutable tip (typically containing >= k headers) + a ledger matching the tip. Now, we stick to the `CandidateSuffix` until the end, when we actually install the candidate as the new chain by applying the diff. Also introduce `CandidateSuffixAndLedger` and use that instead of `ChainAndLedger` for the validated candidate. We still use `ChainAndLedger` for the current chain.
* Simplify `trySwitchTo` because there is no concurrency thanks to the queue introduced in #1709. Remove the obsolete trace message `ChainChangedInBg`.
* New trace messages:
  - `ChainSelectionForFutureBlock`
  - `CandidateContainsFutureBlocks`
  - `CandidateContainsFutureBlocksExceedingClockSkew`
* Remove `chainSelectionPerformed` from `AddBlockPromise` as it was not really used and complicated our new handling of blocks from the future.
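The three outcomes for a header described in this commit message (not from the future, from the future but within the clock skew, or beyond the clock skew) can be illustrated with a small classification function. This is only a sketch under strong simplifying assumptions: times are plain `Int`s on a shared clock, whereas the actual change derives this information from validated ledgers via `CheckInFuture`. The names `Verdict`, `classifyHeader`, and `splitByVerdict` are hypothetical.

```haskell
-- Hypothetical three-way verdict for a header's timestamp.
data Verdict
  = OnTime            -- not from the future: keep in the candidate
  | FromFuture        -- within the max clock skew: remember and retry later
  | ExceedsClockSkew  -- too far ahead: treat the block as invalid
  deriving (Eq, Show)

-- Classify a header time against the current time and the maximum
-- permitted clock skew (all in the same, assumed, integer time unit).
classifyHeader :: Int -> Int -> Int -> Verdict
classifyHeader maxSkew now hdrTime
  | hdrTime <= now           = OnTime
  | hdrTime - now <= maxSkew = FromFuture
  | otherwise                = ExceedsClockSkew

-- Split header times into (on time, from the future, exceeding skew),
-- preserving the original order within each group.
splitByVerdict :: Int -> Int -> [Int] -> ([Int], [Int], [Int])
splitByVerdict maxSkew now hdrs =
  ( [h | h <- hdrs, classifyHeader maxSkew now h == OnTime]
  , [h | h <- hdrs, classifyHeader maxSkew now h == FromFuture]
  , [h | h <- hdrs, classifyHeader maxSkew now h == ExceedsClockSkew]
  )
```

In terms of the text, the second group corresponds to what would be recorded in `cdbFutureBlocks` and the third to blocks recorded as invalid.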
mrBliss
added a commit
that referenced
this pull request
Apr 30, 2020
Previously, we knew the current slot and were able to tell that a block was from the future by comparing the block's slot against the current slot. For such blocks we would schedule a chain selection at the block's slot, which would be performed by a background thread.

Now, we no longer know the current slot. Instead, we validate candidate chains and use the resulting ledgers to call `CheckInFuture`, which returns the headers in the candidate fragment that are from the future. We truncate these headers from the fragment, record that they're from the future (`cdbFutureBlocks`), and repeat chain selection without them. Headers that are too far into the future, i.e., exceeding the max clock skew, are not recorded in `cdbFutureBlocks`, but are recorded as invalid blocks (with `InFutureExceedsClockSkew` as the `InvalidBlockReason`).

For each new block we receive, we perform chain selection for all future blocks before performing chain selection for the new block.

* Rename `CandidateSuffix` to `ChainDiff`, split it off into a separate module, and use it throughout chain selection instead of only partially. A `ChainDiff` is the number of headers to roll back the current chain + a fragment containing the new headers to add, i.e., a diff w.r.t. the current chain. Previously, we converted such a `ChainDiff` to a `ChainAndLedger`, i.e., a fragment starting from the immutable tip (typically containing >= k headers) + a ledger matching the tip. Now, we stick to the `ChainDiff` until the end, when we actually install the candidate as the new chain by applying the diff. Also introduce `ValidatedChainDiff` and use that instead of `ChainAndLedger` for the validated candidate. We still use `ChainAndLedger` for the current chain.
* Simplify `trySwitchTo` because there is no concurrency thanks to the queue introduced in #1709. Remove the obsolete trace message `ChainChangedInBg`.
* New trace messages:
  - `ChainSelectionForFutureBlock`
  - `CandidateContainsFutureBlocks`
  - `CandidateContainsFutureBlocksExceedingClockSkew`
* Remove `chainSelectionPerformed` from `AddBlockPromise` as it was not really used and complicated our new handling of blocks from the future.
* Don't mark successors of an invalid block as invalid, as this is redundant, see why in `ChainDB.md`. This means we remove the `InChainAfterInvalidBlock` constructor of `InvalidBlockReason`.
* Introduce `ChainSelEnv` to reduce the number of parameters to pass around.
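The idea of a `ChainDiff` as "a rollback count plus a fragment of new headers" can be sketched over plain lists. This is a deliberately simplified model, not the module introduced by the commit: chains are lists with the oldest header first, and the field names `getRollback` and `getSuffix` as well as the functions `applyDiff` and `truncateFuture` are assumptions for illustration. `truncateFuture` also assumes that once one header is from the future, all later headers in the suffix are dropped with it.

```haskell
-- Simplified, hypothetical ChainDiff: how many headers to roll back from
-- the tip of the current chain, and which new headers to append.
data ChainDiff hdr = ChainDiff
  { getRollback :: Int    -- number of headers to drop from the tip
  , getSuffix   :: [hdr]  -- new headers to add, oldest first
  } deriving (Eq, Show)

-- Install a candidate by applying the diff to the current chain.
-- Returns Nothing when the rollback exceeds the chain length.
applyDiff :: ChainDiff hdr -> [hdr] -> Maybe [hdr]
applyDiff (ChainDiff n suffix) chain
  | n < 0 || n > length chain = Nothing
  | otherwise                 = Just (take (length chain - n) chain ++ suffix)

-- Cut the suffix at the first header judged to be from the future,
-- returning the shortened diff and the headers that were cut off.
truncateFuture :: (hdr -> Bool) -> ChainDiff hdr -> (ChainDiff hdr, [hdr])
truncateFuture inFuture (ChainDiff n suffix) =
  let (kept, future) = break inFuture suffix
  in  (ChainDiff n kept, future)
```

For example, `applyDiff (ChainDiff 1 [4, 5]) [1, 2, 3]` rolls back the tip `3` and appends `4` and `5`, giving `Just [1, 2, 4, 5]`. Keeping the candidate in this form until installation avoids materialising a full fragment from the immutable tip for every candidate.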
edsko
pushed a commit
that referenced
this pull request
May 1, 2020
coot
pushed a commit
that referenced
this pull request
May 16, 2022