ChainDB: add blocks asynchronously #1709

Closed
wants to merge 3 commits into from


@mrBliss mrBliss commented Feb 27, 2020

Fixes #1463.

Instead of adding blocks synchronously, they are now put into a queue, after
which `addBlockAsync` returns an `AddBlockResult`, which can be used to wait
until the block has been processed.

A background thread will read the blocks from the queue and add them
synchronously to the ChainDB. The queue is limited in size; when it is full,
callers of `addBlockAsync` might still have to wait.
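
A minimal sketch of the queue-plus-background-thread pattern described above, using a `TBQueue` and a `TMVar` from the `stm` package. The types `Block`, `BlockToAdd`, the shape of `AddBlockResult`, and the helper names `addBlockRunner` and `demo` are assumptions made for illustration; they are not taken from the ChainDB code.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.STM
import Control.Monad (forever)
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for a block.
newtype Block = Block Int
  deriving (Eq, Show)

-- Assumed shape of the handle returned to the caller: an STM action that
-- blocks until the background thread has processed the block.
newtype AddBlockResult = AddBlockResult { blockProcessed :: STM Bool }

-- Queue entry: the block plus the variable the background thread fills in.
data BlockToAdd = BlockToAdd Block (TMVar Bool)

-- Enqueue a block. This only blocks while the bounded queue is full.
addBlockAsync :: TBQueue BlockToAdd -> Block -> IO AddBlockResult
addBlockAsync queue blk = do
  done <- newEmptyTMVarIO
  atomically $ writeTBQueue queue (BlockToAdd blk done)
  pure (AddBlockResult (readTMVar done))

-- Background thread: take blocks off the queue one by one, add each
-- synchronously, and then fulfil the corresponding result.
addBlockRunner :: TBQueue BlockToAdd -> (Block -> IO Bool) -> IO ()
addBlockRunner queue addBlockSync = forever $ do
  BlockToAdd blk done <- atomically $ readTBQueue queue
  ok <- addBlockSync blk
  atomically $ putTMVar done ok

-- Add five blocks through a queue of capacity two and wait for all of them.
demo :: IO ([Bool], [Block])
demo = do
  queue <- newTBQueueIO 2
  store <- newIORef []
  _ <- forkIO $ addBlockRunner queue $ \blk -> do
    modifyIORef' store (blk :)   -- stand-in for the synchronous add
    pure True
  results <- mapM (addBlockAsync queue . Block) [1 .. 5]
  oks <- mapM (atomically . blockProcessed) results
  blocks <- readIORef store
  pure (oks, reverse blocks)
```

Because the queue holds at most two entries here, the third call to `addBlockAsync` in `demo` may have to wait until the background thread has taken an entry off the queue, which is the back-pressure the description mentions.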

With this asynchronous approach, threads adding blocks asynchronously can be
killed without worries: the background thread processing the blocks
synchronously won't be killed. Only when the whole ChainDB shuts down will
that background thread get killed. But since there will be no more in-memory
state at that point, it can't get out of sync with the file system state. On
the next startup, a correct in-memory state will be reconstructed from the
file system state.
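
The following sketch illustrates the claim about killing the submitting thread. It is a toy model, not the ChainDB implementation: the queue carries plain `Int`s, the "database" is a `TVar`, and the names `demoKill`, `gate`, and `enqueued` are hypothetical. The submitter is killed after it has enqueued its block but before the background thread has looked at the queue, and the block is processed regardless.

```haskell
import Control.Concurrent (forkIO, killThread, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, readMVar, takeMVar)
import Control.Concurrent.STM
import Control.Monad (forever)

demoKill :: IO [Int]
demoKill = do
  queue    <- newTBQueueIO 4 :: IO (TBQueue Int)
  stored   <- newTVarIO []
  gate     <- newEmptyMVar   -- keeps the background thread idle until opened
  enqueued <- newEmptyMVar   -- signals that the block is in the queue

  -- Background thread: the only thread that touches the stored state.
  -- Dequeue and store are one transaction here purely to keep the toy short.
  _ <- forkIO $ do
    readMVar gate
    forever $ atomically $ do
      blk <- readTBQueue queue
      modifyTVar' stored (blk :)

  -- Submitting thread: enqueue one block, then linger as if waiting.
  submitter <- forkIO $ do
    atomically $ writeTBQueue queue 42
    putMVar enqueued ()
    forever $ threadDelay 1000000

  takeMVar enqueued      -- the block is definitely queued now
  killThread submitter   -- kill the submitter before any processing happens
  putMVar gate ()        -- only now let the background thread run

  -- Wait until the background thread has stored the block.
  atomically $ do
    blks <- readTVar stored
    check (not (null blks))
    pure blks
```

The point of the design is that the thread that may be killed never holds the stored state in a half-updated condition; it only ever hands work to the queue.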

By letting the BlockFetchClient add blocks asynchronously, we also get a
20-40% bulk chain sync speed-up in some microbenchmarks.

See `lengthTBQueueDefault` and `MonadSTMTxExtended` for more information.
@mrBliss mrBliss added the consensus label (issues related to ouroboros-consensus) Feb 27, 2020
@mrBliss mrBliss requested review from edsko and dcoutts February 27, 2020 12:39

@edsko edsko left a comment

Reviewed over hangout (made comments off-line as GitHub was having difficulties).

edsko commented Feb 27, 2020

bors r+

iohk-bors bot added a commit that referenced this pull request Feb 27, 2020
1709: ChainDB: add blocks asynchronously r=edsko a=mrBliss

Co-authored-by: Thomas Winant

iohk-bors bot commented Feb 27, 2020


mrBliss commented Feb 27, 2020

This PR actually got merged, but because of GitHub's recent reliability problems, this PR is not aware of that.

@mrBliss mrBliss closed this Feb 27, 2020
@mrBliss mrBliss deleted the mrBliss/chaindb-async-addblock branch February 27, 2020 17:40
mrBliss added a commit that referenced this pull request Apr 29, 2020
Previously, we knew the current slot and were able to tell that a block was
from the future by comparing the block's slot against the current slot. For
such blocks we would schedule a chain selection at the block's slot, which
would be performed by a background thread.

Now, we no longer know the current slot. Instead, we validate candidate chains
and use the resulting ledgers to call `CheckInFuture`, which returns the
headers in the candidate fragment that are from the future. We truncate these
headers from the fragment, record that they're from the future
(`cdbFutureBlocks`), and repeat chain selection without them. Headers that are
too far into the future, i.e., exceeding the max clock skew, are recorded as
invalid blocks (with `InFutureExceedsClockSkew` as the `InvalidBlockReason`).

For each new block we receive, we perform chain selection for all future
blocks before performing chain selection for the new block.

* Split off `CandidateSuffix` into a separate module and use it throughout
  chain selection instead of only partially. A `CandidateSuffix` is the
  number of headers to roll back the current chain + a fragment containing the
  new headers to add, like a diff w.r.t. the current chain. Previously, we
  converted such a `CandidateSuffix` to a `ChainAndLedger`, i.e., a fragment
  starting from the immutable tip (typically containing >= k headers) + a
  ledger matching the tip. Now, we stick to the `CandidateSuffix` until the
  end, when we actually install the candidate as the new chain by applying the
  diff. Also introduce `CandidateSuffixAndLedger` and use that instead of
  `ChainAndLedger` for the validated candidate. We still use `ChainAndLedger`
  for the current chain.

* Simplify `trySwitchTo` because there is no concurrency thanks to the queue
  introduced in #1709. Remove the obsolete trace message `ChainChangedInBg`.

* New trace messages:
  - `ChainSelectionForFutureBlock`
  - `CandidateContainsFutureBlocks`
  - `CandidateContainsFutureBlocksExceedingClockSkew`

* Remove `chainSelectionPerformed` from `AddBlockPromise` as it was not really
  used and complicated our new handling of blocks from the future.
mrBliss added a commit that referenced this pull request Apr 30, 2020
Previously, we knew the current slot and were able to tell that a block was
from the future by comparing the block's slot against the current slot. For
such blocks we would schedule a chain selection at the block's slot, which
would be performed by a background thread.

Now, we no longer know the current slot. Instead, we validate candidate chains
and use the resulting ledgers to call `CheckInFuture`, which returns the
headers in the candidate fragment that are from the future. We truncate these
headers from the fragment, record that they're from the future
(`cdbFutureBlocks`), and repeat chain selection without them. Headers that are
too far into the future, i.e., exceeding the max clock skew, are not recorded
in `cdbFutureBlocks`, but are recorded as invalid blocks (with
`InFutureExceedsClockSkew` as the `InvalidBlockReason`).
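
The truncation and bookkeeping described in this paragraph can be sketched as follows. This is a toy model under stated assumptions: `Header`, `FutureStatus`, `Truncated`, `truncateFuture`, and `toyStatus` are hypothetical names, and the classifier passed in merely stands in for `CheckInFuture`, whose real interface works on validated ledgers and is not reproduced here. Repeating chain selection on the kept prefix is left out.

```haskell
-- Hypothetical header: only its slot number matters in this sketch.
newtype Header = Header { headerSlot :: Int }
  deriving (Eq, Show)

-- Assumed three-way classification of a header.
data FutureStatus = NotFromFuture | FromNearFuture | ExceedsClockSkew
  deriving (Eq, Show)

data Truncated = Truncated
  { keptPrefix    :: [Header]  -- what chain selection is repeated on
  , futureBlocks  :: [Header]  -- would be recorded in cdbFutureBlocks
  , invalidBlocks :: [Header]  -- would be recorded as invalid blocks
  } deriving (Eq, Show)

-- Cut the candidate (oldest header first) at the first header from the
-- future and sort the dropped headers into the two bookkeeping sets.
truncateFuture :: (Header -> FutureStatus) -> [Header] -> Truncated
truncateFuture status candidate =
  Truncated
    { keptPrefix    = kept
    , futureBlocks  = [h | h <- dropped, status h == FromNearFuture]
    , invalidBlocks = [h | h <- dropped, status h == ExceedsClockSkew]
    }
  where
    (kept, dropped) = span ((== NotFromFuture) . status) candidate

-- Toy classifier for the demo only: compares slots against a fixed
-- reference slot and a maximum skew measured in slots.
toyStatus :: Int -> Int -> Header -> FutureStatus
toyStatus now maxSkew (Header slot)
  | slot <= now           = NotFromFuture
  | slot <= now + maxSkew = FromNearFuture
  | otherwise             = ExceedsClockSkew
```

For example, with reference slot 10 and a skew of 2 slots, a candidate with slots 8, 9, 11, 12, 13 keeps the headers at 8 and 9, records 11 and 12 as future blocks, and records 13 as invalid.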

For each new block we receive, we perform chain selection for all future
blocks before performing chain selection for the new block.

* Rename `CandidateSuffix` to `ChainDiff`, split it off into a separate
  module, and use it throughout chain selection instead of only partially. A
  `ChainDiff` is the number of headers to roll back the current chain + a
  fragment containing the new headers to add, i.e., a diff w.r.t. the current
  chain. Previously, we converted such a `ChainDiff` to a `ChainAndLedger`,
  i.e., a fragment starting from the immutable tip (typically containing >= k
  headers) + a ledger matching the tip. Now, we stick to the `ChainDiff` until
  the end, when we actually install the candidate as the new chain by applying
  the diff. Also introduce `ValidatedChainDiff` and use that instead of
  `ChainAndLedger` for the validated candidate. We still use `ChainAndLedger`
  for the current chain.

* Simplify `trySwitchTo` because there is no concurrency thanks to the queue
  introduced in #1709. Remove the obsolete trace message `ChainChangedInBg`.

* New trace messages:
  - `ChainSelectionForFutureBlock`
  - `CandidateContainsFutureBlocks`
  - `CandidateContainsFutureBlocksExceedingClockSkew`

* Remove `chainSelectionPerformed` from `AddBlockPromise` as it was not really
  used and complicated our new handling of blocks from the future.

* Don't mark successors of an invalid block as invalid, as this is redundant
  (see `ChainDB.md` for why). This means we remove the `InChainAfterInvalidBlock`
  constructor of `InvalidBlockReason`.

* Introduce `ChainSelEnv` to reduce the number of parameters to pass around.
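
To make the `ChainDiff` idea from the list above concrete, here is a small sketch that models chains as plain lists (oldest header first) rather than as chain fragments. The record fields and the functions `applyDiff` and `diff` are assumptions for illustration only, not the module's actual API.

```haskell
-- A diff w.r.t. the current chain: how many headers to roll back, plus the
-- new headers to add afterwards.
data ChainDiff h = ChainDiff
  { rollback :: Int
  , suffix   :: [h]
  } deriving (Eq, Show)

-- Apply a diff to the current chain. Fails if the diff asks to roll back
-- more headers than the chain contains.
applyDiff :: ChainDiff h -> [h] -> Maybe [h]
applyDiff (ChainDiff n new) chain
  | n < 0 || n > length chain = Nothing
  | otherwise                 = Just (take (length chain - n) chain ++ new)

-- Compute the diff that turns the current chain into the candidate by
-- keeping their common prefix.
diff :: Eq h => [h] -> [h] -> ChainDiff h
diff current candidate =
  ChainDiff (length current - shared) (drop shared candidate)
  where
    shared = length (takeWhile (uncurry (==)) (zip current candidate))
```

Using characters as headers, `diff "abcde" "abcxy"` rolls back two headers and adds `"xy"`, and applying that diff to `"abcde"` yields `Just "abcxy"`. Carrying only the diff until the candidate is installed avoids materialising a full fragment for every candidate.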
mrBliss added a commit that referenced this pull request Apr 30, 2020
mrBliss added a commit that referenced this pull request Apr 30, 2020
mrBliss added a commit that referenced this pull request Apr 30, 2020
edsko pushed a commit that referenced this pull request May 1, 2020
coot pushed a commit that referenced this pull request May 16, 2022
1709: ChainDB: add blocks asynchronously r=edsko a=mrBliss

Successfully merging this pull request may close these issues.

Node may currently incorrectly reported that a block was not adopted