Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate EBBs more realistically in the RealPBFT consensus test #1353

Merged
merged 13 commits into from
Jan 6, 2020

Conversation

nfrisby
Copy link
Contributor

@nfrisby nfrisby commented Dec 14, 2019

Fixes #966. Fixes #1352.

I'm opening this as a Draft PR because I would like your feedback on the basic schemes before I polish it up. Basic schemes:

  • Issue Include mock EBBs #966 - Use a new Tracer in Node.Tracers and a new TVar in Test.Dynamic.Tracer to ensure that each node has added all EBBs to its ChainDB before forging a block when leading. Blocking the traced thread seems like an abuse of Tracers, but I haven't come up with a less invasive option. We also had to update preferCandidate to care about length-after-intersection in addition to block number, since it was just ignoring the EBBs we were adding: except the epoch 0 EBB, they'd only be selected once their successor arrived -- but that meant they never got a successor! That in turn required some gymnastics in BlockFetch to ensure we only compare chain fragments that intersect.

  • Nodes always forge an actual block when leading; they never waste their turn forging an EBB.

  • suchThat to preclude node join plans that cause unpredictable final chains. The ambiguity is currently limited to at most the first numCoreNodes blocks, but that is enough to cause trouble because it could affect the PBft signature threshold, which could subvert the "at least k blocks in 2k slots" invariant established by the RealPBFT NodeJoinPlan generator. (I tried to integrate this check into the recursive generator itself, but the inherent non-monotonicity spoils the inductive hypothesis that "at the very least the plan is OK if all remaining nodes ASAP".)

  • Issue ChainSync client cannot rollback to an EBB that shares a slot with an actual block #1352 - when the CS client rollsback a candidate fragment and the oldest dropped header has the same slot as the target, then the target is an EBB with an accompanying block in the same slot and so the primary chain state should essentially be rolled back to the preceding slot instead of the EBB's slot. Otherwise we'd retain the block sharing the slot, which the intersection point precedes. We address this by storing the header hash of an EBB in the chain state and taking a Point as the target of rewinds. We retain the EBB when rolling back to its slot if its hash matches that of the given Point. As an unfortunate compromise, we store the HeaderHash as a ByteString, since a sufficient argument for HeaderHash is not made available by the arguments to the ChainState family.

@nfrisby nfrisby added bug Something isn't working consensus issues related to ouroboros-consensus testing protocol testing labels Dec 14, 2019
@nfrisby nfrisby requested a review from mrBliss December 14, 2019 00:21
@nfrisby nfrisby force-pushed the nfrisby/issue-966-more-EBBs branch from 9fe3499 to 0d2c951 Compare December 14, 2019 01:25
@nfrisby
Copy link
Contributor Author

nfrisby commented Dec 16, 2019

I've pushed up three new commits.

  • b783ef6 - test-consensus: fixup PR 966 no EBBs more than 2k later

Without this, late joining nodes could forge multiple EBBs on top of Origin. They're 10k slots apart, which means the ">= k blocks in 2k slots" was violated.

  • 333ea01 - test-consensus: check for all expected EBBs

I noticed that I was only seeing EBBs for Epoch 0. So I added this test to confirm. This test fails without the next commit.

  • ad8199d - TENTATIVE consensus: PBFT preferCandidate no longer ignores EBBs f538901 - consensus: Add SelectEBBsPromptly protocol combinator and use for RealPBFT

I'm unfamiliar with the consequences here, but this override in OuroborosTag PBFT seems to suffice at first glance to allow us to select EBBs for later epochs.

  -- we override the default so that the ChainDB will select a new EBB
  -- /promptly/
  preferCandidate _ ours cand =
    case AF.compareHeadBlockNo ours cand of
      LT -> True
      EQ -> case AF.intersect ours cand of
        Nothing                   -> False
        Just (_, _, ours', cand') -> AF.length ours' < AF.length cand'
      GT -> False

I haven't formulated it crisply yet, but I think there's an invariant along the lines of "we only forge EBBs in the correct slots and every node forges EBBs", so the presence/absence of EBBs will never be a deciding factor in the tests.

I suspect this change should probably be isolated in a protocol combinator that the RealPBFT test uses. (Edit: Done via the pre-existing ModChainSel.)

@nfrisby nfrisby force-pushed the nfrisby/issue-966-more-EBBs branch 3 times, most recently from fbc67ec to 35dfbd4 Compare December 17, 2019 04:31
@nfrisby nfrisby force-pushed the nfrisby/issue-966-more-EBBs branch from 35dfbd4 to 11b5b3e Compare December 17, 2019 17:45
@mrBliss
Copy link
Contributor

mrBliss commented Dec 18, 2019

Edsko and I came up with a better plan:

Instead of computing the intersection and then the lengths of the suffices, we can instead simply check whether one of the headers is an EBB or not. We can use headerPBftFields :: cfg -> hdr -> Maybe .. for this, as it returns a Just for a regular header and Nothing for an EBB header. So more concretely:

-- Look at the tips
case (AF.head ours, AF.head cand) of
    -- Both empty
    (Left _,       Left _)        -> EQ
    -- Our chain is empty, candidate is not
    (Left _,       Right _)       -> LT
    -- Our chain is not empty, the candidate is
    (Right _,      Left _)        -> GT
    -- Our chain and candidate not empty
    (Right ourHdr, Right candHdr) ->
      -- Prefer the highest block number, as it is a proxy for chain length
      case blockNo ourHdr `compare` blockNo candHdr of
        LT -> LT
        GT -> GT
        -- If the block numbers are the same, check if one of them is an EBB.
        -- An EBB has the same block number as the block before it, so the
        -- chain ending with an EBB is actually longer than the one ending with a
        -- regular block. Note that `headerPBftFields` returns `Nothing` for
        -- an EBB.
        EQ ->
          case (headerPBftFields cfg ourHdr, headerPBftFields cfg candHdr) of
            (Just _,  Just _)  -> EQ
            (Just _,  Nothing) -> LT
            (Nothing, Nothing) -> EQ
            (Nothing, Just _)  -> GT

This only needs to be done for PBFT.

This also means that we don't need the extra invariants and don't have to change BlockFetch.

Furthermore, this is compatible with Edsko's plan to look only at the headers at the tip (#1227) instead of the fragments.

@nfrisby nfrisby force-pushed the nfrisby/issue-966-more-EBBs branch from 2e4d700 to 124ab0d Compare December 18, 2019 22:43
@nfrisby
Copy link
Contributor Author

nfrisby commented Dec 18, 2019

I've force-pushed some new work. Summary:

  • All tests pass (since I updated the expected result for the golden test for a serialised PBFT ChainState). These tests include exercising Issue 1352 (I removed the list comprehension guard that avoided it in the PBFT ChainState tests) and also confirm that the final chains can include EBBs for multiple epochs.

  • Commit 4ac8528 implements the new preferCandidates you and Edsko most recently suggested.

  • Commit 9195a44 implements the new PBFT ChainState scheme that allows rewinding to an EBB that shares a slot with a non-EBB (i.e. Issue 1352). This commit uses ByteString to workaround the fact that the ChainState type cannot depend on the block/header type.

I'm not excited about the ByteString-based approach, but:

  • It proves out the basic approach of tracking EBB's hashes in the ChainState in order to fix Issue 1352.

  • I used a heavily restricted newtype to ensure the HeaderHash is indeed what's being serialised and compared.

  • The alternative we last discussed was to use an existentially quantified headerHash type with a Typeable constraint. I spent a while trying different approaches with this but ultimately took the ByteString escape hatch since there were so many obstacles. In particular, I'm not sure we can deserialize the ChainState if it existentially quantifies one of the types that occurs in it. In other words, we need to know the HeaderHash type before deserializing the ChainState...

  • ... And if that's the case, then I suspect we might as well just go ahead add the block or hdr type parameter to the ChainState family itself. Then everything would work nicely and without needing existentials. For example RunMockProtocol would mostly be folded into RunMockBlock because ChainState p would become ChainState p (SimpleHeader c ext).

-- | The part of 'RunMock' that depends only on @p@
class RunMockProtocol p where
  mockProtocolMagicId  ::           NodeConfig p -> ProtocolMagicId
  mockEncodeChainState ::           NodeConfig p -> ChainState p -> Encoding
  mockDecodeChainState :: forall s. NodeConfig p -> Decoder s (ChainState p)

-- | Protocol specific functionality required to run consensus with mock blocks
class RunMockProtocol p => RunMockBlock p c ext where
  ...

@nfrisby nfrisby force-pushed the nfrisby/issue-966-more-EBBs branch from 124ab0d to 9195a44 Compare December 18, 2019 23:00
@mrBliss mrBliss mentioned this pull request Dec 19, 2019
@nfrisby nfrisby force-pushed the nfrisby/issue-966-more-EBBs branch 4 times, most recently from ea2bfdb to 4cb2df4 Compare December 20, 2019 03:35
@nfrisby
Copy link
Contributor Author

nfrisby commented Dec 20, 2019

The push ea2bfdb3...4cb2df4f is just squashing.

@nfrisby
Copy link
Contributor Author

nfrisby commented Dec 20, 2019

Commit 948503e2 9430dbc5 (edit: this one also updates golden tests) replaces the Map of EBB info in the chain state with just a Maybe.

@nfrisby nfrisby force-pushed the nfrisby/issue-966-more-EBBs branch 2 times, most recently from 9430dbc to 11a745c Compare December 20, 2019 03:46
Copy link
Contributor

@mrBliss mrBliss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Let's get this merged!

ouroboros-consensus/test-consensus/Test/Dynamic/Network.hs Outdated Show resolved Hide resolved
Comment on lines +427 to +431
TraceForgeAboutToLead s -> do
atomically $ do
lim <- readTVar nextEbbSlotVar
check $ s < lim
o -> traceWith (nodeEventsForges nodeInfoEvents) o
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we want to make EBBs optional? Or even stop producing after some epoch?
It doesn't have to be part of this PR, but it would nice to be able to do that in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If runNodeNetwork is invoked with NothingForgeEBB then there will be no EBBs, but this nextEbbSlotVar will still increment -- there's a comment saying as much at the binding site of nextEbbSlotVar. It's slightly off-putting, but seemed preferable to having multiple Maybes, etc.

No support for stopping EBBs at a certain slot yet.

@mrBliss mrBliss modified the milestones: S3 2020-01-02, S4 2020-01-16 Jan 6, 2020
@nfrisby nfrisby force-pushed the nfrisby/issue-966-more-EBBs branch from 53b6dee to 3b73e1c Compare January 6, 2020 14:38
@nfrisby nfrisby force-pushed the nfrisby/issue-966-more-EBBs branch from 3b73e1c to 41372f2 Compare January 6, 2020 14:45
@nfrisby
Copy link
Contributor Author

nfrisby commented Jan 6, 2020

Two rebases just now: first fixuped the minor suggested edits, and second rebased onto origin/master. We're so close! :)

@nfrisby nfrisby force-pushed the nfrisby/issue-966-more-EBBs branch from 41372f2 to 36c8ab2 Compare January 6, 2020 15:30
@nfrisby
Copy link
Contributor Author

nfrisby commented Jan 6, 2020

Another rebase: I exploded the address PR comments commit into fixups for appropriate preceding commits. I also confirmed that test-consensus successfully builds after each of the commits on this PR.

$ git diff nfrisby/issue-966-more-EBBs origin/nfrisby/issue-966-more-EBBs 
$ git push --force-with-lease origin nfrisby/issue-966-more-EBBs
Counting objects: 126, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (113/113), done.
Writing objects: 100% (126/126), 27.16 KiB | 0 bytes/s, done.
Total 126 (delta 73), reused 0 (delta 0)
remote: Resolving deltas: 100% (73/73), completed with 34 local objects.
To github.com-iohk:input-output-hk/ouroboros-network.git
 + 41372f2c...36c8ab23 nfrisby/issue-966-more-EBBs -> nfrisby/issue-966-more-EBBs (forced update)
$ 

@nfrisby
Copy link
Contributor Author

nfrisby commented Jan 6, 2020

bors r+

iohk-bors bot added a commit that referenced this pull request Jan 6, 2020
1353: Generate EBBs more realistically in the RealPBFT consensus test r=nfrisby a=nfrisby

Fixes #966. Fixes #1352.

I'm opening this as a Draft PR because I would like your feedback on the basic schemes before I polish it up. Basic schemes:

  * Issue #966 - Use a new `Tracer` in `Node.Tracers` and a new `TVar` in `Test.Dynamic.Tracer` to ensure that each node has added all EBBs to its `ChainDB` _before_ forging a block when leading. Blocking the traced thread seems like an abuse of `Tracer`s, but I haven't come up with a less invasive option. We also had to update `preferCandidate` to care about length-after-intersection in addition to block number, since it was just ignoring the EBBs we were adding: except the epoch 0 EBB, they'd only be selected once their successor arrived -- but that meant they never got a successor! That in turn required some gymnastics in BlockFetch to ensure we only compare chain fragments that intersect.

  * Nodes always forge an actual block when leading; they never waste their turn forging an EBB.

  * `suchThat` to preclude node join plans that cause unpredictable final chains. The ambiguity is currently limited to at most the first `numCoreNodes` blocks, but that is enough to cause trouble because it could affect the PBft signature threshold, which could subvert the "at least `k` blocks in `2k` slots" invariant established by the RealPBFT `NodeJoinPlan` generator. (I tried to integrate this check into the recursive generator itself, but the inherent non-monotonicity spoils the inductive hypothesis that "at the very least the plan is OK if all remaining nodes ASAP".)

  * Issue #1352 - when the CS client rollsback a candidate fragment and the oldest dropped header has the same slot as the target, then the target is an EBB with an accompanying block in the same slot and so the primary chain state should essentially be rolled back to the preceding slot instead of the EBB's slot. Otherwise we'd retain the block sharing the slot, which the intersection point precedes. We address this by storing the header hash of an EBB in the chain state and taking a `Point` as the target of rewinds. We retain the EBB when rolling back to its slot if its hash matches that of the given `Point`. As an unfortunate compromise, we store the `HeaderHash` as a `ByteString`, since a sufficient argument for `HeaderHash` is not made available by the arguments to the `ChainState` family.

Co-authored-by: Thomas Winant <[email protected]>
Co-authored-by: Nicolas Frisby <[email protected]>
@iohk-bors
Copy link
Contributor

iohk-bors bot commented Jan 6, 2020

Build succeeded

@iohk-bors iohk-bors bot merged commit 36c8ab2 into master Jan 6, 2020
@iohk-bors iohk-bors bot deleted the nfrisby/issue-966-more-EBBs branch January 6, 2020 15:40
@nfrisby
Copy link
Contributor Author

nfrisby commented Jan 6, 2020

🎉🎉🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working byron Required for a Byron mainnet: replace the old core nodes with cardano-node consensus issues related to ouroboros-consensus testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ChainSync client cannot rollback to an EBB that shares a slot with an actual block Include mock EBBs
2 participants