Flaky: Various failures due to PastHorizonException from DB #2372

Closed
Anviking opened this issue Dec 7, 2020 · 1 comment
Labels
Test failure A flaky test or nightly CI failure

Member

Anviking commented Dec 7, 2020

Context

We saw a sharp uptick in test failures after bumping the node to a version that includes the Allegra era. Many of the failures showed PastHorizonExceptions coming from the DB.

Slack thread: https://input-output-rnd.slack.com/archives/GBT05825V/p1607339108011300

Test Case

Multiple. Most often: migration tests of big wallets.

Failure / Counter-example

First we see these in the normal test output:

PastHorizon {pastHorizonCallStack = [("runQuery",SrcLoc {srcLocPackage = "ouroboros-consensus-0.1.0.0-3JXm5ogVkRThJO19RF1Tj", srcLocModule = "Ouroboros.Consensus.HardFork.History.Qry", srcLocFile = "src/Ouroboros/Consensus/HardFork/History/Qry.hs", srcLocStartLine = 426, srcLocStartCol = 44, srcLocEndLine = 426, srcLocEndCol = 64}),("interpretQuery",SrcLoc {srcLocPackage = "cardano-wallet-core-2020.11.26-Eht2UPkACzpCQC56U01U3O", srcLocModule = "Cardano.Wallet.Primitive.Slotting", srcLocFile = "src/Cardano/Wallet/Primitive/Slotting.hs", srcLocStartLine = 368, srcLocStartCol = 15, srcLocEndLine = 368, srcLocEndCol = 34}),("interpretQuery",SrcLoc {srcLocPackage = "cardano-wallet-core-2020.11.26-Eht2UPkACzpCQC56U01U3O", srcLocModule = "Cardano.Wallet.DB.Sqlite", srcLocFile = "src/Cardano/Wallet/DB/Sqlite.hs", srcLocStartLine = 856, srcLocStartCol = 25, srcLocEndLine = 856, srcLocEndCol = 82})], pastHorizonExpression = Some (ELet (ERelSlotToEpoch (EAbsToRelSlot (ELit (SlotNo 1083)))) (\x0 -> EPair (ERelToAbsEpoch (EVar x0)) (ESnd (EVar x0)))), pastHorizonSummary = [EraSummary {eraStart = Bound {boundTime = RelativeTime 0s, boundSlot = SlotNo 0, boundEpoch = EpochNo 0}, eraEnd = EraEnd (Bound {boundTime = RelativeTime 20s, boundSlot = SlotNo 80, boundEpoch = EpochNo 1}), eraParams = EraParams {eraEpochSize = EpochSize 80, eraSlotLength = SlotLength 0.25s, eraSafeZone = StandardSafeZone 16}},EraSummary {eraStart = Bound {boundTime = RelativeTime 20s, boundSlot = SlotNo 80, boundEpoch = EpochNo 1}, eraEnd = EraEnd (Bound {boundTime = RelativeTime 60s, boundSlot = SlotNo 280, boundEpoch = EpochNo 2}), eraParams = EraParams {eraEpochSize = EpochSize 200, eraSlotLength = SlotLength 0.2s, eraSafeZone = StandardSafeZone 60}},EraSummary {eraStart = Bound {boundTime = RelativeTime 60s, boundSlot = SlotNo 280, boundEpoch = EpochNo 2}, eraEnd = EraEnd (Bound {boundTime = RelativeTime 220s, boundSlot = SlotNo 1080, boundEpoch = EpochNo 6}), eraParams = EraParams {eraEpochSize = EpochSize 200, eraSlotLength = SlotLength 0.2s, eraSafeZone = StandardSafeZone 60}}]}
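
Reading the dump: the failing expression asks for the epoch of SlotNo 1083, while the last era in pastHorizonSummary ends at SlotNo 1080 (EpochNo 6). The sketch below reproduces that arithmetic with the three eras copied from the exception. It assumes half-open era bounds (start inclusive, end exclusive); the `Era` type and `slotToEpoch` are simplified, hypothetical stand-ins written for illustration, not the ouroboros-consensus or cardano-wallet API.

```haskell
import Data.List (find)
import Numeric.Natural (Natural)

-- | Simplified stand-in for one 'EraSummary' entry: only the slot and
-- epoch bounds and the epoch size are kept (hypothetical type).
data Era = Era
  { eraStartSlot  :: Natural
  , eraStartEpoch :: Natural
  , eraEndSlot    :: Natural  -- ^ exclusive upper bound (assumption)
  , eraEpochSize  :: Natural
  }

-- | The three eras from 'pastHorizonSummary' in the exception above.
summary :: [Era]
summary =
  [ Era 0   0 80   80
  , Era 80  1 280  200
  , Era 280 2 1080 200
  ]

-- | Convert an absolute slot to (epoch, slot within epoch).
-- Returns 'Nothing' when no known era contains the slot, which is the
-- situation the real code reports as a past-horizon failure.
slotToEpoch :: [Era] -> Natural -> Maybe (Natural, Natural)
slotToEpoch eras slot = do
  era <- find (\e -> eraStartSlot e <= slot && slot < eraEndSlot e) eras
  let rel = slot - eraStartSlot era
  pure (eraStartEpoch era + rel `div` eraEpochSize era, rel `mod` eraEpochSize era)

main :: IO ()
main = do
  print (slotToEpoch summary 1079)  -- last slot covered by the summary
  print (slotToEpoch summary 1083)  -- the slot from the failing query
```

Under these assumptions, slot 1079 resolves to `Just (5,199)` and slot 1083 to `Nothing`: the query is only three slots beyond what the summary covered when it was taken.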

and finally:

 src/Test/Integration/Scenario/API/Shelley/Migrations.hs:214:15:
  1) API Specifications, SHELLEY_MIGRATIONS, SHELLEY_MIGRATE_01_big_wallet -  migrate a big wallet requiring more than one tx
       expected: Status {statusCode = 202, statusMessage = "Accepted"}
        but got: Status {statusCode = 500, statusMessage = "Internal Server Error"}

       from the following response: Left (DecodeFailure "Something went wrong")

       While verifying (Status {statusCode = 500, statusMessage = "Internal Server Error"},Left (DecodeFailure "Something went wrong"))

  To rerun use: --match "/API Specifications/SHELLEY_MIGRATIONS/SHELLEY_MIGRATE_01_big_wallet -  migrate a big wallet requiring more than one tx/"

  src/Test/Integration/Scenario/API/Byron/Migrations.hs:277:15:
  2) API Specifications, BYRON_MIGRATIONS, BYRON_MIGRATE_01 -  migrate a big wallet requiring more than one tx
       expected: Status {statusCode = 202, statusMessage = "Accepted"}
        but got: Status {statusCode = 500, statusMessage = "Internal Server Error"}

       from the following response: Left (DecodeFailure "Something went wrong")

       While verifying (Status {statusCode = 500, statusMessage = "Internal Server Error"},Left (DecodeFailure "Something went wrong"))

  To rerun use: --match "/API Specifications/BYRON_MIGRATIONS/BYRON_MIGRATE_01 - 

Resolution

Two options:

  1. Lower test parallelism in CI (DONE)
  2. Ensure only one or a few chain-sync connections are used for all wallets
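
Option 2 is not spelled out further in this issue. As a rough illustration of the "a few connections" idea only, the sketch below caps how many workers may hold a connection at once with a semaphore. This is a swapped-in technique (throttling rather than sharing one connection between wallets), and `withSlot`, `fakeChainSync` and `peakConnections` are hypothetical names, not cardano-wallet code.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.QSem (QSem, newQSem, signalQSem, waitQSem)
import Control.Exception (bracket_)
import Control.Monad (forM, forM_)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)

-- | Run an action while holding one of the limited slots; the slot is
-- released even if the action throws.
withSlot :: QSem -> IO a -> IO a
withSlot sem = bracket_ (waitQSem sem) (signalQSem sem)

-- | Hypothetical stand-in for a wallet worker using a chain-sync
-- connection. The IORef tracks (currently open, peak open).
fakeChainSync :: IORef (Int, Int) -> IO ()
fakeChainSync ref = do
  atomicModifyIORef' ref (\(cur, peak) -> ((cur + 1, max peak (cur + 1)), ()))
  threadDelay 1000  -- pretend to sync for 1 ms
  atomicModifyIORef' ref (\(cur, peak) -> ((cur - 1, peak), ()))

-- | Start @n@ workers with at most @limit@ connections open at a time
-- and return the peak number that were open simultaneously.
peakConnections :: Int -> Int -> IO Int
peakConnections limit n = do
  sem <- newQSem limit
  ref <- newIORef (0, 0)
  dones <- forM [1 .. n] $ \_ -> do
    done <- newEmptyMVar
    _ <- forkIO (withSlot sem (fakeChainSync ref) >> putMVar done ())
    pure done
  forM_ dones takeMVar  -- wait for every worker to finish
  snd <$> readIORef ref

main :: IO ()
main = do
  peak <- peakConnections 2 20
  putStrLn ("peak concurrent connections: " ++ show peak)
```

With a limit of 2, the reported peak never exceeds 2 no matter how many workers are started. Whether a cap like this would have been enough here, or whether wallets would need to share a single connection, is not something the issue settles.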

QA

Member Author

Anviking commented Dec 9, 2020

Solved through

Lower test parallelism in CI

Anviking closed this as completed Dec 9, 2020