-
Notifications
You must be signed in to change notification settings - Fork 217
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Flaky Test: STAKE_POOLS_JOIN_01rewards, Can join a pool, earn rewards and collect them #2415
Comments
Is this the same? https://buildkite.com/input-output-hk/cardano-wallet/builds/12747#e17ab417-994b-4af3-9186-09aec0959bc4
|
It seems likely. I have also had various stake pools tests fail like this locally. |
Aside from that error message, this test case is inherently flaky. If epoch rollover happens just after the rewards withdrawal transaction is submitted, then new rewards accrue before the test case can check that previous rewards were withdrawn. Symptom of this is:
|
2419: Fix flaky stake pools integration test r=rvl a=rvl ### Issue Number Flaky test #2415 ### Overview This test failure seemed to be happening quite a lot the other day. Now I can't reproduce it. I am pretty sure the error is from somewhere in `stakeDistribution` which is used by `listPools`. My guess of the cause is that rollback causes ledger queries to temporarily fail. (confirmed: ADP-647) The error was never logged, and the exception message was not part of the HTTP 503 response. 1. [x] Fix some minor test environment issues. 2. [x] If a ledger query fails, make sure that the exception is logged. 3. [x] Correct the return type of `listPools` and functions that it calls. 4. [x] Reproduce error with logs, diagnose problem. Problem 1: unreproduceable. Problem 2: epoch rollover between tx submission and rewards balance check. 5. [x] Fix the ~code and/or~ test case. 6. [x] Add utility functions for transforming BracketLog traces into time deltas and logging timings. 7. [x] Add back logging of how long ledger queries take. ### Comments - I haven't reproduced the error, but it's worth merging these fixes anyway. - The new error, which seems to happen reliably on the same test case, is this: ``` [1 2021-01-05 17:33:29] Failures: [1 2021-01-05 17:33:29] [1 2021-01-05 17:33:29] src/Test/Integration/Scenario/API/Shelley/StakePools.hs:324:26: [1 2021-01-05 17:33:29] 1) API Specifications, SHELLEY_STAKE_POOLS, STAKE_POOLS_JOIN_01rewards - Can join a pool, earn rewards and collect them [1 2021-01-05 17:33:29] While verifying (Status {statusCode = 200, statusMessage = "OK"},Right (ApiWallet {id = ApiT {getApiT = WalletId {getWalletId = c83c7e41054d4f96238c00e1a8abf75c7a93f485}}, addressPoolGap = ApiT {getApiT = AddressPoolGap {getAddressPoolGap = 20}}, balance = ApiT {getApiT = WalletBalance {available = Quantity {getQuantity = 999998581353}, total = Quantity {getQuantity = 1000007589123}, reward = Quantity {getQuantity = 9007770}}}, delegation = ApiWalletDelegation {active = ApiWalletDelegationNext {status = Delegating, target = Just (ApiT {getApiT = PoolId {getPoolId = "\187\DC1L\179}u\250\ENQ&\ETX(\194\&5\163\218\226\149\163=\v\166t\165\235\RS>V\142"}}), changesAt = Nothing}, next = []}, name = ApiT {getApiT = WalletName {getWalletName = "Faucet Wallet"}}, passphrase = Just (ApiWalletPassphraseInfo {lastUpdatedAt = 2021-01-05 09:22:50.035321314 UTC}), state = ApiT {getApiT = Ready}, tip = ApiBlockReference {absoluteSlotNumber = ApiT {getApiT = SlotNo 5056}, slotId = ApiSlotId {epochNumber = ApiT {getApiT = EpochNo {unEpochNo = 101}}, slotNumber = ApiT {getApiT = SlotInEpoch {unSlotInEpoch = 6}}}, time = 2021-01-05 09:25:03.2 UTC, block = ApiBlockInfo {height = Quantity {getQuantity = 2533}}}})) [1 2021-01-05 17:33:29] Waited longer than 90s to resolve action: "Wallet has consumed rewards". [1 2021-01-05 17:33:29] expected: Quantity {getQuantity = 0} [1 2021-01-05 17:33:29] but got: Quantity {getQuantity = 9007770} [1 2021-01-05 17:33:29] [1 2021-01-05 17:33:29] To rerun use: --match "/API Specifications/SHELLEY_STAKE_POOLS/STAKE_POOLS_JOIN_01rewards - Can join a pool, earn rewards and collect them/" [1 2021-01-05 17:33:29] [1 2021-01-05 17:33:29] Randomized with seed 1938796546 [1 2021-01-05 17:33:29] [1 2021-01-05 17:33:29] Finished in 1514.5249 seconds [1 2021-01-05 17:33:29] 711 examples, 1 failure, 9 pending [1 2021-01-05 17:33:29] NO_CLEANUP of temporary directory /run/user/1000/test-06a07e3d6c739d45 real 25m15.583s user 27m19.699s sys 2m15.754s ``` The cause of this failure (according to the logs) is epoch rollover happening just after the rewards withdrawal transaction is submitted. New rewards accrue before the test case can check that previous rewards were withdrawn. Inherently flaky. Co-authored-by: Rodney Lorrimar <[email protected]>
2419: Fix flaky stake pools integration test r=Anviking a=rvl ### Issue Number Flaky test #2415 ### Overview This test failure seemed to be happening quite a lot the other day. Now I can't reproduce it. I am pretty sure the error is from somewhere in `stakeDistribution` which is used by `listPools`. My guess of the cause is that rollback causes ledger queries to temporarily fail. (confirmed: ADP-647) The error was never logged, and the exception message was not part of the HTTP 503 response. 1. [x] Fix some minor test environment issues. 2. [x] If a ledger query fails, make sure that the exception is logged. 3. [x] Correct the return type of `listPools` and functions that it calls. 4. [x] Reproduce error with logs, diagnose problem. Problem 1: unreproduceable. Problem 2: epoch rollover between tx submission and rewards balance check. 5. [x] Fix the ~code and/or~ test case. 6. [x] Add utility functions for transforming BracketLog traces into time deltas and logging timings. 7. [x] Add back logging of how long ledger queries take. ### Comments - I haven't reproduced the error, but it's worth merging these fixes anyway. - The new error, which seems to happen reliably on the same test case, is this: ``` [1 2021-01-05 17:33:29] Failures: [1 2021-01-05 17:33:29] [1 2021-01-05 17:33:29] src/Test/Integration/Scenario/API/Shelley/StakePools.hs:324:26: [1 2021-01-05 17:33:29] 1) API Specifications, SHELLEY_STAKE_POOLS, STAKE_POOLS_JOIN_01rewards - Can join a pool, earn rewards and collect them [1 2021-01-05 17:33:29] While verifying (Status {statusCode = 200, statusMessage = "OK"},Right (ApiWallet {id = ApiT {getApiT = WalletId {getWalletId = c83c7e41054d4f96238c00e1a8abf75c7a93f485}}, addressPoolGap = ApiT {getApiT = AddressPoolGap {getAddressPoolGap = 20}}, balance = ApiT {getApiT = WalletBalance {available = Quantity {getQuantity = 999998581353}, total = Quantity {getQuantity = 1000007589123}, reward = Quantity {getQuantity = 9007770}}}, delegation = ApiWalletDelegation {active = ApiWalletDelegationNext {status = Delegating, target = Just (ApiT {getApiT = PoolId {getPoolId = "\187\DC1L\179}u\250\ENQ&\ETX(\194\&5\163\218\226\149\163=\v\166t\165\235\RS>V\142"}}), changesAt = Nothing}, next = []}, name = ApiT {getApiT = WalletName {getWalletName = "Faucet Wallet"}}, passphrase = Just (ApiWalletPassphraseInfo {lastUpdatedAt = 2021-01-05 09:22:50.035321314 UTC}), state = ApiT {getApiT = Ready}, tip = ApiBlockReference {absoluteSlotNumber = ApiT {getApiT = SlotNo 5056}, slotId = ApiSlotId {epochNumber = ApiT {getApiT = EpochNo {unEpochNo = 101}}, slotNumber = ApiT {getApiT = SlotInEpoch {unSlotInEpoch = 6}}}, time = 2021-01-05 09:25:03.2 UTC, block = ApiBlockInfo {height = Quantity {getQuantity = 2533}}}})) [1 2021-01-05 17:33:29] Waited longer than 90s to resolve action: "Wallet has consumed rewards". [1 2021-01-05 17:33:29] expected: Quantity {getQuantity = 0} [1 2021-01-05 17:33:29] but got: Quantity {getQuantity = 9007770} [1 2021-01-05 17:33:29] [1 2021-01-05 17:33:29] To rerun use: --match "/API Specifications/SHELLEY_STAKE_POOLS/STAKE_POOLS_JOIN_01rewards - Can join a pool, earn rewards and collect them/" [1 2021-01-05 17:33:29] [1 2021-01-05 17:33:29] Randomized with seed 1938796546 [1 2021-01-05 17:33:29] [1 2021-01-05 17:33:29] Finished in 1514.5249 seconds [1 2021-01-05 17:33:29] 711 examples, 1 failure, 9 pending [1 2021-01-05 17:33:29] NO_CLEANUP of temporary directory /run/user/1000/test-06a07e3d6c739d45 real 25m15.583s user 27m19.699s sys 2m15.754s ``` The cause of this failure (according to the logs) is epoch rollover happening just after the rewards withdrawal transaction is submitted. New rewards accrue before the test case can check that previous rewards were withdrawn. Inherently flaky. Co-authored-by: Rodney Lorrimar <[email protected]>
2419: Fix flaky stake pools integration test r=Anviking a=rvl ### Issue Number Flaky test #2415 ### Overview This test failure seemed to be happening quite a lot the other day. Now I can't reproduce it. I am pretty sure the error is from somewhere in `stakeDistribution` which is used by `listPools`. My guess of the cause is that rollback causes ledger queries to temporarily fail. (confirmed: ADP-647) The error was never logged, and the exception message was not part of the HTTP 503 response. 1. [x] Fix some minor test environment issues. 2. [x] If a ledger query fails, make sure that the exception is logged. 3. [x] Correct the return type of `listPools` and functions that it calls. 4. [x] Reproduce error with logs, diagnose problem. Problem 1: unreproduceable. Problem 2: epoch rollover between tx submission and rewards balance check. 5. [x] Fix the ~code and/or~ test case. 6. [x] Add utility functions for transforming BracketLog traces into time deltas and logging timings. 7. [x] Add back logging of how long ledger queries take. ### Comments - I haven't reproduced the error, but it's worth merging these fixes anyway. - The new error, which seems to happen reliably on the same test case, is this: ``` [1 2021-01-05 17:33:29] Failures: [1 2021-01-05 17:33:29] [1 2021-01-05 17:33:29] src/Test/Integration/Scenario/API/Shelley/StakePools.hs:324:26: [1 2021-01-05 17:33:29] 1) API Specifications, SHELLEY_STAKE_POOLS, STAKE_POOLS_JOIN_01rewards - Can join a pool, earn rewards and collect them [1 2021-01-05 17:33:29] While verifying (Status {statusCode = 200, statusMessage = "OK"},Right (ApiWallet {id = ApiT {getApiT = WalletId {getWalletId = c83c7e41054d4f96238c00e1a8abf75c7a93f485}}, addressPoolGap = ApiT {getApiT = AddressPoolGap {getAddressPoolGap = 20}}, balance = ApiT {getApiT = WalletBalance {available = Quantity {getQuantity = 999998581353}, total = Quantity {getQuantity = 1000007589123}, reward = Quantity {getQuantity = 9007770}}}, delegation = ApiWalletDelegation {active = ApiWalletDelegationNext {status = Delegating, target = Just (ApiT {getApiT = PoolId {getPoolId = "\187\DC1L\179}u\250\ENQ&\ETX(\194\&5\163\218\226\149\163=\v\166t\165\235\RS>V\142"}}), changesAt = Nothing}, next = []}, name = ApiT {getApiT = WalletName {getWalletName = "Faucet Wallet"}}, passphrase = Just (ApiWalletPassphraseInfo {lastUpdatedAt = 2021-01-05 09:22:50.035321314 UTC}), state = ApiT {getApiT = Ready}, tip = ApiBlockReference {absoluteSlotNumber = ApiT {getApiT = SlotNo 5056}, slotId = ApiSlotId {epochNumber = ApiT {getApiT = EpochNo {unEpochNo = 101}}, slotNumber = ApiT {getApiT = SlotInEpoch {unSlotInEpoch = 6}}}, time = 2021-01-05 09:25:03.2 UTC, block = ApiBlockInfo {height = Quantity {getQuantity = 2533}}}})) [1 2021-01-05 17:33:29] Waited longer than 90s to resolve action: "Wallet has consumed rewards". [1 2021-01-05 17:33:29] expected: Quantity {getQuantity = 0} [1 2021-01-05 17:33:29] but got: Quantity {getQuantity = 9007770} [1 2021-01-05 17:33:29] [1 2021-01-05 17:33:29] To rerun use: --match "/API Specifications/SHELLEY_STAKE_POOLS/STAKE_POOLS_JOIN_01rewards - Can join a pool, earn rewards and collect them/" [1 2021-01-05 17:33:29] [1 2021-01-05 17:33:29] Randomized with seed 1938796546 [1 2021-01-05 17:33:29] [1 2021-01-05 17:33:29] Finished in 1514.5249 seconds [1 2021-01-05 17:33:29] 711 examples, 1 failure, 9 pending [1 2021-01-05 17:33:29] NO_CLEANUP of temporary directory /run/user/1000/test-06a07e3d6c739d45 real 25m15.583s user 27m19.699s sys 2m15.754s ``` The cause of this failure (according to the logs) is epoch rollover happening just after the rewards withdrawal transaction is submitted. New rewards accrue before the test case can check that previous rewards were withdrawn. Inherently flaky. Co-authored-by: Rodney Lorrimar <[email protected]>
2419: Fix flaky stake pools integration test r=Anviking a=rvl ### Issue Number Flaky test #2415 ### Overview This test failure seemed to be happening quite a lot the other day. Now I can't reproduce it. I am pretty sure the error is from somewhere in `stakeDistribution` which is used by `listPools`. My guess of the cause is that rollback causes ledger queries to temporarily fail. (confirmed: ADP-647) The error was never logged, and the exception message was not part of the HTTP 503 response. 1. [x] Fix some minor test environment issues. 2. [x] If a ledger query fails, make sure that the exception is logged. 3. [x] Correct the return type of `listPools` and functions that it calls. 4. [x] Reproduce error with logs, diagnose problem. Problem 1: unreproduceable. Problem 2: epoch rollover between tx submission and rewards balance check. 5. [x] Fix the ~code and/or~ test case. 6. [x] Add utility functions for transforming BracketLog traces into time deltas and logging timings. 7. [x] Add back logging of how long ledger queries take. ### Comments - I haven't reproduced the error, but it's worth merging these fixes anyway. - The new error, which seems to happen reliably on the same test case, is this: ``` [1 2021-01-05 17:33:29] Failures: [1 2021-01-05 17:33:29] [1 2021-01-05 17:33:29] src/Test/Integration/Scenario/API/Shelley/StakePools.hs:324:26: [1 2021-01-05 17:33:29] 1) API Specifications, SHELLEY_STAKE_POOLS, STAKE_POOLS_JOIN_01rewards - Can join a pool, earn rewards and collect them [1 2021-01-05 17:33:29] While verifying (Status {statusCode = 200, statusMessage = "OK"},Right (ApiWallet {id = ApiT {getApiT = WalletId {getWalletId = c83c7e41054d4f96238c00e1a8abf75c7a93f485}}, addressPoolGap = ApiT {getApiT = AddressPoolGap {getAddressPoolGap = 20}}, balance = ApiT {getApiT = WalletBalance {available = Quantity {getQuantity = 999998581353}, total = Quantity {getQuantity = 1000007589123}, reward = Quantity {getQuantity = 9007770}}}, delegation = ApiWalletDelegation {active = ApiWalletDelegationNext {status = Delegating, target = Just (ApiT {getApiT = PoolId {getPoolId = "\187\DC1L\179}u\250\ENQ&\ETX(\194\&5\163\218\226\149\163=\v\166t\165\235\RS>V\142"}}), changesAt = Nothing}, next = []}, name = ApiT {getApiT = WalletName {getWalletName = "Faucet Wallet"}}, passphrase = Just (ApiWalletPassphraseInfo {lastUpdatedAt = 2021-01-05 09:22:50.035321314 UTC}), state = ApiT {getApiT = Ready}, tip = ApiBlockReference {absoluteSlotNumber = ApiT {getApiT = SlotNo 5056}, slotId = ApiSlotId {epochNumber = ApiT {getApiT = EpochNo {unEpochNo = 101}}, slotNumber = ApiT {getApiT = SlotInEpoch {unSlotInEpoch = 6}}}, time = 2021-01-05 09:25:03.2 UTC, block = ApiBlockInfo {height = Quantity {getQuantity = 2533}}}})) [1 2021-01-05 17:33:29] Waited longer than 90s to resolve action: "Wallet has consumed rewards". [1 2021-01-05 17:33:29] expected: Quantity {getQuantity = 0} [1 2021-01-05 17:33:29] but got: Quantity {getQuantity = 9007770} [1 2021-01-05 17:33:29] [1 2021-01-05 17:33:29] To rerun use: --match "/API Specifications/SHELLEY_STAKE_POOLS/STAKE_POOLS_JOIN_01rewards - Can join a pool, earn rewards and collect them/" [1 2021-01-05 17:33:29] [1 2021-01-05 17:33:29] Randomized with seed 1938796546 [1 2021-01-05 17:33:29] [1 2021-01-05 17:33:29] Finished in 1514.5249 seconds [1 2021-01-05 17:33:29] 711 examples, 1 failure, 9 pending [1 2021-01-05 17:33:29] NO_CLEANUP of temporary directory /run/user/1000/test-06a07e3d6c739d45 real 25m15.583s user 27m19.699s sys 2m15.754s ``` The cause of this failure (according to the logs) is epoch rollover happening just after the rewards withdrawal transaction is submitted. New rewards accrue before the test case can check that previous rewards were withdrawn. Inherently flaky. Co-authored-by: Rodney Lorrimar <[email protected]>
I see
locally a lot now |
I think this new failure may be because I added a That changes the start time of the later test case Then I think these two lines are in the wrong order:
and/or the expected value |
I think it shouldn't make a difference actually. 🤔
It is a bit awkward, but they are OK I think. It might as well be:
and then I'm seeing this failure also locally though. |
Btw, I believe I saw it
|
Should hopefully be fixed by #2698 Can re-open otherwise. Note: the issue in the ticket description is technically different from all recently logger errors, but looking past it as it's really old. |
Context
Test Case
STAKE_POOLS_JOIN_01rewards
Failure / Counter-example
The text was updated successfully, but these errors were encountered: