nomination-pools try-state hook fails on Kusama #158

liamaharon · 2023-07-02T10:34:47Z

Description of bug

A try-state check is failing on Kusama

Failing invariant:

https://github.com/paritytech/substrate/blob/9e8b2f63fb57c5d0fd45da1478508a4509d530e6/frame/nomination-pools/src/lib.rs#L3135-L3139

Polkadot also has a failing invariant:

https://github.com/paritytech/substrate/blob/9e8b2f63fb57c5d0fd45da1478508a4509d530e6/frame/collective/src/lib.rs#L1038-L1041

Reproduce

Clone polkadot and substrate into the same dir
Sync Kusama (or use a remote node on the next step)
Execute

cd ../polkadot && cargo build -r -p kusama-runtime --features try-runtime && cd ../substrate && cargo run -r --features=try-runtime try-runtime --runtime ../polkadot/target/release/wbuild/kusama-runtime/kusama_runtime.wasm execute-block live --uri ws://localhost:9944

TODO

also check polkadot/westend/rococo

The text was updated successfully, but these errors were encountered:

liamaharon · 2023-07-02T10:35:20Z

cc @Szegoo

ggwpez · 2023-07-02T21:31:09Z

I think ED was actually changed 10x at some point recently. I think this one paritytech/polkadot#6372, but it needs a bit closer investigation first. @rossbulat

rossbulat · 2023-07-03T11:02:13Z

I think ED was actually changed 10x at some point recently. I think this one paritytech/polkadot#6372, but it needs a bit closer investigation first. @rossbulat

So the hypothesis is that because ED on Kusama has increased 10x, pool reward accounts are now reserving more balance and therefore the actual pending rewards balance is less than the current reward counter?

if this is correct, I imagine an OpenGov referendum to inject balance from the treasury into each pool would fix this?

Reached out to @liamaharon, easiest way to check this (i think) would be to send a small amount of KSM to pool 40's reward pool (first one on logs), re-run try-state, & verify pool 40 no longer contains the warning.

liamaharon · 2023-07-03T13:24:22Z

Verified your transfer of 0.1 KSM fixed nom pool 40 @rossbulat.

We could fix this by OpenGov or in a runtime upgrade, either is fine I think.

rossbulat · 2023-07-03T18:43:07Z

Verified your transfer of 0.1 KSM fixed nom pool 40 @rossbulat.

We could fix this by OpenGov or in a runtime upgrade, either is fine I think.

I sent all the reward pools the ED amount: https://kusama.subscan.io/extrinsic/0x74dcbb20df2d8e2548cbe93d251ed290259506d146c145d04841fdea7b6bff7b

Hopefully this will resolve the runtime errors, please let me know.

liamaharon · 2023-07-04T16:28:55Z

Almost! Looks like pool 1 still needs a few planks

rossbulat · 2023-07-04T17:01:35Z

Ah perhaps that once was missed, just topped it up.

liamaharon · 2023-07-06T12:38:26Z

Hmm, both pending rewards and remaining balance have decreased since my last screenshot (and your top up) and the pending are > remaining balance again. Any ideas what may be going on?

kianenigma · 2023-07-27T17:05:50Z

The underlying issue, if real and something beyond the change to ED should be looked into by @paritytech/staking-core.

Ank4n · 2023-07-27T19:03:46Z

For the hypothesis that this might be due to ED increase, if we add 300_000_000 or 0.0003 KSM (the increase in ED) to the reward account, the remaining should go above the pending rewards.

Ross's top up should have fixed this. Funny thing is the difference between pending rewards and remaining balance comes up to 299999712 which is very close to theED increase and would perfectly satisfy our assumption if not for the Ross's topup. I will have a deeper look into this.

Ank4n · 2023-08-09T16:29:31Z

I have reproduced this issue in a test here. Would appreciate if someone can review this and make sure everything looks good and makes sense.

Why this happened?

Pending reward calculation of a delegator depends on the current_reward_counter of the pool. Due to a bug in the logic (affected by ED increase), the pool thinks it has not paid some rewards (equal or less than the ED diff) while it has already been paid earlier. Subsequently, the current_reward_counter value is higher than what it should be leading to delegators getting higher reward than expected. Following are the affected pools:

Pool	Σ Pending Rewards	Remaining Balance	Deficit
1	81752695034751	81752395035058	299_999_693
3	21637165950776	21637060488137	105_462_639
18	95674419419550	95674119419827	299_999_723
36	27038771498775	27038689155351	82_343_424
82	114205871664692	114205571664914	299_999_778

What was the bug?

The current_reward_counter is calculated from two parts, the last_recorded_reward_counter and the second part that depends on last_recorded_total_payouts. Both last_recorded_reward_counter and last_recorded_total_payouts are updated when there is change in pool points.

Both these values, last_recorded_reward_counter and last_recorded_total_payouts should be non-decreasing, monotonic values. In this specific case where ED on Kusama increased from 33_333_333 to 333_333_333, the last_recorded_total_payouts decreases to a value upto the difference in ED values ($3 \times 10^8$ KSM). How much is the decrease depends on how much rewards were paid to the reward account between the time of ED increase and when the pool points changed that leads to update of RewardPool values. This can be seen in the test here.

The culprit that caused this discrepancy was this line which roughly does the following:

last_recorded_total_payouts = free_balance - min_balance + total_rewards_claimed

Since the min_balance increases in our case, the last_recorded_total_payouts is overwritten and could decrease unless new free_balance has arrived since the ED increase in form of staking rewards that can cover the ED difference.

The current_reward_counter is calculated using following formulas. As you can see, decrease in last_recorded_total_payouts would increase the current_payout_balance and consecutively increase the current_reward_counter and the pending rewards for the delegators in the pool.

current_payout_balance = free_balance - min_balance + total_rewards_claimed - last_recorded_total_payouts

current_reward_counter = (current_payout_balance/pool_points) + last_recorded_reward_counter}

Note: Above formulas is simplified and ignores commission. None of the affected pools has commission so we can safely ignore it.

Would topping up the reward account help?

No. Any top up can be seen equivalent to staking rewards for a pool and will be distributed equally among all delegators of the pool leaving roughly the same deficit as before.

This is shown in the test as well.

What's the fix

Increasing the last_recorded_total_payouts to the amount of deficit would fix this issue.

Demonstrated in the test.

We should also fix how last_recorded_total_payouts is updated to make sure its never lower than the last value. It will prevent this issue from happening in future.

If ED is decreased, would that also cause an issue?

Decrease is fine since the difference is treated as extra rewards for the pool and is distributed among all delegators.

Ank4n · 2023-08-10T10:26:55Z

Zoomed out view of the issue

The increase in ED (min_balance) decreases the rewardable balance for every pool, the balance that pool thinks has already been rewarded. For some pools where we don't see this error, it would happen if new staking reward (or manual top-up by Ross) covered it before the reward pool was updated post the increase in ED.

In ideal scenario, the ED balance is covered by the creator while creating the pool. The manual top up also does not work if reward counter has already been updated since the ED change resulting in deficit.

Proposed fix

Add a new permissionless extrinsic to top up a pool. This top up would not be rewardable (by incrementing total_rewards_claimed with the top up). This would mean we will not require any logic changes on how last_recorded_total_payouts are updated.

Good thing about this is future increases in ED could be handled gracefully via this without any genuine reward lost by pool members.

Runtime upgrade to increment every pool's 'total_reward_claimed' to the change in ED. This will make every pool cover the deficit with staking rewards.

Bad thing about this: We are essentially claiming part of rewards of the pool without it being distributed to the pool members.
Why it still might be okay? Given that Ross already topped up each pool with new ED (which is seen as reward to the pool as well), incrementing total_reward_claimed does not mean any real loss of staking reward (just potentially loss of Ross rewards :D). Its also a very small amount so hopefully nobody minds but we can do another ED_diff top-up after this runtime upgrade via a governance proposal so there is no unfairness.

Do nothing. The deficit is not a lot to cause any practical problems. We can modify the state checks to tolerate differences up to the current deficit value (very ugly).

liamaharon · 2023-08-22T09:27:30Z

hey @Ank4n any progress on this? If we're not sure which approach maybe we can ask in element or the next frame steering

rossbulat · 2023-08-22T10:07:04Z

I am in favour of the runtime upgrade fix 👍

Ank4n · 2023-08-22T18:17:30Z

hey @Ank4n any progress on this? If we're not sure which approach maybe we can ask in element or the next frame steering

None of the proposed solution is perfect tbh. I am waiting to discuss this with the Staking team, but great suggestion, I can bring this up in the next frame steering.

kianenigma · 2023-08-23T10:21:12Z

A runtime upgrade is a one-off fix, and will require us to do more work the next time ED changes. Given this, I am in favor of a permissionelss transaction that can be executed with bots.

Bad thing about this: We are essentially claiming part of rewards of the pool without it being distributed to the pool members.
Why it still might be okay

Note that the ED that is put into any reward account is technically the responsibility of the pool operator to keep in there. So, regardless of the approach, the right flow, assuming you make the permissionless transaction fn fix_pool_reward_account, is this:

If the pool operator can afford the new ED without their account being destroyed, they should pay the deficit.
If not, the origin who called the transaction will pay the deficit.

In principle, this is similar to MinCreateBond changing (#444) as well; If it changes, anyone should be allowed to either force the pool operator to pay the deficit, or pay themselves.

Ank4n · 2023-08-24T09:45:04Z

If the pool operator can afford the new ED without their account being destroyed, they should pay the deficit.

Part of deficit was repaid by the staking rewards (depending on when the pool points changed after ED update). Technically this means even pools which are not showing deficit, should get a top up that comes from the pool operator. It might be difficult to do now since we don't track the original ED when the pool was created. But the real fix should start tracking it for future changes. It should also allow unbonding the difference by the depositor if ED has decreased.

For the existing pools that have some deficit, may be we don't have to get the perfect solution out and just get the current deficit fixed. This could be a dispatchable function doing what try state is doing, calculate reward of each pool member and hence the deficit in the reward account and allow a permissionless topup. Does that make sense?

I agree this is in principal needs similar solution as MinCreateBond.

Ank4n · 2023-08-24T09:46:13Z

Also note that there is another way to fix the issue for future (which I personally like) but it won't fix the deficits of the existing pools. If we never allow last_recorded_total_payouts to decrease (which makes sense, we can never take back what's already paid out), then the current reward counter would not jump abruptly, and any future staking rewards would go on towards first fixing the deficit. I tried this solution here (its commented out but if we put it in code, the deficit gets fixed automatically).

kianenigma · 2023-09-04T10:21:12Z

This could be a dispatchable function doing what try state is doing, calculate reward of each pool member and hence the deficit in the reward account and allow a permissionless topup. Does that make sense?

We certainly cannot iterate over all pool members of a pool in a dispatchable.

kianenigma · 2023-09-04T10:23:34Z

Also note that there is another way to fix the issue for future (which I personally like) but it won't fix the deficits of the existing pools. If we never allow last_recorded_total_payouts to decrease (which makes sense, we can never take back what's already paid out), then the current reward counter would not jump abruptly, and any future staking rewards would go on towards first fixing the deficit. I tried this solution here (its commented out but if we put it in code, the deficit gets fixed automatically).

This is an interesting fix and one that we can explore. If this is the long term approach, I agree that a one-off migration might be better and then we won't deal with this ever again. If not, I am still of the opinion that a one-off migration is too short sighted.

Closes paritytech/polkadot-sdk#158 In our last FRAME call it was discussed that a likely solution to the ED imbalances is lazily fixing the pools as they are interacted with. So, we should add some tiny tolerance to the try-state checks so next time there's an ED change they don't start failing until they've all been interacted with. ### Update 12 Sept Rather than adding tolerance, have replaced the `ensure` with a warning. --------- Co-authored-by: Ankan <[email protected]>

Ank4n · 2023-09-16T18:34:56Z

#1236 helps avoid try-state erroring because of the deficit in reward. But the actual fix for topping up the reward deficit as well as having a way in future to manage changes in ED will be taken care in this PR.

closes #158. partially addresses #226. Instead of fragile calculation of current balance by looking at `free balance - ED`, Nomination Pool now freezes ED in the pool reward account to restrict an account from going below minimum balance. This also has a nice side effect that if ED changes, we know how much is the imbalance in ED frozen in the pool and the current required ED. A pool operator can diligently top up the pool with the deficit in ED or vice versa, withdraw the excess they transferred to the pool. ## Notable changes - New call `adjust_pool_deposit`: Allows to top up the deficit or withdraw the excess deposited funds to the pool. - Uses Fungible trait (instead of Currency trait). Since NP was not doing any locking/reserving previously, no migration is needed for this. - One time migration of freezing ED from each of the existing pools (not very PoV friendly but fine for relay chain).

Closes paritytech#158 In our last FRAME call it was discussed that a likely solution to the ED imbalances is lazily fixing the pools as they are interacted with. So, we should add some tiny tolerance to the try-state checks so next time there's an ED change they don't start failing until they've all been interacted with. ### Update 12 Sept Rather than adding tolerance, have replaced the `ensure` with a warning. --------- Co-authored-by: Ankan <[email protected]>

closes paritytech#158. partially addresses paritytech#226. Instead of fragile calculation of current balance by looking at `free balance - ED`, Nomination Pool now freezes ED in the pool reward account to restrict an account from going below minimum balance. This also has a nice side effect that if ED changes, we know how much is the imbalance in ED frozen in the pool and the current required ED. A pool operator can diligently top up the pool with the deficit in ED or vice versa, withdraw the excess they transferred to the pool. ## Notable changes - New call `adjust_pool_deposit`: Allows to top up the deficit or withdraw the excess deposited funds to the pool. - Uses Fungible trait (instead of Currency trait). Since NP was not doing any locking/reserving previously, no migration is needed for this. - One time migration of freezing ED from each of the existing pools (not very PoV friendly but fine for relay chain).

* exchange benchmarks: framework * updated comment about tx size Co-authored-by: Tomasz Drwięga <[email protected]>

liamaharon changed the title ~~nomination-pools try-state hook fails on Kusama state~~ nomination-pools try-state hook fails on Kusama Jul 2, 2023

liamaharon mentioned this issue Jul 4, 2023

nom-pool try-state: improve logs paritytech/substrate#14513

Merged

liamaharon self-assigned this Jul 27, 2023

kianenigma unassigned liamaharon Jul 27, 2023

liamaharon mentioned this issue Aug 1, 2023

docs: example of using cargo run with features paritytech/substrate#14696

Merged

Ank4n self-assigned this Aug 3, 2023

juangirini added the T1-runtime label Aug 24, 2023

juangirini transferred this issue from paritytech/substrate Aug 24, 2023

the-right-joyce added T1-FRAME This PR/Issue is related to core FRAME, the framework. and removed T1-runtime labels Aug 25, 2023

liamaharon mentioned this issue Aug 29, 2023

Add tolerance to nom pool pending rewards try-state #1236

Merged

Ank4n mentioned this issue Aug 29, 2023

[NPoS] Fix for Reward Deficit in the pool #1255

Merged

liamaharon closed this as completed in #1236 Sep 14, 2023

github-project-automation bot moved this from In Progress to Done in Runtime / FRAME Sep 14, 2023

Ank4n reopened this Sep 16, 2023

github-project-automation bot moved this from Done to Backlog in Runtime / FRAME Sep 16, 2023

gavofyork closed this as completed Sep 22, 2023

github-project-automation bot moved this from Backlog to Done in Runtime / FRAME Sep 22, 2023

liamaharon reopened this Sep 22, 2023

github-project-automation bot moved this from Done to Backlog in Runtime / FRAME Sep 22, 2023

Ank4n closed this as completed in #1255 Sep 29, 2023

github-project-automation bot moved this from Backlog to Done in Runtime / FRAME Sep 29, 2023

liamaharon mentioned this issue Oct 25, 2023

Broken Staking try-state invariants on Westend #2020

Closed

4 tasks

serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 8, 2024

Exchange pallet benchmarks (paritytech#158)

66616a8

* exchange benchmarks: framework * updated comment about tx size Co-authored-by: Tomasz Drwięga <[email protected]>

serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 8, 2024

Exchange pallet benchmarks (paritytech#158)

8345315

* exchange benchmarks: framework * updated comment about tx size Co-authored-by: Tomasz Drwięga <[email protected]>

serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 9, 2024

Exchange pallet benchmarks (paritytech#158)

c063a69

* exchange benchmarks: framework * updated comment about tx size Co-authored-by: Tomasz Drwięga <[email protected]>

serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 9, 2024

Exchange pallet benchmarks (paritytech#158)

5f82c5e

* exchange benchmarks: framework * updated comment about tx size Co-authored-by: Tomasz Drwięga <[email protected]>

serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 9, 2024

Exchange pallet benchmarks (paritytech#158)

1036fc0

* exchange benchmarks: framework * updated comment about tx size Co-authored-by: Tomasz Drwięga <[email protected]>

serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 9, 2024

Exchange pallet benchmarks (paritytech#158)

3b897d5

* exchange benchmarks: framework * updated comment about tx size Co-authored-by: Tomasz Drwięga <[email protected]>

serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 9, 2024

Exchange pallet benchmarks (paritytech#158)

889432c

* exchange benchmarks: framework * updated comment about tx size Co-authored-by: Tomasz Drwięga <[email protected]>

serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 9, 2024

Exchange pallet benchmarks (paritytech#158)

4a1fbc5

* exchange benchmarks: framework * updated comment about tx size Co-authored-by: Tomasz Drwięga <[email protected]>

serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 10, 2024

Exchange pallet benchmarks (paritytech#158)

bd06269

* exchange benchmarks: framework * updated comment about tx size Co-authored-by: Tomasz Drwięga <[email protected]>

serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 10, 2024

Exchange pallet benchmarks (paritytech#158)

191b488

* exchange benchmarks: framework * updated comment about tx size Co-authored-by: Tomasz Drwięga <[email protected]>

jonathanudd pushed a commit to jonathanudd/polkadot-sdk that referenced this issue Apr 10, 2024

Exchange pallet benchmarks (paritytech#158)

00bd13f

* exchange benchmarks: framework * updated comment about tx size Co-authored-by: Tomasz Drwięga <[email protected]>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

nomination-pools try-state hook fails on Kusama #158

nomination-pools try-state hook fails on Kusama #158

liamaharon commented Jul 2, 2023 •

edited

Loading

liamaharon commented Jul 2, 2023

ggwpez commented Jul 2, 2023 •

edited

Loading

rossbulat commented Jul 3, 2023 •

edited

Loading

liamaharon commented Jul 3, 2023

rossbulat commented Jul 3, 2023 •

edited

Loading

liamaharon commented Jul 4, 2023

rossbulat commented Jul 4, 2023

liamaharon commented Jul 6, 2023

kianenigma commented Jul 27, 2023

Ank4n commented Jul 27, 2023

Ank4n commented Aug 9, 2023

Ank4n commented Aug 10, 2023 •

edited

Loading

liamaharon commented Aug 22, 2023 •

edited

Loading

rossbulat commented Aug 22, 2023

Ank4n commented Aug 22, 2023

kianenigma commented Aug 23, 2023

Ank4n commented Aug 24, 2023

Ank4n commented Aug 24, 2023

kianenigma commented Sep 4, 2023

kianenigma commented Sep 4, 2023

Ank4n commented Sep 16, 2023

nomination-pools try-state hook fails on Kusama #158

nomination-pools try-state hook fails on Kusama #158

Comments

liamaharon commented Jul 2, 2023 • edited Loading

Description of bug

Polkadot also has a failing invariant:

Reproduce

TODO

liamaharon commented Jul 2, 2023

ggwpez commented Jul 2, 2023 • edited Loading

rossbulat commented Jul 3, 2023 • edited Loading

liamaharon commented Jul 3, 2023

rossbulat commented Jul 3, 2023 • edited Loading

liamaharon commented Jul 4, 2023

rossbulat commented Jul 4, 2023

liamaharon commented Jul 6, 2023

kianenigma commented Jul 27, 2023

Ank4n commented Jul 27, 2023

Ank4n commented Aug 9, 2023

Why this happened?

What was the bug?

Would topping up the reward account help?

What's the fix

If ED is decreased, would that also cause an issue?

Ank4n commented Aug 10, 2023 • edited Loading

Zoomed out view of the issue

Proposed fix

liamaharon commented Aug 22, 2023 • edited Loading

rossbulat commented Aug 22, 2023

Ank4n commented Aug 22, 2023

kianenigma commented Aug 23, 2023

Ank4n commented Aug 24, 2023

Ank4n commented Aug 24, 2023

kianenigma commented Sep 4, 2023

kianenigma commented Sep 4, 2023

Ank4n commented Sep 16, 2023

liamaharon commented Jul 2, 2023 •

edited

Loading

ggwpez commented Jul 2, 2023 •

edited

Loading

rossbulat commented Jul 3, 2023 •

edited

Loading

rossbulat commented Jul 3, 2023 •

edited

Loading

Ank4n commented Aug 10, 2023 •

edited

Loading

liamaharon commented Aug 22, 2023 •

edited

Loading