Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nomination-pools try-state hook fails on Kusama #158

Closed
1 task
liamaharon opened this issue Jul 2, 2023 · 21 comments · Fixed by #1236 or #1255
Closed
1 task

nomination-pools try-state hook fails on Kusama #158

liamaharon opened this issue Jul 2, 2023 · 21 comments · Fixed by #1236 or #1255
Assignees
Labels
T1-FRAME This PR/Issue is related to core FRAME, the framework.

Comments

@liamaharon
Copy link
Contributor

liamaharon commented Jul 2, 2023

Description of bug

A try-state check is failing on Kusama

Screenshot 2023-07-02 at 12 31 26

Failing invariant:

https://github.com/paritytech/substrate/blob/9e8b2f63fb57c5d0fd45da1478508a4509d530e6/frame/nomination-pools/src/lib.rs#L3135-L3139

Polkadot also has a failing invariant:

https://github.com/paritytech/substrate/blob/9e8b2f63fb57c5d0fd45da1478508a4509d530e6/frame/collective/src/lib.rs#L1038-L1041

Reproduce

  1. Clone polkadot and substrate into the same dir

  2. Sync Kusama (or use a remote node on the next step)

  3. Execute

cd ../polkadot && cargo build -r -p kusama-runtime --features try-runtime && cd ../substrate && cargo run -r --features=try-runtime try-runtime --runtime ../polkadot/target/release/wbuild/kusama-runtime/kusama_runtime.wasm execute-block live --uri ws://localhost:9944

TODO

  • also check polkadot/westend/rococo
@liamaharon
Copy link
Contributor Author

cc @Szegoo

@liamaharon liamaharon changed the title nomination-pools try-state hook fails on Kusama state nomination-pools try-state hook fails on Kusama Jul 2, 2023
@ggwpez
Copy link
Member

ggwpez commented Jul 2, 2023

I think ED was actually changed 10x at some point recently. I think this one paritytech/polkadot#6372, but it needs a bit closer investigation first. @rossbulat

@rossbulat
Copy link

rossbulat commented Jul 3, 2023

I think ED was actually changed 10x at some point recently. I think this one paritytech/polkadot#6372, but it needs a bit closer investigation first. @rossbulat

So the hypothesis is that because ED on Kusama has increased 10x, pool reward accounts are now reserving more balance and therefore the actual pending rewards balance is less than the current reward counter?

if this is correct, I imagine an OpenGov referendum to inject balance from the treasury into each pool would fix this?

Reached out to @liamaharon, easiest way to check this (i think) would be to send a small amount of KSM to pool 40's reward pool (first one on logs), re-run try-state, & verify pool 40 no longer contains the warning.

@liamaharon
Copy link
Contributor Author

Verified your transfer of 0.1 KSM fixed nom pool 40 @rossbulat.

We could fix this by OpenGov or in a runtime upgrade, either is fine I think.

@rossbulat
Copy link

rossbulat commented Jul 3, 2023

Verified your transfer of 0.1 KSM fixed nom pool 40 @rossbulat.

We could fix this by OpenGov or in a runtime upgrade, either is fine I think.

I sent all the reward pools the ED amount: https://kusama.subscan.io/extrinsic/0x74dcbb20df2d8e2548cbe93d251ed290259506d146c145d04841fdea7b6bff7b

Hopefully this will resolve the runtime errors, please let me know.

@liamaharon
Copy link
Contributor Author

Screenshot 2023-07-04 at 18 28 10

Almost! Looks like pool 1 still needs a few planks

@rossbulat
Copy link

Ah perhaps that once was missed, just topped it up.

@liamaharon
Copy link
Contributor Author

Hmm, both pending rewards and remaining balance have decreased since my last screenshot (and your top up) and the pending are > remaining balance again. Any ideas what may be going on?

Screenshot 2023-07-06 at 14 37 58

@liamaharon liamaharon self-assigned this Jul 27, 2023
@kianenigma
Copy link
Contributor

The underlying issue, if real and something beyond the change to ED should be looked into by @paritytech/staking-core.

@Ank4n
Copy link
Contributor

Ank4n commented Jul 27, 2023

For the hypothesis that this might be due to ED increase, if we add 300_000_000 or 0.0003 KSM (the increase in ED) to the reward account, the remaining should go above the pending rewards.

Screenshot 2023-07-06 at 14 37 58

Ross's top up should have fixed this. Funny thing is the difference between pending rewards and remaining balance comes up to 299999712 which is very close to theED increase and would perfectly satisfy our assumption if not for the Ross's topup. I will have a deeper look into this.

@Ank4n
Copy link
Contributor

Ank4n commented Aug 9, 2023

I have reproduced this issue in a test here. Would appreciate if someone can review this and make sure everything looks good and makes sense.

Why this happened?

Pending reward calculation of a delegator depends on the current_reward_counter of the pool. Due to a bug in the logic (affected by ED increase), the pool thinks it has not paid some rewards (equal or less than the ED diff) while it has already been paid earlier. Subsequently, the current_reward_counter value is higher than what it should be leading to delegators getting higher reward than expected. Following are the affected pools:

Pool Σ Pending Rewards Remaining Balance Deficit
1 81752695034751 81752395035058 299_999_693
3 21637165950776 21637060488137 105_462_639
18 95674419419550 95674119419827 299_999_723
36 27038771498775 27038689155351 82_343_424
82 114205871664692 114205571664914 299_999_778

What was the bug?

The current_reward_counter is calculated from two parts, the last_recorded_reward_counter and the second part that depends on last_recorded_total_payouts. Both last_recorded_reward_counter and last_recorded_total_payouts are updated when there is change in pool points.

Both these values, last_recorded_reward_counter and last_recorded_total_payouts should be non-decreasing, monotonic values. In this specific case where ED on Kusama increased from 33_333_333 to 333_333_333, the last_recorded_total_payouts decreases to a value upto the difference in ED values ($3 \times 10^8$ KSM). How much is the decrease depends on how much rewards were paid to the reward account between the time of ED increase and when the pool points changed that leads to update of RewardPool values. This can be seen in the test here.

The culprit that caused this discrepancy was this line which roughly does the following:

last_recorded_total_payouts = free_balance - min_balance + total_rewards_claimed

Since the min_balance increases in our case, the last_recorded_total_payouts is overwritten and could decrease unless new free_balance has arrived since the ED increase in form of staking rewards that can cover the ED difference.

The current_reward_counter is calculated using following formulas. As you can see, decrease in last_recorded_total_payouts would increase the current_payout_balance and consecutively increase the current_reward_counter and the pending rewards for the delegators in the pool.

current_payout_balance = free_balance - min_balance + total_rewards_claimed - last_recorded_total_payouts
current_reward_counter = (current_payout_balance/pool_points) + last_recorded_reward_counter}

Note: Above formulas is simplified and ignores commission. None of the affected pools has commission so we can safely ignore it.

Would topping up the reward account help?

No. Any top up can be seen equivalent to staking rewards for a pool and will be distributed equally among all delegators of the pool leaving roughly the same deficit as before.

This is shown in the test as well.

What's the fix

Increasing the last_recorded_total_payouts to the amount of deficit would fix this issue.

Demonstrated in the test.

We should also fix how last_recorded_total_payouts is updated to make sure its never lower than the last value. It will prevent this issue from happening in future.

If ED is decreased, would that also cause an issue?

Decrease is fine since the difference is treated as extra rewards for the pool and is distributed among all delegators.

@Ank4n
Copy link
Contributor

Ank4n commented Aug 10, 2023

Zoomed out view of the issue

The increase in ED (min_balance) decreases the rewardable balance for every pool, the balance that pool thinks has already been rewarded. For some pools where we don't see this error, it would happen if new staking reward (or manual top-up by Ross) covered it before the reward pool was updated post the increase in ED.

In ideal scenario, the ED balance is covered by the creator while creating the pool. The manual top up also does not work if reward counter has already been updated since the ED change resulting in deficit.

Proposed fix

  1. Add a new permissionless extrinsic to top up a pool. This top up would not be rewardable (by incrementing total_rewards_claimed with the top up). This would mean we will not require any logic changes on how last_recorded_total_payouts are updated.

Good thing about this is future increases in ED could be handled gracefully via this without any genuine reward lost by pool members.

  1. Runtime upgrade to increment every pool's 'total_reward_claimed' to the change in ED. This will make every pool cover the deficit with staking rewards.

Bad thing about this: We are essentially claiming part of rewards of the pool without it being distributed to the pool members.
Why it still might be okay? Given that Ross already topped up each pool with new ED (which is seen as reward to the pool as well), incrementing total_reward_claimed does not mean any real loss of staking reward (just potentially loss of Ross rewards :D). Its also a very small amount so hopefully nobody minds but we can do another ED_diff top-up after this runtime upgrade via a governance proposal so there is no unfairness.

  1. Do nothing. The deficit is not a lot to cause any practical problems. We can modify the state checks to tolerate differences up to the current deficit value (very ugly).

@liamaharon
Copy link
Contributor Author

liamaharon commented Aug 22, 2023

hey @Ank4n any progress on this? If we're not sure which approach maybe we can ask in element or the next frame steering

@rossbulat
Copy link

I am in favour of the runtime upgrade fix 👍

@Ank4n
Copy link
Contributor

Ank4n commented Aug 22, 2023

hey @Ank4n any progress on this? If we're not sure which approach maybe we can ask in element or the next frame steering

None of the proposed solution is perfect tbh. I am waiting to discuss this with the Staking team, but great suggestion, I can bring this up in the next frame steering.

@kianenigma
Copy link
Contributor

A runtime upgrade is a one-off fix, and will require us to do more work the next time ED changes. Given this, I am in favor of a permissionelss transaction that can be executed with bots.

Bad thing about this: We are essentially claiming part of rewards of the pool without it being distributed to the pool members.
Why it still might be okay

Note that the ED that is put into any reward account is technically the responsibility of the pool operator to keep in there. So, regardless of the approach, the right flow, assuming you make the permissionless transaction fn fix_pool_reward_account, is this:

  1. If the pool operator can afford the new ED without their account being destroyed, they should pay the deficit.
  2. If not, the origin who called the transaction will pay the deficit.

In principle, this is similar to MinCreateBond changing (#444) as well; If it changes, anyone should be allowed to either force the pool operator to pay the deficit, or pay themselves.

@Ank4n
Copy link
Contributor

Ank4n commented Aug 24, 2023

  1. If the pool operator can afford the new ED without their account being destroyed, they should pay the deficit.

Part of deficit was repaid by the staking rewards (depending on when the pool points changed after ED update). Technically this means even pools which are not showing deficit, should get a top up that comes from the pool operator. It might be difficult to do now since we don't track the original ED when the pool was created. But the real fix should start tracking it for future changes. It should also allow unbonding the difference by the depositor if ED has decreased.

For the existing pools that have some deficit, may be we don't have to get the perfect solution out and just get the current deficit fixed. This could be a dispatchable function doing what try state is doing, calculate reward of each pool member and hence the deficit in the reward account and allow a permissionless topup. Does that make sense?

I agree this is in principal needs similar solution as MinCreateBond.

@Ank4n
Copy link
Contributor

Ank4n commented Aug 24, 2023

Also note that there is another way to fix the issue for future (which I personally like) but it won't fix the deficits of the existing pools. If we never allow last_recorded_total_payouts to decrease (which makes sense, we can never take back what's already paid out), then the current reward counter would not jump abruptly, and any future staking rewards would go on towards first fixing the deficit. I tried this solution here (its commented out but if we put it in code, the deficit gets fixed automatically).

@juangirini juangirini transferred this issue from paritytech/substrate Aug 24, 2023
@the-right-joyce the-right-joyce added T1-FRAME This PR/Issue is related to core FRAME, the framework. and removed T1-runtime labels Aug 25, 2023
@kianenigma
Copy link
Contributor

This could be a dispatchable function doing what try state is doing, calculate reward of each pool member and hence the deficit in the reward account and allow a permissionless topup. Does that make sense?

We certainly cannot iterate over all pool members of a pool in a dispatchable.

@kianenigma
Copy link
Contributor

Also note that there is another way to fix the issue for future (which I personally like) but it won't fix the deficits of the existing pools. If we never allow last_recorded_total_payouts to decrease (which makes sense, we can never take back what's already paid out), then the current reward counter would not jump abruptly, and any future staking rewards would go on towards first fixing the deficit. I tried this solution here (its commented out but if we put it in code, the deficit gets fixed automatically).

This is an interesting fix and one that we can explore. If this is the long term approach, I agree that a one-off migration might be better and then we won't deal with this ever again. If not, I am still of the opinion that a one-off migration is too short sighted.

@github-project-automation github-project-automation bot moved this from In Progress to Done in Runtime / FRAME Sep 14, 2023
taqtiqa-mark pushed a commit to subversive-upstream/polkadot-sdk-substrate-frame-nomination-pools that referenced this issue Sep 14, 2023
Closes paritytech/polkadot-sdk#158

In our last FRAME call it was discussed that a likely solution to the ED
imbalances is lazily fixing the pools as they are interacted with.

So, we should add some tiny tolerance to the try-state checks so next
time there's an ED change they don't start failing until they've all
been interacted with.

### Update 12 Sept

Rather than adding tolerance, have replaced the `ensure` with a warning.

---------

Co-authored-by: Ankan <[email protected]>
@Ank4n
Copy link
Contributor

Ank4n commented Sep 16, 2023

#1236 helps avoid try-state erroring because of the deficit in reward. But the actual fix for topping up the reward deficit as well as having a way in future to manage changes in ED will be taken care in this PR.

@Ank4n Ank4n reopened this Sep 16, 2023
@github-project-automation github-project-automation bot moved this from Done to Backlog in Runtime / FRAME Sep 16, 2023
@github-project-automation github-project-automation bot moved this from Backlog to Done in Runtime / FRAME Sep 22, 2023
@liamaharon liamaharon reopened this Sep 22, 2023
@github-project-automation github-project-automation bot moved this from Done to Backlog in Runtime / FRAME Sep 22, 2023
Ank4n added a commit that referenced this issue Sep 29, 2023
closes #158.
partially addresses
#226.

Instead of fragile calculation of current balance by looking at `free
balance - ED`, Nomination Pool now freezes ED in the pool reward account
to restrict an account from going below minimum balance. This also has a
nice side effect that if ED changes, we know how much is the imbalance
in ED frozen in the pool and the current required ED. A pool operator
can diligently top up the pool with the deficit in ED or vice versa,
withdraw the excess they transferred to the pool.

## Notable changes
- New call `adjust_pool_deposit`: Allows to top up the deficit or
withdraw the excess deposited funds to the pool.
- Uses Fungible trait (instead of Currency trait). Since NP was not
doing any locking/reserving previously, no migration is needed for this.
- One time migration of freezing ED from each of the existing pools (not
very PoV friendly but fine for relay chain).
@github-project-automation github-project-automation bot moved this from Backlog to Done in Runtime / FRAME Sep 29, 2023
bgallois pushed a commit to duniter/duniter-polkadot-sdk that referenced this issue Mar 25, 2024
Closes paritytech#158

In our last FRAME call it was discussed that a likely solution to the ED
imbalances is lazily fixing the pools as they are interacted with.

So, we should add some tiny tolerance to the try-state checks so next
time there's an ED change they don't start failing until they've all
been interacted with.

### Update 12 Sept

Rather than adding tolerance, have replaced the `ensure` with a warning.

---------

Co-authored-by: Ankan <[email protected]>
bgallois pushed a commit to duniter/duniter-polkadot-sdk that referenced this issue Mar 25, 2024
closes paritytech#158.
partially addresses
paritytech#226.

Instead of fragile calculation of current balance by looking at `free
balance - ED`, Nomination Pool now freezes ED in the pool reward account
to restrict an account from going below minimum balance. This also has a
nice side effect that if ED changes, we know how much is the imbalance
in ED frozen in the pool and the current required ED. A pool operator
can diligently top up the pool with the deficit in ED or vice versa,
withdraw the excess they transferred to the pool.

## Notable changes
- New call `adjust_pool_deposit`: Allows to top up the deficit or
withdraw the excess deposited funds to the pool.
- Uses Fungible trait (instead of Currency trait). Since NP was not
doing any locking/reserving previously, no migration is needed for this.
- One time migration of freezing ED from each of the existing pools (not
very PoV friendly but fine for relay chain).
serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 8, 2024
* exchange benchmarks: framework

* updated comment about tx size

Co-authored-by: Tomasz Drwięga <[email protected]>
serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 8, 2024
* exchange benchmarks: framework

* updated comment about tx size

Co-authored-by: Tomasz Drwięga <[email protected]>
serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 9, 2024
* exchange benchmarks: framework

* updated comment about tx size

Co-authored-by: Tomasz Drwięga <[email protected]>
serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 9, 2024
* exchange benchmarks: framework

* updated comment about tx size

Co-authored-by: Tomasz Drwięga <[email protected]>
serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 9, 2024
* exchange benchmarks: framework

* updated comment about tx size

Co-authored-by: Tomasz Drwięga <[email protected]>
serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 9, 2024
* exchange benchmarks: framework

* updated comment about tx size

Co-authored-by: Tomasz Drwięga <[email protected]>
serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 9, 2024
* exchange benchmarks: framework

* updated comment about tx size

Co-authored-by: Tomasz Drwięga <[email protected]>
serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 9, 2024
* exchange benchmarks: framework

* updated comment about tx size

Co-authored-by: Tomasz Drwięga <[email protected]>
serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 10, 2024
* exchange benchmarks: framework

* updated comment about tx size

Co-authored-by: Tomasz Drwięga <[email protected]>
serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 10, 2024
* exchange benchmarks: framework

* updated comment about tx size

Co-authored-by: Tomasz Drwięga <[email protected]>
jonathanudd pushed a commit to jonathanudd/polkadot-sdk that referenced this issue Apr 10, 2024
* exchange benchmarks: framework

* updated comment about tx size

Co-authored-by: Tomasz Drwięga <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T1-FRAME This PR/Issue is related to core FRAME, the framework.
Projects
Status: Done
7 participants