Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Staking] check_payees try-state check failing in Westend #3245

Open
1 task done
gpestana opened this issue Feb 7, 2024 · 4 comments
Open
1 task done

[Staking] check_payees try-state check failing in Westend #3245

gpestana opened this issue Feb 7, 2024 · 4 comments
Assignees
Labels
T2-pallets This PR/Issue is related to a particular pallet. T10-tests This PR/Issue is related to tests.

Comments

@gpestana
Copy link
Contributor

gpestana commented Feb 7, 2024

The check_payees try-state check in Staking is failing in Westend. Figure out what is the reason and fix it.

[2024-02-07T16:13:20Z ERROR runtime::frame-support] ❌ "Staking" try_state checks failed: Other("number of entries in payee storage items does not match the number of bonded ledgers")

Example CI job error: https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/5142158#L2515

Todo before closing:

  • re-set required tag in the check-runtime-migration-westend CI job.
@gpestana gpestana added T2-pallets This PR/Issue is related to a particular pallet. T10-tests This PR/Issue is related to tests. labels Feb 7, 2024
@gpestana gpestana self-assigned this Feb 7, 2024
@gpestana
Copy link
Contributor Author

gpestana commented Feb 7, 2024

It seems that a staking ledger has been removed without clearing up the bonded and payee entry.

[2024-02-07T22:18:11Z INFO  runtime::staking] [19451704] 💸  count Ledger 72560, count Payee: 72561, count Bonded: 72561

The (old) controller account that is faulty is 5CqVcAhUzKbMMwrZiJDSAwXkmoYLpaYBXkobUAp3biVAQoXc

[2024-02-07T22:36:17Z INFO  runtime::staking] [19451704] 💸  controller that does not have a bonded entry: 2228ce54942b2da458b212dc8cb348f59752c28443538ea92324f9b890352611 (5CqVcAhU...)

Timeline:


Notes

  • The fast unstake called into kill_stash(stash) which, although the FastUnstake event has success: Ok, it seems it did not clean the corresponding Bonded<T>,Payee<T> and SlashingSpans<T> entries for that stash.
// for `5HHaaUvCwAb16KkKy7cpMnuzUgLWH94gEvMVXugc69ZEDfkj` stash

staking.slashingSpans: Option<PalletStakingSlashingSlashingSpans>
{
  spanIndex: 2
  lastStart: 5,639
  lastNonzeroSlash: 5,638
  prior: [
    5
  ]
}

staking.bonded: Option<AccountId32>
5CqVcAhUzKbMMwrZiJDSAwXkmoYLpaYBXkobUAp3biVAQoXc

@gpestana
Copy link
Contributor Author

gpestana commented Feb 8, 2024

Solution

Reap the does not work, as it fails with staking.NotController since the staking ledger does not exist anymore in storage. This will require a small migration which I can work on.

  • sudo call to remove the entries in the storage
  • migration to remove the lingering Bonded and Payee entry for stash 5HHaaUvCwAb16KkKy7cpMnuzUgLWH94gEvMVXugc69ZEDfkj

Done. The solution was to set_storage of the staking.ledger(5CqVcAhUzKbMMwrZiJDSAwXkmoYLpaYBXkobUAp3biVAQoXc) with a ledger where the total = 0 and then call reap_stash to clean all the storage items of this ledger.

@gpestana
Copy link
Contributor Author

The try-state is failing again, now with 16 accounts that are faulty. A recent deprecate_controller_batch seems to have been the culprit here. The stashes affected seem to also have bonded in the same timespan as the last stash that has been fixed.

Looking again into this and will check if this may happen to other bonded stashes (also in Kusama and Polkadot).

The solution should be the same as described above, for all the affected stashes.

github-merge-queue bot pushed a commit that referenced this issue Mar 14, 2024
Currently, the staking logic does not prevent a controller from becoming
a stash of *another* ledger (introduced by [removing this
check](https://github.com/paritytech/polkadot-sdk/pull/1484/files#diff-3aa6ceab5aa4e0ab2ed73a7245e0f5b42e0832d8ca5b1ed85d7b2a52fb196524L850)).
Given that the remaining of the code expects that never happens, bonding
a ledger with a stash that is a controller of another ledger may lead to
data inconsistencies and data losses in bonded ledgers. For more
detailed explanation of this issue:
https://hackmd.io/@gpestana/HJoBm2tqo/%2FTPdi28H7Qc2mNUqLSMn15w

In a nutshell, when fetching a ledger with a given controller, we may be
end up getting the wrong ledger which can lead to unexpected ledger
states.

This PR also ensures that `set_controller` does not lead to data
inconsistencies in the staking ledger and bonded storage in the case
when a controller of a stash is a stash of *another* ledger. and
improves the staking `try-runtime` checks to catch potential issues with
the storage preemptively.

In summary, there are two important cases here:

1. **"Sane" double bonded ledger**

When a controller of a ledger is a stash of *another* ledger. In this
case, we have:

```
> Bonded(stash, controller)
(A, B)  // stash A with controller B
(B, C) // B is also a stash of another ledger
(C, D)

> Ledger(controller)
Ledger(B) = L_a (stash = A)
Ledger(C) = L_b (stash = B)
Ledger(D) = L_c (stash = C)
```

In this case, the ledgers can be mutated and all operations are OK.
However, we should not allow `set_controller` to be called if it means
it results in a "corrupt" double bonded ledger (see below).

3. **"Corrupt" double bonded ledger**

```
> Bonded(stash, controller)
(A, B)  // stash A with controller B
(B, B)
(C, D)
```
In this case, B is a stash and controller AND is corrupted, since B is
responsible for 2 ledgers which is not correct and will lead to
inconsistent states. Thus, in this case, in this PR we are preventing
these ledgers from mutating (i.e. operations like bonding extra etc)
until the ledger is brought back to a consistent state.

--- 

**Changes**:
- Checks if stash is already a controller when calling `Call::bond`
(fixes the regression introduced by [removing this
check](https://github.com/paritytech/polkadot-sdk/pull/1484/files#diff-3aa6ceab5aa4e0ab2ed73a7245e0f5b42e0832d8ca5b1ed85d7b2a52fb196524L850));
- Ensures that all fetching ledgers from storage are done through the
`StakingLedger` API;
- Ensures that -- when fetching a ledger from storage using the
`StakingLedger` API --, a `Error::BadState` is returned if the ledger
bonding is in a bad state. This prevents bad ledgers from mutating (e.g.
`bond_extra`, `set_controller`, etc) its state and avoid further data
inconsistencies.
- Prevents stashes which are controllers or another ledger from calling
`set_controller`, since that may lead to a bad state.
- Adds further try-state runtime checks that check if there are ledgers
in a bad state based on their bonded metadata.

Related to #3245

---------

Co-authored-by: Kian Paimani <[email protected]>
Co-authored-by: kianenigma <[email protected]>
@gpestana
Copy link
Contributor Author

gpestana commented Mar 18, 2024

For the record and for future reference, the root issue here is that the current staking logic is not preventing controllers from becoming stashes of different ledgers. This may lead to an account being stash of a ledger and a controller of another ledger. The 2nd order issue is that set_controller is not expecting controllers to be stashes of other ledgers. So ledgers in this state may end up corrupting the ledger data and metadata when calling set_controller. For more details on this issue and backstop patch, refer to #3639.

A patch release (v1.1.3) has been proposed for Kusama and Polkadot to prevent stashes from becoming controllers of other ledgers and backstop the corruption issue. The plan now is to recover the corrupted ledgers across all chains. Once that's done, we can close this issue.

dharjeezy pushed a commit to dharjeezy/polkadot-sdk that referenced this issue Mar 24, 2024
Currently, the staking logic does not prevent a controller from becoming
a stash of *another* ledger (introduced by [removing this
check](https://github.com/paritytech/polkadot-sdk/pull/1484/files#diff-3aa6ceab5aa4e0ab2ed73a7245e0f5b42e0832d8ca5b1ed85d7b2a52fb196524L850)).
Given that the remaining of the code expects that never happens, bonding
a ledger with a stash that is a controller of another ledger may lead to
data inconsistencies and data losses in bonded ledgers. For more
detailed explanation of this issue:
https://hackmd.io/@gpestana/HJoBm2tqo/%2FTPdi28H7Qc2mNUqLSMn15w

In a nutshell, when fetching a ledger with a given controller, we may be
end up getting the wrong ledger which can lead to unexpected ledger
states.

This PR also ensures that `set_controller` does not lead to data
inconsistencies in the staking ledger and bonded storage in the case
when a controller of a stash is a stash of *another* ledger. and
improves the staking `try-runtime` checks to catch potential issues with
the storage preemptively.

In summary, there are two important cases here:

1. **"Sane" double bonded ledger**

When a controller of a ledger is a stash of *another* ledger. In this
case, we have:

```
> Bonded(stash, controller)
(A, B)  // stash A with controller B
(B, C) // B is also a stash of another ledger
(C, D)

> Ledger(controller)
Ledger(B) = L_a (stash = A)
Ledger(C) = L_b (stash = B)
Ledger(D) = L_c (stash = C)
```

In this case, the ledgers can be mutated and all operations are OK.
However, we should not allow `set_controller` to be called if it means
it results in a "corrupt" double bonded ledger (see below).

3. **"Corrupt" double bonded ledger**

```
> Bonded(stash, controller)
(A, B)  // stash A with controller B
(B, B)
(C, D)
```
In this case, B is a stash and controller AND is corrupted, since B is
responsible for 2 ledgers which is not correct and will lead to
inconsistent states. Thus, in this case, in this PR we are preventing
these ledgers from mutating (i.e. operations like bonding extra etc)
until the ledger is brought back to a consistent state.

--- 

**Changes**:
- Checks if stash is already a controller when calling `Call::bond`
(fixes the regression introduced by [removing this
check](https://github.com/paritytech/polkadot-sdk/pull/1484/files#diff-3aa6ceab5aa4e0ab2ed73a7245e0f5b42e0832d8ca5b1ed85d7b2a52fb196524L850));
- Ensures that all fetching ledgers from storage are done through the
`StakingLedger` API;
- Ensures that -- when fetching a ledger from storage using the
`StakingLedger` API --, a `Error::BadState` is returned if the ledger
bonding is in a bad state. This prevents bad ledgers from mutating (e.g.
`bond_extra`, `set_controller`, etc) its state and avoid further data
inconsistencies.
- Prevents stashes which are controllers or another ledger from calling
`set_controller`, since that may lead to a bad state.
- Adds further try-state runtime checks that check if there are ledgers
in a bad state based on their bonded metadata.

Related to paritytech#3245

---------

Co-authored-by: Kian Paimani <[email protected]>
Co-authored-by: kianenigma <[email protected]>
bgallois pushed a commit to duniter/duniter-polkadot-sdk that referenced this issue Mar 25, 2024
Currently, the staking logic does not prevent a controller from becoming
a stash of *another* ledger (introduced by [removing this
check](https://github.com/paritytech/polkadot-sdk/pull/1484/files#diff-3aa6ceab5aa4e0ab2ed73a7245e0f5b42e0832d8ca5b1ed85d7b2a52fb196524L850)).
Given that the remaining of the code expects that never happens, bonding
a ledger with a stash that is a controller of another ledger may lead to
data inconsistencies and data losses in bonded ledgers. For more
detailed explanation of this issue:
https://hackmd.io/@gpestana/HJoBm2tqo/%2FTPdi28H7Qc2mNUqLSMn15w

In a nutshell, when fetching a ledger with a given controller, we may be
end up getting the wrong ledger which can lead to unexpected ledger
states.

This PR also ensures that `set_controller` does not lead to data
inconsistencies in the staking ledger and bonded storage in the case
when a controller of a stash is a stash of *another* ledger. and
improves the staking `try-runtime` checks to catch potential issues with
the storage preemptively.

In summary, there are two important cases here:

1. **"Sane" double bonded ledger**

When a controller of a ledger is a stash of *another* ledger. In this
case, we have:

```
> Bonded(stash, controller)
(A, B)  // stash A with controller B
(B, C) // B is also a stash of another ledger
(C, D)

> Ledger(controller)
Ledger(B) = L_a (stash = A)
Ledger(C) = L_b (stash = B)
Ledger(D) = L_c (stash = C)
```

In this case, the ledgers can be mutated and all operations are OK.
However, we should not allow `set_controller` to be called if it means
it results in a "corrupt" double bonded ledger (see below).

3. **"Corrupt" double bonded ledger**

```
> Bonded(stash, controller)
(A, B)  // stash A with controller B
(B, B)
(C, D)
```
In this case, B is a stash and controller AND is corrupted, since B is
responsible for 2 ledgers which is not correct and will lead to
inconsistent states. Thus, in this case, in this PR we are preventing
these ledgers from mutating (i.e. operations like bonding extra etc)
until the ledger is brought back to a consistent state.

--- 

**Changes**:
- Checks if stash is already a controller when calling `Call::bond`
(fixes the regression introduced by [removing this
check](https://github.com/paritytech/polkadot-sdk/pull/1484/files#diff-3aa6ceab5aa4e0ab2ed73a7245e0f5b42e0832d8ca5b1ed85d7b2a52fb196524L850));
- Ensures that all fetching ledgers from storage are done through the
`StakingLedger` API;
- Ensures that -- when fetching a ledger from storage using the
`StakingLedger` API --, a `Error::BadState` is returned if the ledger
bonding is in a bad state. This prevents bad ledgers from mutating (e.g.
`bond_extra`, `set_controller`, etc) its state and avoid further data
inconsistencies.
- Prevents stashes which are controllers or another ledger from calling
`set_controller`, since that may lead to a bad state.
- Adds further try-state runtime checks that check if there are ledgers
in a bad state based on their bonded metadata.

Related to paritytech#3245

---------

Co-authored-by: Kian Paimani <[email protected]>
Co-authored-by: kianenigma <[email protected]>
gpestana added a commit that referenced this issue Mar 27, 2024
Backport for 1.7: #3639

Relevant Issues:
- #3245
gpestana added a commit that referenced this issue Mar 27, 2024
Backport for 1.7:
- #3639
- #3706

Relevant Issues:
- #3245
github-merge-queue bot pushed a commit that referenced this issue Mar 27, 2024
This PR adds a new extrinsic `Call::restore_ledger ` gated by
`StakingAdmin` origin that restores a corrupted staking ledger. This
extrinsic will be used to recover ledgers that were affected by the
issue discussed in
#3245.

The extrinsic will re-write the storage items associated with a stash
account provided as input parameter. The data used to reset the ledger
can be either i) fetched on-chain or ii) partially/totally set by the
input parameters of the call.

In order to use on-chain data to restore the staking locks, we need a
way to read the current lock in the balances pallet. This PR adds a
`InspectLockableCurrency` trait and implements it in the pallet
balances. An alternative would be to tightly couple staking with the
pallet balances but that's inelegant (an example of how it would look
like in [this
branch](https://github.com/paritytech/polkadot-sdk/tree/gpestana/ledger-badstate-clean_tightly)).

More details on the type of corruptions and corresponding fixes
https://hackmd.io/DLb5jEYWSmmvqXC9ae4yRg?view#/

We verified that the `Call::restore_ledger` does fix all current
corrupted ledgers in Polkadot and Kusama. You can verify it here
https://hackmd.io/v-XNrEoGRpe7APR-EZGhOA.

**Changes introduced**
- Adds `Call::restore_ledger ` extrinsic to recover a corrupted ledger;
- Adds trait `frame_support::traits::currency::InspectLockableCurrency`
to allow external pallets to read current locks given an account and
lock ID;
- Implements the `InspectLockableCurrency` in the pallet-balances.
- Adds staking locks try-runtime checks
(#3751)

**Todo**
- [x] benchmark `Call::restore_ledger`
- [x] throughout testing of all ledger recovering cases
- [x] consider adding the staking locks try-runtime checks to this PR
(#3751)
- [x] simulate restoring all ledgers
(https://hackmd.io/Dsa2tvhISNSs7zcqriTaxQ?view) in Polkadot and Kusama
using chopsticks -- https://hackmd.io/v-XNrEoGRpe7APR-EZGhOA

Related to #3245
Closes #3751

---------

Co-authored-by: command-bot <>
bkchr pushed a commit that referenced this issue Mar 28, 2024
Backports for 1.7: 
- #3639
- #3706

Relevant Issues:
- #3245
dharjeezy pushed a commit to dharjeezy/polkadot-sdk that referenced this issue Apr 9, 2024
This PR adds a new extrinsic `Call::restore_ledger ` gated by
`StakingAdmin` origin that restores a corrupted staking ledger. This
extrinsic will be used to recover ledgers that were affected by the
issue discussed in
paritytech#3245.

The extrinsic will re-write the storage items associated with a stash
account provided as input parameter. The data used to reset the ledger
can be either i) fetched on-chain or ii) partially/totally set by the
input parameters of the call.

In order to use on-chain data to restore the staking locks, we need a
way to read the current lock in the balances pallet. This PR adds a
`InspectLockableCurrency` trait and implements it in the pallet
balances. An alternative would be to tightly couple staking with the
pallet balances but that's inelegant (an example of how it would look
like in [this
branch](https://github.com/paritytech/polkadot-sdk/tree/gpestana/ledger-badstate-clean_tightly)).

More details on the type of corruptions and corresponding fixes
https://hackmd.io/DLb5jEYWSmmvqXC9ae4yRg?view#/

We verified that the `Call::restore_ledger` does fix all current
corrupted ledgers in Polkadot and Kusama. You can verify it here
https://hackmd.io/v-XNrEoGRpe7APR-EZGhOA.

**Changes introduced**
- Adds `Call::restore_ledger ` extrinsic to recover a corrupted ledger;
- Adds trait `frame_support::traits::currency::InspectLockableCurrency`
to allow external pallets to read current locks given an account and
lock ID;
- Implements the `InspectLockableCurrency` in the pallet-balances.
- Adds staking locks try-runtime checks
(paritytech#3751)

**Todo**
- [x] benchmark `Call::restore_ledger`
- [x] throughout testing of all ledger recovering cases
- [x] consider adding the staking locks try-runtime checks to this PR
(paritytech#3751)
- [x] simulate restoring all ledgers
(https://hackmd.io/Dsa2tvhISNSs7zcqriTaxQ?view) in Polkadot and Kusama
using chopsticks -- https://hackmd.io/v-XNrEoGRpe7APR-EZGhOA

Related to paritytech#3245
Closes paritytech#3751

---------

Co-authored-by: command-bot <>
gpestana added a commit that referenced this issue Apr 11, 2024
Currently, the staking logic does not prevent a controller from becoming
a stash of *another* ledger (introduced by [removing this
check](https://github.com/paritytech/polkadot-sdk/pull/1484/files#diff-3aa6ceab5aa4e0ab2ed73a7245e0f5b42e0832d8ca5b1ed85d7b2a52fb196524L850)).
Given that the remaining of the code expects that never happens, bonding
a ledger with a stash that is a controller of another ledger may lead to
data inconsistencies and data losses in bonded ledgers. For more
detailed explanation of this issue:
https://hackmd.io/@gpestana/HJoBm2tqo/%2FTPdi28H7Qc2mNUqLSMn15w

In a nutshell, when fetching a ledger with a given controller, we may be
end up getting the wrong ledger which can lead to unexpected ledger
states.

This PR also ensures that `set_controller` does not lead to data
inconsistencies in the staking ledger and bonded storage in the case
when a controller of a stash is a stash of *another* ledger. and
improves the staking `try-runtime` checks to catch potential issues with
the storage preemptively.

In summary, there are two important cases here:

1. **"Sane" double bonded ledger**

When a controller of a ledger is a stash of *another* ledger. In this
case, we have:

```
> Bonded(stash, controller)
(A, B)  // stash A with controller B
(B, C) // B is also a stash of another ledger
(C, D)

> Ledger(controller)
Ledger(B) = L_a (stash = A)
Ledger(C) = L_b (stash = B)
Ledger(D) = L_c (stash = C)
```

In this case, the ledgers can be mutated and all operations are OK.
However, we should not allow `set_controller` to be called if it means
it results in a "corrupt" double bonded ledger (see below).

3. **"Corrupt" double bonded ledger**

```
> Bonded(stash, controller)
(A, B)  // stash A with controller B
(B, B)
(C, D)
```
In this case, B is a stash and controller AND is corrupted, since B is
responsible for 2 ledgers which is not correct and will lead to
inconsistent states. Thus, in this case, in this PR we are preventing
these ledgers from mutating (i.e. operations like bonding extra etc)
until the ledger is brought back to a consistent state.

---

**Changes**:
- Checks if stash is already a controller when calling `Call::bond`
(fixes the regression introduced by [removing this
check](https://github.com/paritytech/polkadot-sdk/pull/1484/files#diff-3aa6ceab5aa4e0ab2ed73a7245e0f5b42e0832d8ca5b1ed85d7b2a52fb196524L850));
- Ensures that all fetching ledgers from storage are done through the
`StakingLedger` API;
- Ensures that -- when fetching a ledger from storage using the
`StakingLedger` API --, a `Error::BadState` is returned if the ledger
bonding is in a bad state. This prevents bad ledgers from mutating (e.g.
`bond_extra`, `set_controller`, etc) its state and avoid further data
inconsistencies.
- Prevents stashes which are controllers or another ledger from calling
`set_controller`, since that may lead to a bad state.
- Adds further try-state runtime checks that check if there are ledgers
in a bad state based on their bonded metadata.

Related to #3245

---------

Co-authored-by: Kian Paimani <[email protected]>
Co-authored-by: kianenigma <[email protected]>
gpestana added a commit that referenced this issue Apr 11, 2024
Currently, the staking logic does not prevent a controller from becoming
a stash of *another* ledger (introduced by [removing this
check](https://github.com/paritytech/polkadot-sdk/pull/1484/files#diff-3aa6ceab5aa4e0ab2ed73a7245e0f5b42e0832d8ca5b1ed85d7b2a52fb196524L850)).
Given that the remaining of the code expects that never happens, bonding
a ledger with a stash that is a controller of another ledger may lead to
data inconsistencies and data losses in bonded ledgers. For more
detailed explanation of this issue:
https://hackmd.io/@gpestana/HJoBm2tqo/%2FTPdi28H7Qc2mNUqLSMn15w

In a nutshell, when fetching a ledger with a given controller, we may be
end up getting the wrong ledger which can lead to unexpected ledger
states.

This PR also ensures that `set_controller` does not lead to data
inconsistencies in the staking ledger and bonded storage in the case
when a controller of a stash is a stash of *another* ledger. and
improves the staking `try-runtime` checks to catch potential issues with
the storage preemptively.

In summary, there are two important cases here:

1. **"Sane" double bonded ledger**

When a controller of a ledger is a stash of *another* ledger. In this
case, we have:

```
> Bonded(stash, controller)
(A, B)  // stash A with controller B
(B, C) // B is also a stash of another ledger
(C, D)

> Ledger(controller)
Ledger(B) = L_a (stash = A)
Ledger(C) = L_b (stash = B)
Ledger(D) = L_c (stash = C)
```

In this case, the ledgers can be mutated and all operations are OK.
However, we should not allow `set_controller` to be called if it means
it results in a "corrupt" double bonded ledger (see below).

3. **"Corrupt" double bonded ledger**

```
> Bonded(stash, controller)
(A, B)  // stash A with controller B
(B, B)
(C, D)
```
In this case, B is a stash and controller AND is corrupted, since B is
responsible for 2 ledgers which is not correct and will lead to
inconsistent states. Thus, in this case, in this PR we are preventing
these ledgers from mutating (i.e. operations like bonding extra etc)
until the ledger is brought back to a consistent state.

---

**Changes**:
- Checks if stash is already a controller when calling `Call::bond`
(fixes the regression introduced by [removing this
check](https://github.com/paritytech/polkadot-sdk/pull/1484/files#diff-3aa6ceab5aa4e0ab2ed73a7245e0f5b42e0832d8ca5b1ed85d7b2a52fb196524L850));
- Ensures that all fetching ledgers from storage are done through the
`StakingLedger` API;
- Ensures that -- when fetching a ledger from storage using the
`StakingLedger` API --, a `Error::BadState` is returned if the ledger
bonding is in a bad state. This prevents bad ledgers from mutating (e.g.
`bond_extra`, `set_controller`, etc) its state and avoid further data
inconsistencies.
- Prevents stashes which are controllers or another ledger from calling
`set_controller`, since that may lead to a bad state.
- Adds further try-state runtime checks that check if there are ledgers
in a bad state based on their bonded metadata.

Related to #3245

---------

Co-authored-by: Kian Paimani <[email protected]>
Co-authored-by: kianenigma <[email protected]>
gpestana added a commit that referenced this issue Apr 11, 2024
Currently, the staking logic does not prevent a controller from becoming
a stash of *another* ledger (introduced by [removing this
check](https://github.com/paritytech/polkadot-sdk/pull/1484/files#diff-3aa6ceab5aa4e0ab2ed73a7245e0f5b42e0832d8ca5b1ed85d7b2a52fb196524L850)).
Given that the remaining of the code expects that never happens, bonding
a ledger with a stash that is a controller of another ledger may lead to
data inconsistencies and data losses in bonded ledgers. For more
detailed explanation of this issue:
https://hackmd.io/@gpestana/HJoBm2tqo/%2FTPdi28H7Qc2mNUqLSMn15w

In a nutshell, when fetching a ledger with a given controller, we may be
end up getting the wrong ledger which can lead to unexpected ledger
states.

This PR also ensures that `set_controller` does not lead to data
inconsistencies in the staking ledger and bonded storage in the case
when a controller of a stash is a stash of *another* ledger. and
improves the staking `try-runtime` checks to catch potential issues with
the storage preemptively.

In summary, there are two important cases here:

1. **"Sane" double bonded ledger**

When a controller of a ledger is a stash of *another* ledger. In this
case, we have:

```
> Bonded(stash, controller)
(A, B)  // stash A with controller B
(B, C) // B is also a stash of another ledger
(C, D)

> Ledger(controller)
Ledger(B) = L_a (stash = A)
Ledger(C) = L_b (stash = B)
Ledger(D) = L_c (stash = C)
```

In this case, the ledgers can be mutated and all operations are OK.
However, we should not allow `set_controller` to be called if it means
it results in a "corrupt" double bonded ledger (see below).

3. **"Corrupt" double bonded ledger**

```
> Bonded(stash, controller)
(A, B)  // stash A with controller B
(B, B)
(C, D)
```
In this case, B is a stash and controller AND is corrupted, since B is
responsible for 2 ledgers which is not correct and will lead to
inconsistent states. Thus, in this case, in this PR we are preventing
these ledgers from mutating (i.e. operations like bonding extra etc)
until the ledger is brought back to a consistent state.

---

**Changes**:
- Checks if stash is already a controller when calling `Call::bond`
(fixes the regression introduced by [removing this
check](https://github.com/paritytech/polkadot-sdk/pull/1484/files#diff-3aa6ceab5aa4e0ab2ed73a7245e0f5b42e0832d8ca5b1ed85d7b2a52fb196524L850));
- Ensures that all fetching ledgers from storage are done through the
`StakingLedger` API;
- Ensures that -- when fetching a ledger from storage using the
`StakingLedger` API --, a `Error::BadState` is returned if the ledger
bonding is in a bad state. This prevents bad ledgers from mutating (e.g.
`bond_extra`, `set_controller`, etc) its state and avoid further data
inconsistencies.
- Prevents stashes which are controllers or another ledger from calling
`set_controller`, since that may lead to a bad state.
- Adds further try-state runtime checks that check if there are ledgers
in a bad state based on their bonded metadata.

Related to #3245

---------

Co-authored-by: Kian Paimani <[email protected]>
Co-authored-by: kianenigma <[email protected]>
gpestana added a commit that referenced this issue Apr 11, 2024
Currently, the staking logic does not prevent a controller from becoming
a stash of *another* ledger (introduced by [removing this
check](https://github.com/paritytech/polkadot-sdk/pull/1484/files#diff-3aa6ceab5aa4e0ab2ed73a7245e0f5b42e0832d8ca5b1ed85d7b2a52fb196524L850)).
Given that the remaining of the code expects that never happens, bonding
a ledger with a stash that is a controller of another ledger may lead to
data inconsistencies and data losses in bonded ledgers. For more
detailed explanation of this issue:
https://hackmd.io/@gpestana/HJoBm2tqo/%2FTPdi28H7Qc2mNUqLSMn15w

In a nutshell, when fetching a ledger with a given controller, we may be
end up getting the wrong ledger which can lead to unexpected ledger
states.

This PR also ensures that `set_controller` does not lead to data
inconsistencies in the staking ledger and bonded storage in the case
when a controller of a stash is a stash of *another* ledger. and
improves the staking `try-runtime` checks to catch potential issues with
the storage preemptively.

In summary, there are two important cases here:

1. **"Sane" double bonded ledger**

When a controller of a ledger is a stash of *another* ledger. In this
case, we have:

```
> Bonded(stash, controller)
(A, B)  // stash A with controller B
(B, C) // B is also a stash of another ledger
(C, D)

> Ledger(controller)
Ledger(B) = L_a (stash = A)
Ledger(C) = L_b (stash = B)
Ledger(D) = L_c (stash = C)
```

In this case, the ledgers can be mutated and all operations are OK.
However, we should not allow `set_controller` to be called if it means
it results in a "corrupt" double bonded ledger (see below).

3. **"Corrupt" double bonded ledger**

```
> Bonded(stash, controller)
(A, B)  // stash A with controller B
(B, B)
(C, D)
```
In this case, B is a stash and controller AND is corrupted, since B is
responsible for 2 ledgers which is not correct and will lead to
inconsistent states. Thus, in this case, in this PR we are preventing
these ledgers from mutating (i.e. operations like bonding extra etc)
until the ledger is brought back to a consistent state.

--- 

**Changes**:
- Checks if stash is already a controller when calling `Call::bond`
(fixes the regression introduced by [removing this
check](https://github.com/paritytech/polkadot-sdk/pull/1484/files#diff-3aa6ceab5aa4e0ab2ed73a7245e0f5b42e0832d8ca5b1ed85d7b2a52fb196524L850));
- Ensures that all fetching ledgers from storage are done through the
`StakingLedger` API;
- Ensures that -- when fetching a ledger from storage using the
`StakingLedger` API --, a `Error::BadState` is returned if the ledger
bonding is in a bad state. This prevents bad ledgers from mutating (e.g.
`bond_extra`, `set_controller`, etc) its state and avoid further data
inconsistencies.
- Prevents stashes which are controllers or another ledger from calling
`set_controller`, since that may lead to a bad state.
- Adds further try-state runtime checks that check if there are ledgers
in a bad state based on their bonded metadata.

Related to #3245

---------

Co-authored-by: Kian Paimani <[email protected]>
Co-authored-by: kianenigma <[email protected]>
gpestana added a commit that referenced this issue Apr 18, 2024
This backport PR should bump the `pallet-staking` from 30.0.1 to 30.0.2.

Backports for 1.8: 
- #3639

Relevant Issues:
- #3245

---------

Co-authored-by: Kian Paimani <[email protected]>
Co-authored-by: kianenigma <[email protected]>
Morganamilo pushed a commit that referenced this issue Apr 22, 2024
This backport PR should bump the `pallet-staking` from 28.0.0 to 28.0.1.

Backports for 1.6: 
- #3639

Relevant Issues:
- #3245

---------

Co-authored-by: Kian Paimani <[email protected]>
Co-authored-by: kianenigma <[email protected]>
Morganamilo pushed a commit that referenced this issue Apr 22, 2024
This backport PR should bump the `pallet-staking` from 27.0.0 to 27.0.1

Backports for 1.5: 
- #3639

Relevant Issues:
- #3245

Co-authored-by: Kian Paimani <[email protected]>
Co-authored-by: kianenigma <[email protected]>
Morganamilo pushed a commit that referenced this issue Apr 22, 2024
This backport PR should bump the `pallet-staking` from 26.0.1 to 26.0.2.

Backports for 1.4: 
- #3639

Relevant Issues:
- #3245

Co-authored-by: Kian Paimani <[email protected]>
Co-authored-by: kianenigma <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T2-pallets This PR/Issue is related to a particular pallet. T10-tests This PR/Issue is related to tests.
Projects
Status: ✂️ In progress.
Development

No branches or pull requests

1 participant