Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FR] stake-snapshot of all pools #4570

Closed
17 of 22 tasks
Tracked by #4301
CarlosLopezDeLara opened this issue Oct 27, 2022 · 9 comments
Closed
17 of 22 tasks
Tracked by #4301

[FR] stake-snapshot of all pools #4570

CarlosLopezDeLara opened this issue Oct 27, 2022 · 9 comments
Assignees
Labels
comp: cardano-cli type: enhancement An improvement on the existing functionality type: internal feature Non user-facing functionality user type: internal Created by an IOG employee

Comments

@CarlosLopezDeLara
Copy link
Contributor

CarlosLopezDeLara commented Oct 27, 2022

What

Add the ability to query for multiple stake pools, no stake pools or all stake pools in the query stake-pool command.

Previously it was only possible to query for one specific stake pool:

$ cardano-cli query stake-snapshot --cardano-mode --mainnet --stake-pool-id fc79a939d5986c6565fc5e21bb5dd292261ac5da564c30385b3bb8d4
{
    "activeStakeGo": 24453012636354912,
    "activeStakeMark": 24525857119151547,
    "activeStakeSet": 24463445965412728,
    "poolStakeGo": 8595693051,
    "poolStakeMark": 8595693051,
    "poolStakeSet": 8595693051
 }

The change introduces a pools object to accomodate more than one stake pool in the result like this and thus is a breaking change:

$ cardano-cli query stake-snapshot --cardano-mode --mainnet --stake-pool-id fc79a939d5986c6565fc5e21bb5dd292261ac5da564c30385b3bb8d4
{
    "pools": {
        "fc79a939d5986c6565fc5e21bb5dd292261ac5da564c30385b3bb8d4": {
            "stakeGo": 8595693051,
            "stakeMark": 8595693051,
            "stakeSet": 8595693051
        }
    },
    "total": {
        "stakeGo": 24453012636354912,
        "stakeMark": 24525857119151547,
        "stakeSet": 24463445965412728
    }
}

Then querying for multiple stack pools can be done by adding more --stake-pool-id switches, for example:

$ cardano-cli query stake-snapshot --cardano-mode --mainnet --stake-pool-id fc79a939d5986c6565fc5e21bb5dd292261ac5da564c30385b3bb8d4 --stake-pool-id 0dedea6dd97bfd6b0d3ddf8e42375266ade6ec6a9c1aceb8efeecdab
{
    "pools": {
        "0dedea6dd97bfd6b0d3ddf8e42375266ade6ec6a9c1aceb8efeecdab": {
            "stakeGo": 547234244,
            "stakeMark": 547234244,
            "stakeSet": 547234244
        },
        "fc79a939d5986c6565fc5e21bb5dd292261ac5da564c30385b3bb8d4": {
            "stakeGo": 8595693051,
            "stakeMark": 8595693051,
            "stakeSet": 8595693051
        }
    },
    "total": {
        "stakeGo": 24453012636354912,
        "stakeMark": 24525857119151547,
        "stakeSet": 24463445965412728
    }
}

Alternatively, no --stake-pool-id options can be supplied in which case only the active stake is returned:

$ cardano-cli query stake-snapshot --cardano-mode --mainnet
{
    "total": {
        "stakeGo": 24453012636354912,
        "stakeMark": 24525857119151547,
        "stakeSet": 24463445965412728
    }
}

It will also be possible to query for all stake pools by supplying --all-stake-pools flag:

$ cardano-cli query stake-snapshot --cardano-mode --mainnet --all-pools
{
    "pools": {
        "0dedea6dd97bfd6b0d3ddf8e42375266ade6ec6a9c1aceb8efeecdab": {
            "stakeGo": 547234244,
            "stakeMark": 547234244,
            "stakeSet": 547234244
        },
        ... ... ...
        "fc79a939d5986c6565fc5e21bb5dd292261ac5da564c30385b3bb8d4": {
            "stakeGo": 8595693051,
            "stakeMark": 8595693051,
            "stakeSet": 8595693051
        }
    },
    "total": {
        "stakeGo": 24453012636354912,
        "stakeMark": 24525857119151547,
        "stakeSet": 24463445965412728
    }
}

Why

Mithril needs the "set" stake distribution, ie. the one that's valid for the current epoch, to sign it as part of its chain of trust. There's an existing stake distribution command but:

  1. it returns the "go" stake distribution which is updated on the fly with each block/tx
  2. it returns a double value (numerator denominator) instead of end result.
    We are currently using the stake-pool command which returns the exact s for each SP, but unfortunately this means we have to run the command for each SP which of course does not scale very well.

Personas - Who will this affect?

  • SPO
  • Dapp Devs --> mithril
  • Exchanges

Acceptance Criteria

  • Command does not download the entire ledger state
  • Output 3 last snapshots aka mark, set, go
  • Output the EXACT stake in lovelace for each pool (not a percentage or truncated value)
  • Output is JSON to stdout
  • --out-file in JSON
  • Meaningful help message
  • the command does not affect the running node performance
  • the command does not kill the machine/processes by consuming all the memory

Definition of done

  • Code review
  • Builds successfully on CI
  • Includes documentation and/or examples for the functionality
  • Log/record changes on Vnext (or similar depending on what we adopt)
  • Ticket number included in PR description
  • The new functionality is covered by dev/unit/property tests
  • Acceptance Criteria met
  • Test Engineering Sign-off + E2E automated tests

Sign-off

  • Product Owner
  • Dev Owner
  • System Test Owner

Related PRs

@CarlosLopezDeLara CarlosLopezDeLara added user type: internal Created by an IOG employee comp: cardano-cli type: enhancement An improvement on the existing functionality labels Oct 27, 2022
@ghost
Copy link

ghost commented Oct 31, 2022

Basic acceptance criteria:

Given ledger is at block height H 
And stake distribution for mark, set and go is M, S, and G respectively
When querying `--all-stake-pools` distribution 
Then user should retrieve a JSON object mapping stake pool ids to stake value for mark, set, go of this SP expressed in lovelaces
And the total stake

With --out-file F:

Then user should have a file F containing the same output

From an end-user perspective we don't really care if it downloads the ledger state or not :) But it's critically important the output contains the exact stake for each pool and not a percentage or truncated value.

@jpraynaud WDYT?

@jpraynaud
Copy link

Absolutely, we need stake values in lovelaces for all pool ids, and for mark, set and go snapshots.

The cardano-cli query stake-snapshot --stake-pool-id **POOL_ID** command currently outputs:

{
    "poolStakeGo": 1000000,
    "poolStakeSet": 2000000,
    "poolStakeMark": 3000000,
    "activeStakeGo": 5000000,
    "activeStakeSet": 7000000,
    "activeStakeMark": 9000000
}

Maybe the cardano-cli query stake-snapshot --all-stake-pools command could return something like:

{
    "pools":{
        "poolid1": {
            "stakeGo": 1000000,
            "stakeSet": 2000000,
            "stakeMark": 3000000,
        },
        "poolid2": {
            "stakeGo": 6000000,
            "stakeSet": 7000000,
            "stakeMark": 8000000,
        },
        ...
    },
    "total":{
        "stakeGo": 50000000,
        "stakeSet": 70000000,
        "stakeMark": 90000000
    }
}

or

{
    "stakeGo":{
        "poolid1": 1000000,
        "poolid2": 6000000,
        ...
    },
    "stakeSet":{
        "poolid1": 2000000,
        "poolid2": 7000000,
        ...
    },
    "stakeMark":{
        "poolid1": 3000000,
        "poolid2": 8000000,
        ...
    },
    "total":{
        "stakeGo": 50000000,
        "stakeSet": 70000000,
        "stakeMark": 90000000
    }
}

Note: Mithril is currently using the mark value as it is the most recent stake value that does not evolve during the current epoch. We were previously working with the cardano-cli query stake-distribution command that makes computation from the stakes evolving during the current epoch: this prevented us from retrieving deterministic results from different nodes.

@CarlosLopezDeLara
Copy link
Contributor Author

From an end-user perspective we don't really care if it downloads the ledger state or not :) But it's critically important the output contains the exact stake for each pool and not a percentage or truncated value.

@abailly-iohk Using the ledger state might be irrelevant to the end user. But I guess performance is not. This will not give a good performance (can even OOM your machine) if this depends on getting the ledger state, thus I'd like to avoid it an make it an acceptance criteria

@ghost
Copy link

ghost commented Nov 3, 2022

Not too sure about the performance criterion (acceptable latency) but of course it should not crash on mainnet, even if it scales to several 1000s of SPOs. The truth is we all know this information is trivial to obtain from the ledger because it's maintained as part of the LedgerState. Interestingly, if this was stored on disk into a known format then we could directly query it without having to bother the nodes with requesting this information.
So there's definitely a lot of ways this feature could be implemented. One could even imagine that it's not the business of the nodes to answer these queries and require some other service/tool to do that job.

@CarlosLopezDeLara CarlosLopezDeLara moved this from 📋 Backlog (needs grooming) to 🔖 Sprint backlog in Node CLI/API 2022 Nov 4, 2022
@dorin100
Copy link

dorin100 commented Nov 7, 2022

Related to Acceptance Criteria, I am looking to see some values around the Performance of this new cli command, like:

  • what is the max memory that can be used?
  • what is the max time that can take to have an output?
  • if it takes a longer time to generate the output (maybe minutes!), is it ok to block the terminal or maybe we should display a message?

One clear requirement I would make would be to make sure the machine does not OOM and that the node that is running on the same machine is not affected.

@jpraynaud
Copy link

Regarding the performance, we could probably use the cardano-cli query stake-distribution command as a reference which does almost the same job, but at a different epoch, only for this epoch and that does not provide the total stakes in the computed results.

Assuming that computations are done sequentially for the go, set and mark snapshots and that the size of the computed stake distribution itself is almost negligible:

  • Approximately 3 times slower
  • Approximately same memory usage

@github-actions
Copy link

github-actions bot commented Dec 9, 2022

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 120 days.

@ghost
Copy link

ghost commented Jan 15, 2023

Here's what the current situation is and a measure of the pain this feature will alleviate:

Screenshot 2023-01-15 at 10 18 22

@CarlosLopezDeLara
Copy link
Contributor Author

Implemented by #4570

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp: cardano-cli type: enhancement An improvement on the existing functionality type: internal feature Non user-facing functionality user type: internal Created by an IOG employee
Projects
None yet
Development

No branches or pull requests

4 participants