Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CIP-0078? | Extended Local Chain Sync Protocol #375
CIP-0078? | Extended Local Chain Sync Protocol #375
Changes from 2 commits
c39e55c
b6d040f
46e774a
284b018
056a1f9
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's little use-case for the UTxO state or the protocol parameters because these are straightforward to obtain from the chain-sync protocol itself.
However, anything that regards rewards is indeed a pain in the *** to work with; because rewards are implicit in the protocol. I'd love to see a quick justification here (instead of "basically" 😅 ...) stating just that. There's no way for a client application to provide reliable information on rewards without maintaining the entire ledger state; for one needs to know what everyone owns in order to calculate rewards snapshots on each era.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am pretty sure that the protocol parameters are not available on chain, only the voting for the changes. Yes, the ledger rules can be applied to whatever voting is seen on chain but that is way more fragile than getting the parameters from the source.
Agree, that nobody in their right mind would want the UTxO state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the protocol parameters are available on chain,
NewEpochState
->EpochState
->PParams
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be really nice to have a function to pull the protocol parameters out of ledger state so that I did not have to go digging around inside it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're in luck, I have one right here:
nesEs . esPp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would call that "digging into the internals".
I am asking for an officially maintained function that is part of an official API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The lack of a quality api is indeed an issue, but a much bigger issues imo, is the need to replicate resources outside the node, especially for ledger state data. An api cannot help there, this requires improvements on the protocol level. Api and protocol are two different topics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative solution that won't solve the duplicate use of RAM but would solve the duplicate use of code would be to pack consensus + ledger into a "Slim node" that works in read-only mode with an existing ChainDB, that would be packaged and built along side the node.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then db-sync or any client could use that "ghost node" to replay blocks from any point in time, generate events, query ledger state and what not. I am even surprised such a tool does not already exist: Is this not somewhat akin to what db-analyser do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't this idea be extended even more to solve the duplicate RAM problem? What if db-sync maintained its own ChainDB, connected to peers itself, ran ouroboros etc? That way it doesn't need a separate trusted node process, there is no replication, no need for protocol extension, deployment is greatly simplified etc. At the same time this adds no complexity to the node that @dcoutts understandably wants to avoid, as this complexity will live in the db-sync repository.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, this is an idea that popped into my head recently: Make the full node's logic available as a library instead of only as an executable. Then db-sync could just embed the node's logic, and provided there's some way to plug "decorators" here and there, could just tap on node processing stuff, or poke at the current ledger state when it needs, etc...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another alternative -- and straightforward -- solution would be to also simply "dump" information required to the calculation of implicit stuff (such as rewards) into some file as the node processes them. And either, document the format of that file or provide a thin api-layer to query its content.
Having just that would solve many problems down the line; client applications will almost always be connected to a local node and that node will have to do this work anyway. So instead of duplicating that work; simply record the information when its done and gives way to access that information later on. It could be a mere .csv or .json file recording the stake distribution on each epoch transition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, ok, so you would not need to run a node on a machine running
db-sync
at all. Not sure how this compares to mu proposal in terms of amount of work or complexity.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably (much) more work so I'd not consider this as an alternative for this particular CIP
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Funny, in the last company I worked for, everything was written as a library first (makes testing much easier) and then thin wrapper was added to make it into an executable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this similar to what Santiago from TxPipe proposed a while back called Dolos, a Data Node?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know, I wasn't there a while back ;) But I guess this idea is pretty much obvious and not novel so I am pretty sure other people have thought of iti.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has always been said to be "impossible" because the ledger does not record past events and wouldn't be able to replay any of those events that are too far in the past.
Wouldn't it make sense to start by recording the events somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is effectively part of this proposal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this "Scientia program" about?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was a proposal put forward by John Woods during his temporary tenure at IOG.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@KtorZ The output document isn't public, but it was a user research report of what chain indexers exist, which and why people use, pros, cons etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it isn't public, I don't see why it is mentioned here 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we specify that the user may wish to request one of the three (labeled
mark
,set
,go
) snap shots? Is the live, up-to-date distribution ever needed (though note that this one changes block by block)?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think yes. It is very useful to request only 1 of mark, set, go. @papacarp at pooltool needs
mark
. At dripdropz, I need just thego
snapshot.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
db-sync extracts slices of ~2000 per block from the
set
snashot starting from the epoch boundary. The size may be bigger based on the expected number of blocks in the epoch.https://github.com/input-output-hk/cardano-db-sync/blob/master/cardano-db-sync/src/Cardano/DbSync/Era/Shelley/Generic/StakeDist.hs#L44
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the sake of simpilicity, I really think this should be push only from the
node
to the consumer..The live up-to-date distribution is not currently needed by
db-sync
.As for which one (and making sure I have this correct,
mark
is the stake distributions that will be used incurrent_epoch + 2
,set
incurrent_epoch + 1
andset
in thecurrent_epoch
). My understanding is thatset
is not guaranteed stable in the first part of an epoch, but I would suggest this is the one we want, but only as soon as it is stable.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need both mark and set in mithril for example. Different use cases will require different sets so providing all of them would just make things simpler because AFAIK they are maintained together in the Ledger?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slight correction here:
mark = current_epoch + 1 (this is next epochs stake distribution)
set = current_epoch (this is the stake distribution being used CURRENTLY to make blocks)
go = current_epoch - 1 (this is used for reward calculations)
mark snapshot would not be guaranteed until after the stability window I presume but it would be good to pull whenever the consumer wants as each application will have different accuracy requirements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, @papacarp, about half the time I look at the
mark|set|go
terminology I get it wrong :/.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we specify an event on the epoch boundary that lists the stake credentials which appeared in the reward calculation, but were de-registered when the reward were given out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure on this one. @erikd would know if db-sync needs it in that format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this exists already. It's called RestrainedRewards
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, @erikd and @kderme depend on the ledger events, and indeed they do use an event to sort this problem out: https://github.com/input-output-hk/cardano-ledger/blob/fd806798eb68e904435720dbcea47def27c1a0fc/eras/shelley/impl/src/Cardano/Ledger/Shelley/Rules/NewEpoch.hs#L264-L267
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, there currently is an event like this and we need to keep it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A possible implementation first step would be to write a prototype funnelling existing events (or only one of them for simplicity's sake because I assume there would be need to implement serialisation) through the consensus code down to the network stack, in order to properly understand the impact of this change.