Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CAD-4157: Stash AVVM to allow on-disk UTxO. #2728

Merged
merged 2 commits into from
Apr 20, 2022
Merged

CAD-4157: Stash AVVM to allow on-disk UTxO. #2728

merged 2 commits into from
Apr 20, 2022

Conversation

nc6
Copy link
Contributor

@nc6 nc6 commented Apr 13, 2022

This commit provides ledger support for the transition of UTxO to
on-disk storage. In particular, the translation between Shelley and
Allegra is currently problematic, owing to the need to remove AVVM
addresses (dormant since Byron) from the UTxO. This requires a linear
scan of the UTxO, which is now stored on disk and hence cannot/should
not be accessed in such a way.

The solution to this is a bit of a hack; since the Byron UTxO are not
stored on disk, and since AVVM addresses remain untouched during the
Shelley era, we instead perform the scan at the Byron/Shelley boundary,
store the set of UTxO in the ledger state, only to yield them to the
consensus layer at the Shelley/Allegra boundary. Thus provided with the
set of addresses, consensus passes them back as the UTxO, allowing the
translation function to delete them.

This change is written such that, with no on-disk UTxO, the ledger
functionality remains identical, other than a slight bump in memory
usage during the Shelley era.

As far as possible, this commit minimises ledger changes. Thus, a
pattern is introduced replicating the existing NewEpochState
constructor. Unfortunately, there are two knock-on effects related to
weaknesses in pattern synonyms:

  • A couple of patterns matching on NewEpochState now seem to be
    failable. Since they cannot fail, we instead make them irrefutable.
  • Record fields associated with pattern synonyms do not play nicely with
    NamedFieldPuns. As such, we replace a couple of puns with explicit
    bindings.

A single additional function shelleyToAllegraAVVMsToDelete is exported
for the use of the consensus layer. Otherwise no API changes are made.

This does entail a change in the ledger state serialisation format.

Copy link
Contributor

@nfrisby nfrisby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm Requesting Changes, because I think I noticed and commented on a bug.

@nc6 nc6 force-pushed the nc/avvm-hd branch 2 times, most recently from f96c73d to 03dbbd0 Compare April 13, 2022 14:01
Copy link
Collaborator

@lehins lehins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not too excited about another pattern synonym, but it does get the job done.

Other than that there are couple minor suggestions.

This commit provides ledger support for the transition of UTxO to
on-disk storage. In particular, the translation between Shelley and
Allegra is currently problematic, owing to the need to remove AVVM
addresses (dormant since Byron) from the UTxO. This requires a linear
scan of the UTxO, which is now stored on disk and hence cannot/should
not be accessed in such a way.

The solution to this is a bit of a hack; since the Byron UTxO are not
stored on disk, and since AVVM addresses remain untouched during the
Shelley era, we instead perform the scan at the Byron/Shelley boundary,
store the set of UTxO in the ledger state, only to yield them to the
consensus layer at the Shelley/Allegra boundary. Thus provided with the
set of addresses, consensus passes them back as the UTxO, allowing the
translation function to delete them.

This change is written such that, with no on-disk UTxO, the ledger
functionality remains identical, other than a slight bump in memory
usage during the Shelley era.

As far as possible, this commit minimises ledger changes. Thus, a
pattern is introduced replicating the existing `NewEpochState`
constructor. Unfortunately, there are two knock-on effects related to
weaknesses in pattern synonyms:

- A couple of patterns matching on `NewEpochState` now seem to be
  failable. Since they cannot fail, we instead make them irrefutable.
- Record fields associated with pattern synonyms do not play nicely with
  `NamedFieldPuns`. As such, we replace a couple of puns with explicit
  bindings.

A single additional function `shelleyToAllegraAVVMsToDelete` is exported
for the use of the consensus layer. Otherwise no API changes are made.

This does entail a change in the ledger state serialisation format.
@nc6
Copy link
Contributor Author

nc6 commented Apr 14, 2022

So, I like both of these things in theory, but having tried to put them into practice they both cause sufficient enough disruption that I don't want to introduce them.

  • Having a closed type family is really nice, but it introduces the issue that there can be no default constructed value. We have a couple of places where this is a problem:

    • In the initial rules, which we still use for NEWEPOCH, and
    • In the bidirectional pattern, meaning that choosing the closed type family implies dropping the bidirectional pattern, as below.
  • Dropping the bidirectional pattern forces us to either have the record syntax for pattern matching or updates, but not both. This to me is too high a price to pay. I'm not overly concerned by the calibre of the footgun here - the worst result a mistake here can cause is that we fall over when validating the Shelley/Allegra boundary, something we are forced to catch well before release.

I think the only palatable alternative would be Alexey's original suggestion, to drop the pattern altogether and force everyone to deal with the new field. For discussion, I'll push that as a separate commit.

@nc6 nc6 force-pushed the nc/avvm-hd branch 2 times, most recently from e0fae79 to 8de6cee Compare April 14, 2022 11:03
This also drops the use of a pattern synonym to hide the
StashedAVVMAddresses from most uses.
@JaredCorduan
Copy link
Contributor

@nc6 I think we should do something to account for the fact that we've just created a Seventh Pot of ADA. I bet that the property tests did not notice this since we never generate AVVM addresses in testing.

I can think of two reasonable options to fix totalAdaPotsES.

https://github.com/input-output-hk/cardano-ledger/blob/1db68a3ec0a2dcb5751004beb22b906162474f23/eras/shelley/impl/src/Cardano/Ledger/Shelley/API/Wallet.hs#L561-L589

One option is to add a seventh pot. The other option is to include stashedAVVMAddresses in the UTxO pot.

We might also want to add some prose about this in the errata in the Shelley spec, specifically in the section about this particular wart.

@nc6
Copy link
Contributor Author

nc6 commented Apr 19, 2022

@JaredCorduan I'm afraid I don't understand your comments (or fear that they are based on a misconception of the work here). Whilst we did discuss a "seventh pot" approach, that is not what is implemented here. Consider in particular https://github.com/input-output-hk/cardano-ledger/pull/2728/files#diff-b403883c359c9c3f41b927c964599024c8c6f3a10cf70027835ec3eb44b1a4f6L100 - you will see that the UTxO is not modified on the Byron/Shelley boundary. The only thing that happens is that a new field - stashedAVVMAddresses is populated. This field is never read or modified until the Shelley/Allegra translation, where it is deleted. From a ledger semantics perspective, this PR is a complete no-op - we put something somewhere, never look at it, and then delete it. UTxO, reserves etc remain entirely as they were before.

It is only by the action of consensus that this PR has any effect, and even then, it is simply to ensure the fundamental property of UTxO HD - that the ledger should behave identically on the full UTxO and on the view of the UTxO presented by the consensus layer. In this case, by ensuring that on the Shelley/Allegra boundary, the UTxO contains the AVVM addresses such that they can be deleted.

Ideally, we could emphasize this by only storing the TxId in stashedAVVMAddresses, but unfortunately the consensus layer (currently) cannot perform lookup of the UTxO when doing translation, so this isn't possible.

Copy link
Contributor

@JaredCorduan JaredCorduan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Contributor

@nfrisby nfrisby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks right to me! Thank much.

The PR description is out of date: eg I don't see any new pattern synonym in this latest diff.

I also made a couple minor comments below.

) =>
ToCBOR (NewEpochState era)
where
toCBOR (NewEpochState e bp bc es ru pd) =
encodeListLen 6 <> toCBOR e <> toCBOR bp <> toCBOR bc <> toCBOR es
toCBOR (NewEpochState e bp bc es ru pd av) =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is going to be released before UTxO HD, then we don't need to make the serialization backwards incompatible. We could just deserialize a ListLen 6 as having an empty map, and a hypothetical non-yet-UTxO-HD Consensus layer downstream wouldn't notice.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we could only change the serialization for eras for which this is not (). That way today's users would never have to reapply from the genesis block. (My goal is just to avoid an unnecessary long replay for users, but that's ultimately for a different decision maker to balance.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered that, but there are already serialisation changes in the ledger which will be visible in the next release, so there's not much point.

@@ -311,7 +313,8 @@ exampleNewEpochState value ppp pp =
nesBcur = BlocksMade (Map.singleton (mkKeyHash 2) 3),
nesEs = epochState,
nesRu = SJust rewardUpdate,
nesPd = examplePoolDistr
nesPd = examplePoolDistr,
stashedAVVMAddresses = def
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worthwhile to make this non-empty for the Shelley era?

@nfrisby
Copy link
Contributor

nfrisby commented Apr 19, 2022

I think the only palatable alternative would be Alexey's original suggestion, to drop the pattern altogether and force everyone to deal with the new field. For discussion, I'll push that as a separate commit.

Looks like this is what you all went with? It's what I see in the diff.

@nc6 nc6 dismissed nfrisby’s stale review April 20, 2022 06:02

Changes made

@nc6 nc6 merged commit 2d68cef into master Apr 20, 2022
@iohk-bors iohk-bors bot deleted the nc/avvm-hd branch April 20, 2022 06:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants