Multi-Asset Coin Selection #2450

Merged: 16 commits, Jan 20, 2021

jonathanknowles (Member) commented Jan 14, 2021

Issue Number

ADP-605

Overview

This PR implements the Random-Round-Robin coin selection algorithm for multi-asset UTxO sets.

The Random-Round-Robin algorithm is inspired by the Random-Improve algorithm, but with some differences to accommodate UTxO sets with multiple assets.

Top-Level Algorithm Description

The Random-Round-Robin algorithm considers the sum of all outputs collectively, selecting inputs to cover the total sum of all asset quantities, rather than running a separate selection for each output. It therefore drops the cardinality restriction of the original Random-Improve algorithm implementation.

Steps

  1. When selecting inputs, consider tokens in round-robin fashion, selecting one input per token before moving to the next token, to reduce the chance of over-selecting for any particular token.
  2. For each token quantity under consideration, select enough inputs to cover at least 100% of that quantity, but aim to get as close to a target of 200% as possible. When we can make no further improvement for a given token, we eliminate that token from the round-robin selection process. An improvement is defined as an additional selection that takes the total selected token quantity closer to 200% of the output token quantity, but not further away.
  3. The round-robin selection phase terminates when we can make no further improvement for any token in the set under consideration.
  4. After the selection phase is over, divide any excess token quantities (inputs − outputs) into change bundles, where:
    • there is exactly one change bundle for each output.
    • the quantity of a given token in a change bundle is proportional to the quantity of that token in the corresponding output (modulo rounding).
    • the total quantity of a given token across all change bundles is equal to the total excess quantity of that token.
  5. Redistribute additionally-selected tokens not present in the original outputs to the change bundles, where:
    • if there are fewer quantities for a given token than the number of change bundles, include these quantities without changing them.
    • if there are more quantities for a given token than the number of change bundles, repeatedly coalesce the smallest pair of quantities together until the total number of quantities is exactly equal to the number of change bundles.
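
Steps 1 to 3 above can be sketched as a generic round-robin loop plus an improvement test. This is only an illustration under stated assumptions, not the PR's implementation: the names `runRoundRobin`, `isImprovement`, `selectA` and `selectB` are hypothetical, and the toy selectors add fixed quantities instead of drawing random UTxO entries.

```haskell
import Numeric.Natural (Natural)

-- | Hypothetical helper: offers the state to each selector in turn.
-- A selector that returns 'Nothing' can make no further improvement and is
-- dropped from the rotation. The loop ends when no selectors remain, so it
-- terminates only if every selector eventually returns 'Nothing'.
runRoundRobin :: s -> [s -> Maybe s] -> s
runRoundRobin initial selectors = go initial selectors []
  where
    go s [] [] = s
    go s [] kept = go s (reverse kept) []
    go s (f : fs) kept = case f s of
        Nothing -> go s fs kept
        Just s' -> go s' fs (f : kept)

-- | True if moving the selected total from 'current' to 'proposed' brings
-- it strictly closer to 200% of the 'required' output quantity.
isImprovement :: Natural -> Natural -> Natural -> Bool
isImprovement required current proposed =
    distance proposed < distance current
  where
    target = 2 * required
    distance x
        | x >= target = x - target
        | otherwise = target - x

-- Toy selectors over running totals for two tokens (A, B).
-- Assumption: token A needs 5 and each pick adds 3; token B needs 4 and
-- each pick adds 4. A pick is made while the total is below 100% of the
-- requirement, and afterwards only if it is an improvement.
selectA, selectB :: (Natural, Natural) -> Maybe (Natural, Natural)
selectA (a, b)
    | a < 5 || isImprovement 5 a (a + 3) = Just (a + 3, b)
    | otherwise = Nothing
selectB (a, b)
    | b < 4 || isImprovement 4 b (b + 4) = Just (a, b + 4)
    | otherwise = Nothing
```

With these toy selectors, `runRoundRobin (0, 0) [selectA, selectB]` evaluates to `(9, 8)`: token B stops at exactly 200% of its requirement, and token A stops at 9 because a further pick would move it from a distance of 1 to a distance of 2 from its target of 10.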

jonathanknowles self-assigned this Jan 14, 2021
KtorZ self-requested a review January 14, 2021 14:59
jonathanknowles force-pushed the jonathanknowles/multi-asset-coin-selection branch 5 times, most recently from 03b950f to 0879623 January 15, 2021 05:31

round :: (RealFrac a, Integral b) => RoundingDirection -> a -> b
round = \case
    RoundUp -> ceiling
    RoundDown -> floor

Member

To be frank, I am very doubtful about this module 😬 ... I do think ceiling and floor are so common across all programming languages that there's no possible ambiguity about what those functions do.

Member Author

> To be frank, I am very doubtful about this module 😬 ... I do think ceiling and floor are so common across all programming languages that there's no possible ambiguity about what those functions do.

Good point. These functions could easily be moved into the Util module, as they're currently only used by the partitionNatural function.

Member Author

@KtorZ Fixed in this change.

Member

Note that my main concern wasn't so much why this is in a separate module, but why it has to be defined at all :). I think that any place that uses round RoundUp or round RoundDown could simply use ceiling and floor. The latter option is actually clearer and more idiomatic. There's no need to build an abstraction over it IMO.

Member Author

jonathanknowles Jan 15, 2021

> I think that any place that uses round RoundUp or round RoundDown could simply use ceiling and floor.

Very true.

However, my original goal here was simply to make this code more self-documenting: (taken from partitionNatural)

roundings :: NonEmpty RoundingDirection
roundings =
    applyN shortfall (NE.cons RoundUp) (NE.repeat RoundDown)
  where
    shortfall
        = fromIntegral target
        - fromIntegral @Integer
            (F.sum $ round RoundDown <$> portionsUnrounded)

IMO, the purpose of roundings is much clearer and more self-documenting with the RoundingDirection type. It's a non-empty list of rounding directions (either up or down).

One alternative would be to write something like this:

roundings :: (RealFrac a, Integral b) => NonEmpty (a -> b)

But this makes the intent much less obvious IMO, and would probably require an additional comment to explain what's going on.
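
To make the snippet above easier to follow in isolation, here is a self-contained sketch of the pieces it relies on. The definition of `applyN` is not shown in this conversation, so the one below is an assumption (apply a function `n` times); `roundWith` and `roundingsFor` are hypothetical names chosen to avoid clashing with the Prelude.

```haskell
import Data.List.NonEmpty (NonEmpty)
import qualified Data.List.NonEmpty as NE

data RoundingDirection = RoundUp | RoundDown
    deriving (Eq, Show)

-- | Rounds in the given direction (same behaviour as the 'round' function
-- in the diff above, renamed to avoid shadowing 'Prelude.round').
roundWith :: (RealFrac a, Integral b) => RoundingDirection -> a -> b
roundWith RoundUp = ceiling
roundWith RoundDown = floor

-- | Assumed definition: applies the function 'f' exactly 'n' times.
applyN :: Int -> (a -> a) -> a -> a
applyN n f = foldr (.) id (replicate n f)

-- | An infinite non-empty list that starts with 'shortfall' copies of
-- 'RoundUp' and continues with 'RoundDown' forever.
roundingsFor :: Int -> NonEmpty RoundingDirection
roundingsFor shortfall =
    applyN shortfall (NE.cons RoundUp) (NE.repeat RoundDown)
```

For example, `NE.take 4 (roundingsFor 2)` is `[RoundUp, RoundUp, RoundDown, RoundDown]`, which shows how the shortfall decides how many portions are rounded up.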

    => NonEmpty m
    -- ^ Source list
    -> NonEmpty a
    -- ^ Target list

Member

Does this really have to be a list? It seems that we only really want a length here. The examples make it even more awkward, for they all use replicate n () 😅

Member Author

> Does this really have to be a list?

It definitely doesn't have to be a list!

However, accepting a non-empty list makes it nice to re-use from functions that already have a non-empty list of the target size.

For example:

changeForSurplusAssets calls makeChangeForSurplusAssets with the already-existing outputBundles, which calls partitionTokenQuantity, which eventually calls partitionNatural. So we end up coalescing (or padding) the surplus asset quantities into exactly n chunks, where n is the number of output bundles.

As a bonus, using NonEmpty here allows us to make this function total, as the minimum number of elements in a non-empty list is always one.

If we accepted an Int (or something else), then presumably we'd need to choose from one of the following non-ideal options:

  • make the function return Maybe if the target length is < 1 (in which case the caller would need to deal with the awkward Nothing value that they know they shouldn't receive).
  • make the function throw an error if the target length is < 1.
  • do something else, like taking max 1 targetLength. But if the caller supplies a targetLength less than 1, that presumably represents a programming error that we'd want to know about.
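
The trade-off in the list above can be illustrated with a toy function that splits a total into near-equal chunks. Both functions below are hypothetical and are not part of the PR; they only contrast an `Int`-based signature, which has to reject counts below one, with a `NonEmpty`-based signature, which is total by construction.

```haskell
import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.List.NonEmpty as NE

-- | Count-based variant: the caller must handle 'Nothing', even when they
-- know the count is valid. Assumes a non-negative total.
splitEvenlyInt :: Integer -> Int -> Maybe [Integer]
splitEvenlyInt total n
    | n < 1 = Nothing
    | otherwise = Just (replicate r (q + 1) ++ replicate (n - r) q)
  where
    q = total `div` toInteger n
    r = fromInteger (total `mod` toInteger n)

-- | NonEmpty-based variant: the target list has at least one element, so
-- there is always a valid result and no failure case to report.
splitEvenlyNE :: Integer -> NonEmpty a -> NonEmpty Integer
splitEvenlyNE total target =
    NE.zipWith (\i _ -> if i < r then q + 1 else q) (0 :| [1 ..]) target
  where
    (q, r) = total `divMod` toInteger (NE.length target)
```

Here `splitEvenlyInt 7 0` has to return `Nothing`, whereas no call to `splitEvenlyNE` can express an empty target in the first place.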

Member

> As a bonus, using NonEmpty here allows us to make this function total, as the minimum number of elements in a non-empty list is always one.

Fair enough. That's a good argument.

-- 3. The size of each element in the resulting list is within unity of the
-- ideal proportion.
--
partitionNatural

Member

To my understanding, partition has the sense of division. It feels a bit weird to me in this context, wouldn't distribute better capture the behavior 🤔 ?

Member Author

> To my understanding, partition has the sense of division. It feels a bit weird to me in this context, wouldn't distribute better capture the behavior 🤔 ?

In this case, the term "partition" is taken from number theory:

https://en.wikipedia.org/wiki/Partition_(number_theory)

Quote:

In number theory and combinatorics, a partition of a positive integer n, also called an integer partition, is a way of writing n as a sum of positive integers.

Admittedly, in our particular case, we are computing a special kind of partition, where the elements of the partition are proportional to the sizes of the specified weights (modulo rounding).

The term "distribute" would probably also work here, but IMO this name doesn't capture (as nicely) the property that the total sum is preserved (whereas the term "partition" implies this, if we assume the definition above).

jonathanknowles force-pushed the jonathanknowles/multi-asset-coin-selection branch 6 times, most recently from 766999e to 78dc983 January 15, 2021 15:13
jonathanknowles added the ADDING FEATURE label (Mark a PR as adding a new feature, for auto-generated CHANGELOG) Jan 18, 2021
jonathanknowles force-pushed the jonathanknowles/multi-asset-coin-selection branch 2 times, most recently from c59cc78 to b2ddc5a January 18, 2021 06:15
jonathanknowles force-pushed the jonathanknowles/multi-asset-coin-selection branch 2 times, most recently from 25989d9 to ceb777a January 20, 2021 09:07

mkResult SelectionState {selected, leftover}
    | not (balanceRequired `leq` balanceSelected) =
        Left $ SelectionInsufficient $ SelectionInsufficientError
            {inputsSelected = UTxOIndex.toList selected, balanceRequired}

Member

I like this approach. We could have callers of performSelection possibly retry on SelectionInsufficient errors. If we're close to the required balance, the selection may still succeed with a different selection of inputs.

Member Author

> We could have callers of performSelection possibly retry on SelectionInsufficient errors. If we're close to the required balance, the selection may still succeed with a different selection of inputs.

I think that's a great idea. 👍🏻
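
The retry idea discussed here could look roughly like the following. This is a sketch of the suggestion only, not something this PR adds: `retrySelection` is a hypothetical helper, and the attempt number passed to each run stands in for whatever source of randomness a real caller would vary between attempts.

```haskell
import Data.Functor.Identity (Identity (..))

-- | Runs 'attempt' with attempt numbers 0, 1, 2, ... and returns the first
-- success, or the last error once 'maxAttempts' attempts have failed.
-- Assumes maxAttempts >= 1.
retrySelection
    :: Monad m
    => Int
    -> (Int -> m (Either e a))
    -> m (Either e a)
retrySelection maxAttempts attempt = go 0
  where
    go n = do
        result <- attempt n
        case result of
            Left _ | n + 1 < maxAttempts -> go (n + 1)
            _ -> pure result

-- | Toy usage: the first two attempts fail, the third succeeds.
example :: Either String Int
example = runIdentity (retrySelection 3 attempt)
  where
    attempt n = Identity (if n < 2 then Left "insufficient" else Right n)
```

A real caller would probably also want to cap the total work done, since each retry repeats a full selection run.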

This function partitions a natural number into a number of parts, where
the size of each part is proportional to the size of its corresponding
element in the given list of weights, and the number of parts is equal
to the number of weights.
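
The behaviour described above (presumably that of `partitionNatural`, discussed earlier in the review) can be sketched as follows. This is an illustration under assumptions, not the PR's code: it uses plain lists instead of `NonEmpty`, the name `partitionNaturalSketch` is hypothetical, and rounding up the portions with the largest fractional parts is one reasonable way to keep every part within unity of its ideal proportion.

```haskell
import Data.List (sortOn)
import Data.Ord (Down (..))
import Data.Ratio ((%))
import Numeric.Natural (Natural)

-- | Splits 'target' into one part per weight, proportional to the weights.
-- Returns 'Nothing' if the weights sum to zero.
partitionNaturalSketch :: Natural -> [Natural] -> Maybe [Natural]
partitionNaturalSketch target weights
    | totalWeight == 0 = Nothing
    | otherwise = Just (map snd (sortOn fst rounded))
  where
    totalWeight = sum weights

    -- Ideal (unrounded) portion for each weight, tagged with its position.
    ideal :: [(Int, Rational)]
    ideal =
        [ (i, toInteger target * toInteger w % toInteger totalWeight)
        | (i, w) <- zip [0 ..] weights
        ]

    -- How many portions must be rounded up for the sum to equal 'target'.
    shortfall :: Int
    shortfall =
        fromIntegral (toInteger target - sum (map (floor . snd) ideal))

    -- Portions with the largest fractional parts come first.
    byFraction = sortOn (\(_, p) -> Down (p - fromInteger (floor p))) ideal

    -- Round the first 'shortfall' portions up and the rest down.
    rounded =
        [ (i, fromInteger (if k < shortfall then ceiling p else floor p))
        | (k, (i, p)) <- zip [0 ..] byFraction
        ]
```

For example, `partitionNaturalSketch 10 [1, 1, 1]` gives `Just [4, 3, 3]` and `partitionNaturalSketch 7 [1, 2]` gives `Just [2, 5]`; in both cases the parts sum to the original number.
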

This function adjusts the source list so that its length is the same as
the target list, either by padding the list, or by coalescing a subset
of the elements, while preserving the total sum.
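
A minimal sketch of this pad-or-coalesce behaviour, specialised to natural numbers, is shown below. The name `padCoalesceSketch` is hypothetical, padding with zero is an assumption, and merging the two smallest values follows step 5 of the PR description. The result is returned in ascending order, which may differ from the PR's function.

```haskell
import Data.List (insert, sort)
import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.List.NonEmpty as NE
import Numeric.Natural (Natural)

-- | Adjusts 'source' to have the same length as 'target' while preserving
-- the total sum. Only the length of 'target' is used.
padCoalesceSketch :: NonEmpty Natural -> NonEmpty a -> NonEmpty Natural
padCoalesceSketch source target =
    -- 'go' always returns exactly 'targetLength' (>= 1) elements, so
    -- 'NE.fromList' cannot fail here.
    NE.fromList (go (sort (NE.toList source)))
  where
    targetLength = NE.length target
    go xs
        -- Too short: pad with zero, which keeps the list sorted.
        | length xs < targetLength = go (0 : xs)
        -- Too long: merge the two smallest values and re-insert in order.
        | length xs > targetLength = case xs of
            (a : b : rest) -> go (insert (a + b) rest)
            _ -> xs
        | otherwise = xs
```

For example, coalescing `1 :| [2, 3, 4]` to the length of a two-element target gives `4 :| [6]`, and padding `5 :| []` to the length of a three-element target gives `0 :| [0, 5]`.
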

This change adds the following convenience functions:

- `fromTokenMap`
- `fromTokenQuantity`

Function `genTxInLargeRange` generates transaction inputs chosen from a
large range, to minimize the possibility of collisions.

Function `genUTxOIndexLarge` generates large UTxO indices.

This top-level function performs a complete coin selection and generates
change bundles in one step.

This change forks `genTokenBundleSmallRange` into two variants:

  - `genTokenBundleSmallRange`
     Generates token bundles where the ada quantity may be zero.

  - `genTokenBundleSmallRangePositive`
     Generates token bundles where the ada quantity is always non-zero.

This is necessary, as some QC properties require token bundles with ada
quantities of zero.

But coin selection QC properties typically require token bundles (within
transaction outputs) to have non-zero ada quantities.

This change also forks the associated shrinker function
`shrinkTokenBundleSmallRange` in a similar fashion.

In real life, we'll always see transaction outputs with non-zero ada
quantities.

The coin selection algorithm also expects this.

It's important to be able to protect the wallet from computations that
are excessively costly.

This change introduces the `SelectionLimit` type, which makes it possible to
limit the scope of a coin selection computation by specifying an upper limit
on the number of inputs that can be selected.

With this change, the coin selection algorithm will terminate when the limit
is reached, and not make any further selections.

If, during a selection run, the limit is reached, there are two main cases:

  - If the current selection already satisfies the minimum required
    balance, `performSelection` will return the current selection.

  - If the current selection does not satisfy the minimum required
    balance, `performSelection` will return a `SelectionInsufficient`
    error.
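
The two cases above can be illustrated with a simplified, single-asset sketch. Everything here is an assumption made for illustration: the constructor names of `SelectionLimitSketch`, the `Outcome` type and the `selectWithLimit` function are hypothetical, inputs are taken in list order rather than at random, and the improvement phase is omitted.

```haskell
import Numeric.Natural (Natural)

-- | Hypothetical stand-in for the 'SelectionLimit' type.
data SelectionLimitSketch
    = NoLimit
    | MaximumInputLimit Int
    deriving (Eq, Show)

-- | Either the selected inputs, or the inputs selected before giving up.
data Outcome
    = Selected [Natural]
    | Insufficient [Natural]
    deriving (Eq, Show)

-- | Selects inputs in order until the required balance is covered, the
-- limit is reached, or no inputs remain.
selectWithLimit :: SelectionLimitSketch -> Natural -> [Natural] -> Outcome
selectWithLimit limit required = go []
  where
    go selected available
        -- Balance satisfied: return the current selection.
        | sum selected >= required = Selected (reverse selected)
        -- Limit reached without satisfying the balance: report an error.
        | limitReached selected = Insufficient (reverse selected)
        | otherwise = case available of
            [] -> Insufficient (reverse selected)
            (x : xs) -> go (x : selected) xs
    limitReached selected = case limit of
        NoLimit -> False
        MaximumInputLimit n -> length selected >= n
```

With a limit of two inputs, `selectWithLimit (MaximumInputLimit 2) 10 [4, 4, 4]` yields `Insufficient [4, 4]`, while the same call with `NoLimit` yields `Selected [4, 4, 4]`.
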

jonathanknowles force-pushed the jonathanknowles/multi-asset-coin-selection branch from ceb777a to ce9ca1e January 20, 2021 11:10

@jonathanknowles
Member Author

bors r+

iohk-bors bot added a commit that referenced this pull request Jan 20, 2021
2450: Multi-Asset Coin Selection r=jonathanknowles a=jonathanknowles

@iohk-bors
Contributor

iohk-bors bot commented Jan 20, 2021

Build failed:

Failures:

  src/Test/Integration/Scenario/API/Shelley/StakePools.hs:576:5: 
  1) API Specifications, SHELLEY_STAKE_POOLS, STAKE_POOLS_JOIN_04 - Rewards accumulate
       uncaught exception: RequestException
       DecodeFailure "{\"code\":\"network_query_failed\",\"message\":\"Unable to query the ledger at the moment. This error has been logged. Trying again in a bit might work.\"}"

  To rerun use: --match "/API Specifications/SHELLEY_STAKE_POOLS/STAKE_POOLS_JOIN_04 - Rewards accumulate/"

Randomized with seed 1981313050

#2320

@jonathanknowles
Member Author

bors r+

iohk-bors bot added a commit that referenced this pull request Jan 20, 2021
2450: Multi-Asset Coin Selection r=jonathanknowles a=jonathanknowles

@iohk-bors
Contributor

iohk-bors bot commented Jan 20, 2021

Build failed:


Failures:

  src/Test/Integration/Scenario/API/Shelley/StakePools.hs:836:9: 
  1) API Specifications, SHELLEY_STAKE_POOLS, STAKE_POOLS_JOIN_01x - Fee boundary values, STAKE_POOLS_JOIN_01x - I cannot join if I have not enough fee to cover
       uncaught exception: RequestException
       DecodeFailure "{\"code\":\"network_query_failed\",\"message\":\"Unable to query the ledger at the moment. This error has been logged. Trying again in a bit might work.\"}"

  To rerun use: --match "/API Specifications/SHELLEY_STAKE_POOLS/STAKE_POOLS_JOIN_01x - Fee boundary values/STAKE_POOLS_JOIN_01x - I cannot join if I have not enough fee to cover/"

Randomized with seed 1318638626

#2320

@jonathanknowles
Member Author

bors r+

@iohk-bors
Contributor

iohk-bors bot commented Jan 20, 2021

Build succeeded:

iohk-bors bot merged commit dc3ed13 into master Jan 20, 2021
iohk-bors bot deleted the jonathanknowles/multi-asset-coin-selection branch January 20, 2021 14:04
Labels
ADDING FEATURE Mark a PR as adding a new feature, for auto-generated CHANGELOG