Realistic Fee Calculation (Byron) #176

paweljakubas · 2019-04-24T09:44:07Z

Issue Number

Overview

I have generated a number of representative unit test cases
I have added the Cardano fee estimation implementation that works with the current tx support
I have added a property test that checks the fee estimation against the size counted directly after encoding

Comments

The idea behind the fee estimation is the following:
(a) Inputs always have the same size because they consist of a hash and an index
(b) Outputs have variable sizes, but these can easily be determined from their value (<24 one byte, <256 2 bytes, <65536 3 bytes, etc.)
(c) Tx witnesses, one per input, are also fixed-size
(d) There is a fixed number of CBOR bytes (connected with building lists)

Hence, we can estimate the size of a transaction quite precisely
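As a rough illustration of point (b), the size of a CBOR-encoded unsigned integer depends only on the range the value falls in. The sketch below is not the code from this PR; the name `sizeOfCborUInt` is hypothetical, and the thresholds follow the standard CBOR encoding of unsigned integers.

```haskell
import Data.Word (Word64)

-- | Number of bytes CBOR uses for an unsigned integer (major type 0).
-- Hypothetical helper, named here for illustration only.
sizeOfCborUInt :: Word64 -> Int
sizeOfCborUInt n
    | n < 24         = 1 -- value fits into the initial byte
    | n < 256        = 2 -- initial byte + 1 byte
    | n < 65536      = 3 -- initial byte + 2 bytes
    | n < 4294967296 = 5 -- initial byte + 4 bytes
    | otherwise      = 9 -- initial byte + 8 bytes
```

With a helper like this, the variable part of an output can be sized from its coin value alone, without running the encoder.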

KtorZ

Hmmm... I think this isn't going in the right direction and there is some confusion about what the fee estimation function is supposed to do.

KtorZ · 2019-04-24T10:52:49Z

src/Cardano/Wallet/CoinSelection/Fee.hs

+--         the transaction attributes @Attributes ()@ are both hard-coded
+estimateCardanoFee
+    :: TxSizeLinear
+    -> (Tx, [TxWitness])


This signature is weird. We really want to estimate the fee of a CoinSelection instead, because we don't yet have a transaction or witnesses at this stage.

now CoinSelection is used

KtorZ · 2019-04-24T10:54:09Z

src/Cardano/Wallet/CoinSelection/Fee.hs

+    :: TxSizeLinear
+    -> (Tx, [TxWitness])
+    -> Fee
+estimateCardanoFee (TxSizeLinear a b) txWithWitness@(Tx inps _, _) =


Why aren't we using the outputs of the transaction here?

KtorZ · 2019-04-24T12:10:26Z

cardano-wallet.cabal

@@ -111,6 +111,7 @@ test-suite unit
    , bytestring
    , cardano-crypto
    , cardano-wallet
+    , cassava


If anything, I'd suggest using JSON as a (semi-)structured data format for storing data. We can cut out an extra dependency on a CSV parser this way 👍

KtorZ · 2019-04-24T12:11:37Z

src/Cardano/Wallet/CoinSelection/Fee.hs

+-- | A linear equation on the transaction size. Represents the @\s -> a + b*s@
+-- function where @s@ is the transaction size in bytes, @a@ and @b@ are
+-- constant coefficients.
+data TxSizeLinear = TxSizeLinear Double Double


To make things slightly more explicit:

Double -> Quantity "byte" Double
Double -> Quantity "lovelace/byte" Double

will do it in the next commit

KtorZ · 2019-04-24T12:13:44Z

src/Cardano/Wallet/CoinSelection/Fee.hs

+        -- `NetworkMainOrStage` as []; this should require a 5 Byte increase in
+        --`boundAddrAttrSize`. Because encoding in unit tests is not guaranteed
+        -- to be efficient, it was decided to increase by 7 Bytes to mitigate
+        -- against potential random test failures in the future.


I'd rather see a more concise comment explaining where the sizes come from:

-- - 34 => ?
-- - 7 => 5 (network magic) + 2 ?

made more concise comments everywhere

KtorZ · 2019-04-24T12:16:31Z

src/Cardano/Wallet/CoinSelection/Fee.hs

+    where
+        totalPayload :: Double
+        totalPayload = fromIntegral
+            $ boundAddrAttrSize + boundTxAttrSize + boundSignatureWitnessSize + payloadFromTxWithWitness + 4*(length inps - 1)


Where does this 4* come from? Also, we should probably multiply the witness size by the number of witnesses (i.e. inputs)...

KtorZ · 2019-04-24T12:17:52Z

test/data/Cardano/Wallet/fees

+[42000,42000,42000,42000,42000,42000,42000,42000,42000,42000,42000,42000,42000,42000,42000]|[100000,100000,100000]|302073
+[42000,42000,42000,42000,42000,42000,42000,42000,42000]|[100000,1000,100]|253294
+[42000,42000,42000,42000,42000,42000,42000,42000,42000]|[100000,10000,1]|253032
+[75000,75000,75000,60000]|[100,1,1]|212381


I think it's a bit overkill here :/ What's the value of that many cases compared to a few well-crafted unit tests + properties ?

I produced it in a quasi-random way just to see if the fee is as I expected when investigating the old way of estimation. Will move representative examples from this file to test/data/Cardano/Wallet/fees1

paweljakubas · 2019-04-25T09:04:29Z

src/Cardano/Wallet/CoinSelection/Fee.hs

+        sizeOfTxIns :: Int
+        sizeOfTxIns =
+            let n = length inps
+            in sizeOfListLen + n*sizeOfTxIn + (n-1)*sizeOfListBreak + (n-1)*2


(n-1)*2 is added artificially here - I cannot explain it at the moment. Just added in order to obtain the same results as with cardano-sl

paweljakubas · 2019-04-25T09:04:58Z

src/Cardano/Wallet/CoinSelection/Fee.hs

+        totalPayload =
+            let n = length outs
+            in fromIntegral $
+               sizeOfListLen + sizeOfListBreak + sizeOfTx + sizeOfTxAttr + n*(sizeOfAddrAttr + sizeOfSignatureWitness) - (n-1)*6


(n-1)*6 is added artificially here - I cannot explain it at the moment. Just added in order to obtain the same results as with cardano-sl

paweljakubas · 2019-04-26T04:38:39Z

Reimplemented estimation (based on extensive private talk with @KtorZ). Now, we do not try to be exactly like cardano-sl, but estimate the fee based on the current CBOR byte size of (Tx, [TxWitness]). Added numerous unit tests that exemplify boundary transition cases, and also one property test.

KtorZ

I'd suggest to add the following comment somewhere, maybe at each relevant function which details the size of the tx encoding:

With `n` the number of inputs

signed tx ----------------------------------- 9 + n * 43 + n * 139 + Σ sizeOf(output)
          | list len 2              -- 1 
          | sizeOf(tx)              -- 6 + (n * 43) + sizeOf(outs)
          | list len n              -- 1-2
          | n * sizeOf(witness)     -- n * 139

tx        ---------------------------------- 6 + (n * 43) + Σ sizeOf(output)
          | list len 3                  -- 1 
          | begin                       -- 1 
          | n * sizeOf(input)           -- n * 43
          | break                       -- 1 
          | begin                       -- 1
          | Σ sizeOf(output)            --  Σ sizeOf(output)
          | break                       -- 1
          | empty attributes            -- 1 

input     ---------------------------------- 43
          | list len 2                  -- 1 
          | word8                       -- 1 
          | tag 24                      -- 2 
          | bytes ------------------------ 2 + 37 
          |   | list len 2  -- 1 
          |   | bytes       -- 2 + 32 
          |   | word32      -- 1-2 

output    ---------------------------------- 48-55 (mainnet) & 56-63 (testnet)
          | list len 2                  -- 1
          | sizeOf(address)             -- 46-54
          | word64                      -- 1-8

address   ---------------------------------- 46 (mainnet) & 54 (testnet)
          | list len 2                  -- 1
          | tag 24                      -- 2
          | bytes          --------------- 2 + 37-45
          |   | list len 3 -- 1
          |   | bytes      -- 2 + 32     
          |   | attributes -- 1-8       
          |   | word8      -- 1
          | word32                      -- 1-4 

witness   ---------------------------------- 139
          | list len 2                  -- 1
          | word8                       -- 1
          | tag 24                      -- 2
          | bytes          --------------- 2 + 133 
          |   | list len 2 -- 1
          |   | bytes      -- 2+64
          |   | bytes      -- 2+64
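
The diagram above can be turned into a small size calculator. This is only a sketch under the assumption that the byte counts in the diagram are right; the constants are copied from it and not checked against the real encoder, and all function names are hypothetical. Output sizes are passed in as a list because they vary with the address and coin value.

```haskell
-- | Bytes of a CBOR definite-length list header, assuming at most 255
-- elements (as discussed in this review).
sizeOfListLen :: Int -> Int
sizeOfListLen n
    | n <= 23   = 1
    | otherwise = 2

-- Constants copied from the diagram (assumptions, not measured values).
sizeOfInput, sizeOfWitness :: Int
sizeOfInput   = 43
sizeOfWitness = 139

-- | tx: list len 3 + begin + inputs + break + begin + outputs + break
-- + empty attributes, i.e. 6 bytes of overhead.
sizeOfTx :: Int -> [Int] -> Int
sizeOfTx n outSizes = 6 + n * sizeOfInput + sum outSizes

-- | signed tx: list len 2 + tx + witness list header + one witness per input.
sizeOfSignedTx :: Int -> [Int] -> Int
sizeOfSignedTx n outSizes =
    1 + sizeOfTx n outSizes + sizeOfListLen n + n * sizeOfWitness
```

For example, `sizeOfSignedTx 1 [48]` sums 1 + (6 + 43 + 48) + 1 + 139 bytes.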

KtorZ · 2019-04-26T07:47:59Z

src/Cardano/Wallet/CoinSelection/Fee.hs

+        -- see Cardano.Wallet.Binary (encodeTxOut)
+        sizeOfTxOut :: Word64 -> Int
+        sizeOfTxOut =
+            (+77) . fromIntegral . BL.length . toLazyByteString . encodeWord64


Why +77 here? (I mean, I know why, but an uninformed reader won't, so this looks like a magic number.) Also, I believe you based your calculation here on random addresses, which are bigger and have a bigger payload. Sequential addresses are smaller to encode, so should also result in smaller fees (cf. the diagram in my last review comment)

(encodeWord64 -> nice trade-off here between encoding everything and avoiding re-implementing too much of the CBOR logic 👍)

added comments clarifying the stuff

KtorZ · 2019-04-26T07:49:21Z

src/Cardano/Wallet/CoinSelection/Fee.hs

+        -- The size of TxWitness is always the same as it contains data constructor and hash
+        -- see Cardano.Wallet.Binary (encodeTxWitness)
+        sizeOfTxWitness :: Int
+        sizeOfTxWitness = 139


Why 139 here ?

added comments clarifying the stuff

KtorZ · 2019-04-26T07:50:44Z

src/Cardano/Wallet/CoinSelection/Fee.hs

+        sizeOfTxCbor :: Int
+        sizeOfTxCbor = 6
+
+        -- The size of [[TxIn], [TxWitness]]


The comment seems off compared to what the body actually computes ^.^

KtorZ · 2019-04-26T07:51:41Z

src/Cardano/Wallet/CoinSelection/Fee.hs

+        sizeOfTx =
+            let n = length inps
+                coins = map (getCoin . coin) outs
+            in sizeOfTxCbor + n*sizeOfTxIn + sum (map sizeOfTxOut coins)


Naming suggestion: sizeOfTxCbor --> cborOverhead. Also, make it part of the let clause, or make every constituent explicit in the sum.

KtorZ · 2019-04-26T07:53:42Z

src/Cardano/Wallet/CoinSelection/Fee.hs

+        -- The size of TxIn is always the same as it contains the hash and index
+        -- see Cardano.Wallet.Binary (encodeTxIn)
+        sizeOfTxIn :: Int
+        sizeOfTxIn = 42


Would be worth writing it as a + b where a == sizeOf(txin) && b == sizeOf(index), just to make the constituents explicit.

Actually, it would be nice to even decompose the CBOR stuff here:

list len (1 byte)
word8 (1 byte)
tag (2 bytes)
bytes (2 + 36-37 bytes)
    list len (1 byte)
    bytes (2 + 32 bytes)
    word32 (1-2 bytes in practice)
total (42-43 bytes)

added comments clarifying the stuff

👍 Same remark here about the more conservative size (43 instead of 42)
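
One way to make the constituents explicit, sketched here with hypothetical local names and the byte counts from the decomposition above, is to write the constant as a sum. The index is taken at its conservative size of 2 bytes, which gives 43.

```haskell
-- | Size of an encoded input, written as an explicit sum of its parts.
-- All byte counts are taken from the review comment, not measured.
sizeOfTxIn :: Int
sizeOfTxIn = listLen + word8 + tag + bytesHeader + payload
  where
    listLen     = 1      -- list len 2
    word8       = 1      -- constructor tag of the input
    tag         = 2      -- CBOR tag 24
    bytesHeader = 2      -- header of the wrapped byte string
    payload     = 1 + txId + index
    txId        = 2 + 32 -- byte string header + 32-byte hash
    index       = 2      -- word32, conservative upper bound in practice
```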

KtorZ · 2019-04-26T08:44:55Z

test/unit/Cardano/Wallet/CoinSelection/FeeSpec.hs

+        -- | Make an Address from a Base58 encoded string, without error handling.
+        let addr58 :: ByteString -> Address
+            addr58 = maybe (error "addr58: Could not decode") Address
+                . decodeBase58 bitcoinAlphabet


Same as above, fromText 👍

KtorZ · 2019-04-26T09:11:05Z

test/unit/Cardano/Wallet/CoinSelection/FeeSpec.hs

+
+        let inputId0 = hash16 "60dbb2679ee920540c18195a3d92ee9be50aee6ed5f891d92d51db8a76b02cd2"
+        let address0 = addr58 "DdzFFzCqrhsug8jKBMV5Cr94hKY4DrbJtkUpqptoGEkovR2QSkcA\
+                              \cRgjnUyegE689qBX6b2kyxyNvCL6mfqiarzRB9TRq8zwJphR31pr"


This looks like an address with a derivation path payload (a.k.a. an address from the random derivation scheme). Those are bigger in practice than the ones we would use. It'd be nice to actually have the fee calculation depend on the address scheme we use, so that we can adjust the fee accordingly :)

indeed, the examples were for Testnet/Random scheme

KtorZ · 2019-04-26T09:11:51Z

test/unit/Cardano/Wallet/CoinSelection/FeeSpec.hs

+                              \cRgjnUyegE689qBX6b2kyxyNvCL6mfqiarzRB9TRq8zwJphR31pr"
+        let pkWitness = "\130X@\226E\220\252\DLE\170\216\210\164\155\182mm$ePG\252\186\195\225_\b=\v\241=\255 \208\147[\239\RS\170|\214\202\247\169\229\205\187O_)\221\175\155?e\198\248\170\157-K\155\169z\144\174\ENQhX@\193\151*,\NULz\205\234\&1tL@\211\&2\165\129S\STXP\164C\176 Xvf\160|;\CANs{\SYN\204<N\207\154\130\225\229\t\172mbC\139\US\159\246\168x\163Mq\248\145)\160|\139\207-\SI"
+        let txIns = zipWith TxIn (replicate (length inps) inputId0) [0..]
+        let txOuts' = zipWith TxOut (replicate (length outs) address0) (map coin outs)


This is missing the change too ☝️

KtorZ · 2019-04-26T09:15:44Z

test/unit/Cardano/Wallet/CoinSelection/FeeSpec.hs

+        let (TxSizeLinear a b) = cardanoPolicy
+        let calcFee = ceiling (a + b*(fromIntegral calculatedSize))
+
+        estFee `shouldBe` calcFee


Comparing floating-point numbers through Eq is maybe not the best idea here and we may end up eventually with issues regarding floating-point arithmetic. Instead, I'd suggest to either:

Cast & Compare Fractionals (which have a sane memory representation for comparison in Haskell)

Compare the distance to a given small epsilon (e.g. abs (estFee - calcFee) `shouldSatisfy` (< 1e-9))

calcFee is Word64 here (ceiling :: Double -> Word64). I agree that for floating-point numbers this is a different game :-)

Ah! Indeed. My bad ^.^
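
To illustrate the exchange above: once the linear fee is rounded up to a `Word64`, comparing with `==` is exact, while raw `Double` values are better compared against a tolerance. The sketch below mirrors the `ceiling (a + b * size)` expression from the test; `linearFee` and `approxEq` are hypothetical names, not functions from this PR.

```haskell
import Data.Word (Word64)

-- | Linear fee policy: a + b * s, with s the transaction size in bytes.
data TxSizeLinear = TxSizeLinear Double Double

-- | Fee rounded up to a whole number; safe to compare with (==).
linearFee :: TxSizeLinear -> Int -> Word64
linearFee (TxSizeLinear a b) size = ceiling (a + b * fromIntegral size)

-- | Tolerance-based comparison for values that are still Doubles.
approxEq :: Double -> Double -> Double -> Bool
approxEq eps x y = abs (x - y) < eps
```

The coefficients are left as parameters on purpose, since the actual values of `cardanoPolicy` do not appear in this conversation.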

KtorZ · 2019-04-26T09:16:01Z

test/unit/Cardano/Wallet/CoinSelection/FeeSpec.hs

    before getSystemDRG $ describe "Fee calculation properties" $ do
        it "No fee gives back the same selection"
            (\_ -> property propSameSelection)
        it "Fee adjustment is deterministic when there's no extra inputs"
            (\_ -> property propDeterministic)
        it "Adjusting for fee (/= 0) reduces the change outputs or increase inputs"
            (property . propReducedChanges)
+        it "Estimated fee is the same as taken by encodeSignedTx"
+            (\_ -> property propFeeEstimation)


Good idea 👍

KtorZ · 2019-04-26T14:03:09Z

src/Cardano/Wallet/CoinSelection/Fee.hs

+        --  | bytes          --------------- 2 + 48-56
+        --  |   | list len 3 -- 1
+        --  |   | bytes      -- 2 + 43
+        --  |   | attributes -- 1-8


I believe the comment is wrong here: it's not the size of the root hash that is bigger for the random scheme, but the size of the attributes.

KtorZ · 2019-04-26T14:06:26Z

src/Cardano/Wallet/CoinSelection/Fee.hs

+        totalPayload :: Double
+        totalPayload =
+            let n = length inps
+                cborOverhead = 2


The overhead actually depends on the number of inputs. When there are more than 23 inputs, the length is encoded on 2 bytes (and we can realistically assume that there's no tx with more than 255 inputs). I'd say that sticking to the upper bound is more conservative here.

I believe you chose 2 in order to have the property in the unit test pass. So maybe we could review the property condition to expect that the real fee is lower than or equal to the estimated fee, with perhaps also a predicate on the "error margin" (like, less than 5 Lovelace or so...)

Leaned towards being precise here and added two tests showing the case. Hope you don't mind. There are two places where the length must be taken into account, and for me it is worth introducing.
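
The relaxed property suggested above could be expressed as a predicate like the following. This is a sketch only; `conservativeWithin` is a hypothetical name and the margin is left as a parameter rather than fixed at 5 Lovelace.

```haskell
import Data.Word (Word64)

-- | True when the estimate never undershoots the real fee and
-- overshoots it by at most @margin@.
conservativeWithin :: Word64 -> Word64 -> Word64 -> Bool
conservativeWithin margin realFee estFee =
    -- Check the ordering first so the Word64 subtraction cannot underflow.
    realFee <= estFee && estFee - realFee <= margin
```

A property test could then assert `conservativeWithin 5 realFee estFee` in place of strict equality, at the cost of no longer pinning down the exact size.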

KtorZ · 2019-04-26T14:07:05Z

src/Cardano/Wallet/CoinSelection/Fee.hs

+    deriving (Eq, Show)
+
+data AddressScheme = Sequential | Random
+    deriving (Eq, Show)


KtorZ

Some minor last remarks, but looks good to go overall!

paweljakubas requested a review from KtorZ April 24, 2019 09:44

paweljakubas self-assigned this Apr 24, 2019

paweljakubas added the draft label Apr 24, 2019

paweljakubas added this to the Transaction creation, submission & Coin Selection milestone Apr 24, 2019

KtorZ suggested changes Apr 24, 2019

paweljakubas force-pushed the paweljakubas/92/realistic-fee-calculation branch from 224d507 to e896902 Compare April 25, 2019 05:17

paweljakubas commented Apr 25, 2019

paweljakubas force-pushed the paweljakubas/92/realistic-fee-calculation branch from 9d93082 to 4a65db4 Compare April 26, 2019 04:35

paweljakubas removed the draft label Apr 26, 2019

paweljakubas requested a review from KtorZ April 26, 2019 08:37

KtorZ mentioned this pull request Apr 26, 2019

Port coin selection and fee calculation from legacy #92

Closed

7 tasks

KtorZ suggested changes Apr 26, 2019

paweljakubas mentioned this pull request Apr 26, 2019

Fee estimation #186

Closed

paweljakubas force-pushed the paweljakubas/92/realistic-fee-calculation branch from a7d8aa7 to bd113f8 Compare April 26, 2019 13:09

paweljakubas requested a review from KtorZ April 26, 2019 13:33

KtorZ reviewed Apr 26, 2019

👍

KtorZ approved these changes Apr 26, 2019

paweljakubas force-pushed the paweljakubas/92/realistic-fee-calculation branch from 34786ac to b6dfaeb Compare April 26, 2019 14:49

KtorZ force-pushed the paweljakubas/92/realistic-fee-calculation branch from 6c87d3a to 09b090f Compare April 26, 2019 22:11

paweljakubas and others added 6 commits April 27, 2019 20:04

Proper estimation of fee based on current tx support

46ae6a9

Adding network and address derivation support, improving comments and taking changes into account

9745a4e

review generators for fee estimation so that we'll fully cover fee estimation with only property tests

269e026

remove 'Network' as an argument from .. everywhere, only rely on the global ENV

73125cc

remove wrong test prefix 'Patate'

2cb2394

boost buildkite heap memory

e77c09a

KtorZ force-pushed the paweljakubas/92/realistic-fee-calculation branch from 09b090f to e77c09a Compare April 27, 2019 18:05

KtorZ changed the title ~~realistic fee calculation~~ Realistic Fee Calculation (Byron) Apr 27, 2019

KtorZ merged commit 4e2d11a into master Apr 27, 2019

KtorZ deleted the paweljakubas/92/realistic-fee-calculation branch April 27, 2019 19:59
