Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Realistic Fee Calculation (Byron) #176

Merged
merged 6 commits into from
Apr 27, 2019

Conversation

paweljakubas
Copy link
Contributor

@paweljakubas paweljakubas commented Apr 24, 2019

Issue Number

#92

Overview

  • I have generated a number of representative unit test cases
  • I have added the cardano fee estimation implementation that works with current tx support
  • I have added the property test to check fee estimation with direct counting of size after encoding

Comments

The idea behind adding fee estimation is the following:
(a) Inputs are always of the same size because they're a hash and index
(b) Outputs have variable sizes, but that can be easily determined regarding their value (<24 one byte, <256 2 bytes, <65536 3 bytes etc ... )
(c) tx witnesses, one per input, are also fixed-sized
(d) there is fixed-sized cbor bits (connected with building lists)

Hence, we can estimate quite precisely the size of transaction

Copy link
Member

@KtorZ KtorZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... I think this isn't going in the right direction and there are some confusion about what the fee estimation function is supposed to do.

-- the transaction attributes @Attributes ()@ are both hard-coded
estimateCardanoFee
:: TxSizeLinear
-> (Tx, [TxWitness])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This signature is weird. We do really want to estimate fee of CoinSelection instead, because we don't yet have a transaction or witnesses at this stage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now CoinSelection is used

:: TxSizeLinear
-> (Tx, [TxWitness])
-> Fee
estimateCardanoFee (TxSizeLinear a b) txWithWitness@(Tx inps _, _) =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We aren't we using the outputs of the transaction here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now using

@@ -111,6 +111,7 @@ test-suite unit
, bytestring
, cardano-crypto
, cardano-wallet
, cassava
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If anything, I'd suggest to use JSON as a (semi-)structured data format for storing data. We can cut-out an extra dependency to a csv parser this way 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

-- | A linear equation on the transaction size. Represents the @\s -> a + b*s@
-- function where @s@ is the transaction size in bytes, @a@ and @b@ are
-- constant coefficients.
data TxSizeLinear = TxSizeLinear Double Double
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make things slightly more explicit:

Double -> Quantity "byte" Double
Double -> Quantity "lovelace/byte" Double

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do it in the next commit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

-- `NetworkMainOrStage` as []; this should require a 5 Byte increase in
--`boundAddrAttrSize`. Because encoding in unit tests is not guaranteed
-- to be efficient, it was decided to increase by 7 Bytes to mitigate
-- against potential random test failures in the future.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather see a more concise comment explaining where the sizes come from:

-- - 34 => ?
-- - 7 => 5 (network magic) + 2 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made more concise comments everywhere

where
totalPayload :: Double
totalPayload = fromIntegral
$ boundAddrAttrSize + boundTxAttrSize + boundSignatureWitnessSize + payloadFromTxWithWitness + 4*(length inps - 1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where does this 4* comes from ? Also, we should probably multiply the witness size by the number of witness (i.e. inputs) ...

[42000,42000,42000,42000,42000,42000,42000,42000,42000,42000,42000,42000,42000,42000,42000]|[100000,100000,100000]|302073
[42000,42000,42000,42000,42000,42000,42000,42000,42000]|[100000,1000,100]|253294
[42000,42000,42000,42000,42000,42000,42000,42000,42000]|[100000,10000,1]|253032
[75000,75000,75000,60000]|[100,1,1]|212381
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a bit overkill here :/ What's the value of that many cases compared to a few well-crafted unit tests + properties ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I produced it in a quasi-random way just to see if fee is as expected by my when investigating old way of estimation. Will take representive examples from this file to test/data/Cardano/Wallet/fees1

@paweljakubas paweljakubas force-pushed the paweljakubas/92/realistic-fee-calculation branch from 224d507 to e896902 Compare April 25, 2019 05:17
sizeOfTxIns :: Int
sizeOfTxIns =
let n = length inps
in sizeOfListLen + n*sizeOfTxIn + (n-1)*sizeOfListBreak + (n-1)*2
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(n-1)*2 is added artificially here - I cannot explain it at the moment. Just added in order to obtain the same results as with cardano-sl

totalPayload =
let n = length outs
in fromIntegral $
sizeOfListLen + sizeOfListBreak + sizeOfTx + sizeOfTxAttr + n*(sizeOfAddrAttr + sizeOfSignatureWitness) - (n-1)*6
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(n-1)*6 is added artificially here - I cannot explain it at the moment. Just added in order to obtain the same results as with cardano-sl

@paweljakubas paweljakubas force-pushed the paweljakubas/92/realistic-fee-calculation branch from 9d93082 to 4a65db4 Compare April 26, 2019 04:35
@paweljakubas
Copy link
Contributor Author

Reimplemented estimation (based on extensive private talk with @KtorZ). Now, we try not to to be exactly like in cardano-sl, but estimate fee based on current CBOR byte size of (Tx, [TxWitness]) . Added numerous unit tests that exemplify boundary transition cases. And also one property

Copy link
Member

@KtorZ KtorZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to add the following comment somewhere, maybe at each relevant function which details the size of the tx encoding:

With `n` the number of inputs

signed tx ----------------------------------- 9 + n * 43 + n * 139 + Σ sizeOf(output)
          | list len 2              -- 1 
          | sizeOf(tx)              -- 6 + (n * 43) + sizeOf(outs)
          | list len n              -- 1-2
          | n * sizeOf(witness)     -- n * 139

tx        ---------------------------------- 6 + (n * 43) + 
          | list len 3                  -- 1 
          | begin                       -- 1 
          | n * sizeOf(input)           -- n * 43
          | break                       -- 1 
          | begin                       -- 1
          | Σ sizeOf(output)            --  Σ sizeOf(output)
          | break                       -- 1
          | empty attributes            -- 1 

input     ---------------------------------- 43
          | list len 2                  -- 1 
          | word8                       -- 1 
          | tag 24                      -- 2 
          | bytes ------------------------ 2 + 37 
          |   | list len 2  -- 1 
          |   | bytes       -- 2 + 32 
          |   | word32      -- 1-2 

output    ---------------------------------- 48-55 (mainnet) & 56-63 (testnet)
          | list len 2                  -- 1
          | sizeOf(address)             -- 46-54
          | word64                      -- 1-8

address   ---------------------------------- 46 (mainnet) & 54 (testnet)
          | list len 2                  -- 1
          | tag 24                      -- 2
          | bytes          --------------- 2 + 37-45
          |   | list len 3 -- 1
          |   | bytes      -- 2 + 32     
          |   | attributes -- 1-8       
          |   | word8      -- 1
          | word32                      -- 1-4 

witness   ---------------------------------- 139
          | list len 2                  -- 1
          | word8                       -- 1
          | tag 24                      -- 2
          | bytes          --------------- 2 + 133 
          |   | list len 2 -- 1
          |   | bytes      -- 2+64
          |   | bytes      -- 2+64

-- see Cardano.Wallet.Binary (encodeTxOut)
sizeOfTxOut :: Word64 -> Int
sizeOfTxOut =
(+77) . fromIntegral . BL.length . toLazyByteString . encodeWord64
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why +77 here ? (I mean, I know why, but an unadvised reader won't so this sounds like a magical number). Also, I believe you based your calculation here for random addresses which are biggger and have a bigger payload. Sequential addresses are smaller to encode, so should also result to smaller fees (cf diagram in my last review comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(encodeWord64 -> nice trade-off here between encoding everything and avoiding re-implementing too much of the CBOR logic 👍)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added comments clarifying the stuff

-- The size of TxWitness is always the same as it contains data constructor and hash
-- see Cardano.Wallet.Binary (encodeTxWitness)
sizeOfTxWitness :: Int
sizeOfTxWitness = 139
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 139 here ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added comments clarifying the stuff

sizeOfTxCbor :: Int
sizeOfTxCbor = 6

-- The size of [[TxIn], [TxWitness]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment seems off compare to what the body actually computes ^.^

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

sizeOfTx =
let n = length inps
coins = map (getCoin . coin) outs
in sizeOfTxCbor + n*sizeOfTxIn + sum (map sizeOfTxOut coins)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming suggestion: sizeOfTxCbor --> cborOverhead +, make it part of the let clause, or, make every constituent explicit in a sum.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

-- The size of TxIn is always the same as it contains the hash and index
-- see Cardano.Wallet.Binary (encodeTxIn)
sizeOfTxIn :: Int
sizeOfTxIn = 42
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be worth writing it as a + b where a == sizeOf(txin) && b == sizeOf(index), just to make the constituents explicit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it would be nice to even decompose the CBOR stuff here:

  • list len (1 byte)
  • word8 (1 byte)
  • tag (2 bytes)
  • bytes (2 + 36-37 bytes)
    • list len (1 byte)
    • bytes (2 + 32 bytes)
    • word32 (1-2 bytes in practice)

total (42-43 bytes)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added comments clarifying the stuff

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Same remark here about the more conservative size (43 instead of 42)

-- | Make an Address from a Base58 encoded string, without error handling.
let addr58 :: ByteString -> Address
addr58 = maybe (error "addr58: Could not decode") Address
. decodeBase58 bitcoinAlphabet
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, fromText 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


let inputId0 = hash16 "60dbb2679ee920540c18195a3d92ee9be50aee6ed5f891d92d51db8a76b02cd2"
let address0 = addr58 "DdzFFzCqrhsug8jKBMV5Cr94hKY4DrbJtkUpqptoGEkovR2QSkcA\
\cRgjnUyegE689qBX6b2kyxyNvCL6mfqiarzRB9TRq8zwJphR31pr"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like an address with a derivation path payload (a.k.a an address from the random derivation scheme); Those are bigger in practice than the one we would use. It'd be nice to actually have the fee calculation depends on the address scheme we use, so that we can adjust fee depending on this :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed, the examples were for Testnet/Random scheme

\cRgjnUyegE689qBX6b2kyxyNvCL6mfqiarzRB9TRq8zwJphR31pr"
let pkWitness = "\130X@\226E\220\252\DLE\170\216\210\164\155\182mm$ePG\252\186\195\225_\b=\v\241=\255 \208\147[\239\RS\170|\214\202\247\169\229\205\187O_)\221\175\155?e\198\248\170\157-K\155\169z\144\174\ENQhX@\193\151*,\NULz\205\234\&1tL@\211\&2\165\129S\STXP\164C\176 Xvf\160|;\CANs{\SYN\204<N\207\154\130\225\229\t\172mbC\139\US\159\246\168x\163Mq\248\145)\160|\139\207-\SI"
let txIns = zipWith TxIn (replicate (length inps) inputId0) [0..]
let txOuts' = zipWith TxOut (replicate (length outs) address0) (map coin outs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is missing the change too ☝️

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

corrected

let (TxSizeLinear a b) = cardanoPolicy
let calcFee = ceiling (a + b*(fromIntegral calculatedSize))

estFee `shouldBe` calcFee
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comparing floating-point numbers through Eq is maybe not the best idea here and we may end up eventually with issues regarding floating-point arithmetic. Instead, I'd suggest to either:

  • Cast & Compare Fractionals (which have a sane memory representation for comparison in Haskell)
  • Compare the distance to a given small epsilon (e.g abs estFee calcFee shouldSatisfy (< 1e9))

Copy link
Contributor Author

@paweljakubas paweljakubas Apr 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

calcFee is Word64 here (ceiling :: Double -> Word64). I agree that for floating-point numbers this is a different game :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! Indeed. My bad ^.^

before getSystemDRG $ describe "Fee calculation properties" $ do
it "No fee gives back the same selection"
(\_ -> property propSameSelection)
it "Fee adjustment is deterministic when there's no extra inputs"
(\_ -> property propDeterministic)
it "Adjusting for fee (/= 0) reduces the change outputs or increase inputs"
(property . propReducedChanges)
it "Estimated fee is the same as taken by encodeSignedTx"
(\_ -> property propFeeEstimation)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea 👍

@paweljakubas paweljakubas mentioned this pull request Apr 26, 2019
@paweljakubas paweljakubas force-pushed the paweljakubas/92/realistic-fee-calculation branch from a7d8aa7 to bd113f8 Compare April 26, 2019 13:09
@paweljakubas paweljakubas requested a review from KtorZ April 26, 2019 13:33
-- | bytes --------------- 2 + 48-56
-- | | list len 3 -- 1
-- | | bytes -- 2 + 43
-- | | attributes -- 1-8
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe comment is wrong here, that's not the size of the root hash that is bigger for the random scheme, but the size of the attributes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

corrected

totalPayload :: Double
totalPayload =
let n = length inps
cborOverhead = 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overhead actually depends on the number of inputs. When there are more than 23 inputs, the length is encoded on 2 bytes (and we can realistically assume that there's no tx with inputs bigger than 255); I'd say that sticking to the upper bound is more conservative here.

I believe you chose 2 in order to have the property in the unit test to pass. So maybe we could review the property condition to expect that: real fee are lower than or equal to estimated fee, with perhaps also, a predicate on the "error margin" (like, less than 5 Lovelace or so..)

Copy link
Contributor Author

@paweljakubas paweljakubas Apr 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leaned towards being precise here and added two test showing the case. hope you dont mind. there are two places where the length must be taken into account - for me it is worth introducing

deriving (Eq, Show)

data AddressScheme = Sequential | Random
deriving (Eq, Show)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member

@KtorZ KtorZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor last remarks, but looks good to go overall!

@paweljakubas paweljakubas force-pushed the paweljakubas/92/realistic-fee-calculation branch from 34786ac to b6dfaeb Compare April 26, 2019 14:49
@KtorZ KtorZ force-pushed the paweljakubas/92/realistic-fee-calculation branch from 6c87d3a to 09b090f Compare April 26, 2019 22:11
@KtorZ KtorZ force-pushed the paweljakubas/92/realistic-fee-calculation branch from 09b090f to e77c09a Compare April 27, 2019 18:05
@KtorZ KtorZ changed the title realistic fee calculation Realistic Fee Calculation (Byron) Apr 27, 2019
@KtorZ KtorZ merged commit 4e2d11a into master Apr 27, 2019
@KtorZ KtorZ deleted the paweljakubas/92/realistic-fee-calculation branch April 27, 2019 19:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants