
Improve function call fee estimation #4826

Closed
Longarithm opened this issue Sep 15, 2021 · 20 comments
Assignees
Labels
A-params-estimator Area: runtime params estimator T-contract-runtime Team: issues relevant to the contract runtime team

Comments

@Longarithm
Member

Longarithm commented Sep 15, 2021

The current function call fee estimation must be improved, because it does not take into account contract deserialization and thus the complexity of the contract. This already leads to significant undercharging for huge contracts like Aurora.

The intention of the fee is to charge users for all operations not accounted for by the wasm gas counter, so the estimation is made by counting the instructions spent on a no-op function call.

Simple approach

We can't charge a constant fee, because the gas spent varies with the contract's complexity. But the algorithm for parsing raw wasm code should be linear, so the fee should be bounded by C * code.len().

The problem here is that when a contract has too many functions, the fee becomes very inaccurate, see #4660. We absolutely can't afford to charge 302 Tgas for each Aurora engine call, because 1) it is a ~10x overestimate and 2) we could fit only 3 such transactions in a block.

New approach

Let's make the number of functions a separate factor to make the estimation more accurate.
Take it as func_assoc here: https://github.com/near/wasmer/blob/0.18.1/lib/runtime-core/src/module.rs#L48
Experiments show that the effect applies to all functions, not only exported ones.

So, from the above, the following constants should exist:

gas_spent <= base_cost + cost_per_code_byte * code.len() + cost_per_function * module_info.func_assoc 
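As a sketch, the bound above could be computed as follows. The function name is hypothetical, and the constants are the rough estimates from this thread (already doubled by the safety factor discussed below), not actual protocol parameters:

```rust
// Illustrative constants: roughly the thread's fitted coefficients times
// the proposed 2x safety multiplier. Not actual protocol parameters.
const BASE_COST: u64 = 20_000_000_000; // ~10 Ggas, doubled
const COST_PER_CODE_BYTE: u64 = 607_000; // ~311 Mgas per KiB, doubled, per byte
const COST_PER_FUNCTION: u64 = 4_000_000_000; // ~2 Ggas, doubled

/// Hypothetical helper: upper bound on gas spent outside the wasm gas
/// counter for a call into a contract of `code_len` bytes with
/// `n_functions` functions.
fn function_call_fee_bound(code_len: u64, n_functions: u64) -> u64 {
    BASE_COST + COST_PER_CODE_BYTE * code_len + COST_PER_FUNCTION * n_functions
}

fn main() {
    // A 618 KB contract with 781 functions (the reduced Aurora example
    // discussed later in the thread); comes out around 3.5 Tgas, in line
    // with the ~4 Tgas Aurora estimate below.
    let fee = function_call_fee_bound(618 * 1024, 781);
    println!("{}", fee);
}
```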

Then we estimate these coefficients by the least squares method, using the function least_squares_method_2: master...2c1e4fe
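The fit can be sketched as a generic 3-parameter ordinary least squares: build the normal equations AᵀA x = Aᵀy and solve them by Gaussian elimination. This is an illustration only, not the actual least_squares_method_2 from the linked branch:

```rust
/// Fit gas ≈ a + b * code_len + c * n_functions by ordinary least squares.
/// Generic sketch, not the estimator's actual implementation.
fn fit(samples: &[(f64, f64, f64)]) -> [f64; 3] {
    // samples: (code_len, n_functions, observed_gas)
    let mut ata = [[0.0f64; 3]; 3];
    let mut aty = [0.0f64; 3];
    for &(len, funcs, gas) in samples {
        let row = [1.0, len, funcs];
        for i in 0..3 {
            for j in 0..3 {
                ata[i][j] += row[i] * row[j];
            }
            aty[i] += row[i] * gas;
        }
    }
    // Gaussian elimination with partial pivoting on [AᵀA | Aᵀy].
    for col in 0..3 {
        let pivot = (col..3)
            .max_by(|&a, &b| ata[a][col].abs().partial_cmp(&ata[b][col].abs()).unwrap())
            .unwrap();
        ata.swap(col, pivot);
        aty.swap(col, pivot);
        for row in col + 1..3 {
            let k = ata[row][col] / ata[col][col];
            for j in col..3 {
                ata[row][j] -= k * ata[col][j];
            }
            aty[row] -= k * aty[col];
        }
    }
    // Back-substitution.
    let mut x = [0.0f64; 3];
    for row in (0..3).rev() {
        let mut s = aty[row];
        for j in row + 1..3 {
            s -= ata[row][j] * x[j];
        }
        x[row] = s / ata[row][row];
    }
    x
}

fn main() {
    // Synthetic data generated from gas = 10 + 2*len + 3*funcs;
    // the fit should recover these coefficients.
    let samples = [
        (1.0, 1.0, 15.0),
        (2.0, 1.0, 17.0),
        (1.0, 2.0, 18.0),
        (3.0, 2.0, 22.0),
        (2.0, 3.0, 23.0),
    ];
    let [base, per_byte, per_func] = fit(&samples);
    println!("{:.3} {:.3} {:.3}", base, per_byte, per_func);
}
```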

Current results are:

base_cost ~= 10 Ggas
cost_per_code_byte ~= 311 Mgas per KB
cost_per_function ~= 2 Ggas

It gives an upper estimate on a simple test set if we multiply the coefficients by 2. The Aurora fee should be ~4 Tgas after this.

Plan

  • Include args.len() in the estimation
  • Decide if we need a safety multiplier
  • Validate the results on real contracts
  • Check this effect on other items included in wasm modules:
    • globals
    • tables / elems
    • memory
    • exports / imports?
    • different set of function bodies?

Other ideas

@Longarithm Longarithm self-assigned this Sep 15, 2021
@Longarithm Longarithm added the A-params-estimator Area: runtime params estimator label Sep 15, 2021
@Longarithm
Member Author

Longarithm commented Sep 16, 2021

Validated cost_per_function as follows:

Take 2 contracts of the same length but a different number of functions. Estimate cost_per_function as (cost_2 - cost_1) / (funcs_2 - funcs_1).
For simplicity, the i-th function body is i32.const {i} drop, possibly repeated several times.
In my construction, the first contract C(N) contains 2 functions:

func0: i32.const 0 drop
func1: {i32.const 1 drop} repeated N times

The second contract contains N / 6 + 3 functions, which makes the lengths almost the same.
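The construction above can be sketched as a generator of WebAssembly text. The helper names here are hypothetical, for illustration only; the thread's actual generator lives in the estimator branch:

```rust
/// Build the WAT text for the first contract C(N) from the construction
/// above: func0 is a single `i32.const 0 drop`, and func1 repeats
/// `i32.const 1 drop` N times. Hypothetical helper for illustration.
fn contract_c(n: usize) -> String {
    let mut wat = String::from("(module\n");
    wat.push_str("  (func $func0 i32.const 0 drop)\n");
    wat.push_str("  (func $func1");
    for _ in 0..n {
        wat.push_str(" i32.const 1 drop");
    }
    wat.push_str(")\n)\n");
    wat
}

/// Estimate cost_per_function from two same-length contracts, as in the
/// comment above: (cost_2 - cost_1) / (funcs_2 - funcs_1).
fn cost_per_function(cost_1: f64, funcs_1: f64, cost_2: f64, funcs_2: f64) -> f64 {
    (cost_2 - cost_1) / (funcs_2 - funcs_1)
}

fn main() {
    println!("{}", contract_c(3));
    println!("{}", cost_per_function(20.0, 2.0, 26.0, 5.0));
}
```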

I considered the following values of N: {10, 20, ..., 100}, {75, 80, ..., 125}, {120, 140, ..., 200}, {1000, 2000, ..., 10000}.
For almost all of them, cost_per_function <= 2.2 Ggas.

There are several outliers with N ~= 100 for which cost_per_function is 3.7 Ggas (1.5 times higher). This effect shows up for 50 repeated run_vm calls and vanishes for 20 calls; it also vanishes for neighbouring estimations.
This may also be related to warmup at the beginning.

Another outlier is C(100) itself. The current model estimates its call at 20 Ggas instead of the 27 Ggas it should be.

Anyway, I'm still confident in safety-multiplying the cost by 2:

  • we already have two methods of computing the coefficient
  • we have outliers only for small contract sizes
  • (?) functions are stored in a vector, and if a reallocation happens, it may give 2x overhead

@bowenwang1996
Collaborator

@Longarithm thanks for the investigation and detailed analysis! This provides really valuable insights into what the problem is and how we could address it. Some thoughts:

  • If we decide to use this model that takes number of methods into consideration, how should we make sure that the increase in fees doesn't break existing contracts?
  • Alternatively, is there something we could do to fundamentally improve the execution model so that we don't have to deal with fee increases?

@Longarithm
Member Author

Longarithm commented Sep 17, 2021

@bowenwang1996
For the current model, if the contract length and number of functions are lower than some constants (yet to be found), there is no risk of a fee increase. Fortunately, the current base fee is 2.3 Tgas, which is quite high compared to the new base fee, which the model estimates at 20 Ggas.

Hopefully there won't be a lot of other contracts and I can analyze them manually.

On the other hand, I found that a call of the no-op function state_migration on Aurora #4814 (comment) requires 10.7 Tgas. It means that the base fee should be at least 10.7 Tgas, while my model gives only 6.2 Tgas.

@nearmax says that it actually shouldn't break Aurora, because costs of their calls are much higher. But this may not be the case for other contracts.

The contract has several non-trivial imports, which could affect the cost. Need to sync with Contract Runtime on this.

@Longarithm
Member Author

When I add a bunch of imports, the same as in the aurora engine (see https://hackmd.io/EvpgULvtSqGjyLMPicRIYg), some coefficients in the model change significantly:

base_cost = -30 Ggas
cost_per_code_byte = 1.5 Ggas per KB
cost_per_function = 2 Ggas

Though it was reasonable to expect only the base cost to change.

@bowenwang1996
Collaborator

Hopefully there won't be a lot of other contracts and I can analyze them manually.

That would be great!

some coefficients in the models significantly change:

Does that indicate the model itself may not be the best?

@Longarithm
Member Author

Does that indicate the model itself may not be the best?

Yes

@MaksymZavershynskyi
Contributor

We should consider rolling out the best approximation now; later, if we find out that the model depends on more coefficients, we can add new fees, e.g. a cost per import. We can also add a limit on the max number of imports, e.g. 100, to make sure it is not abusable.

We are currently in a situation where our base function call fee does not reflect the actual time it takes to load the contract, and we cannot stay in this state for too long while we try to find the best model for loading the contract.

@matklad
Contributor

matklad commented Sep 22, 2021

🤔 I don't think "just add more parameters later" is a safe solution -- the estimation error might be in the wrong direction.

Imagine that we hadn't found the imports issue, and had rolled out the cost based on the number of fns + len. Then someone could submit a contract with zero functions and a tonne of imports. We'd say that the contract is cheap, but it would be costly in practice.

We know about the imports now, of course, but what if there's some other thing (globals, exports, particular control flow) which exhibits a similar effect?

Estimation based on a single parameter (code length) is more robust against abuse. We are not 100% sure that "many functions" captures the worst case, but that seems rather probable. If we add the number of functions as a model parameter, we'd have no idea what the new worst case is.

That being said, our current cost is just wrong, so anything would be better than it.


If we want to go the multidimensional-cost way, I think we should look not only at the number of functions, but at the overall complexity of the contract. That is, we should add together the lengths of all the sections in https://webassembly.github.io/spec/core/binary/modules.html.
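Summing section lengths is mechanical on the binary format: a wasm module is the magic header followed by sections, each an id byte plus a LEB128-encoded payload size. A minimal self-contained sketch (real code would likely use a parser crate; this does no validation beyond the header):

```rust
/// Decode an unsigned LEB128 integer from `bytes`, advancing `pos`.
fn read_leb_u32(bytes: &[u8], pos: &mut usize) -> u32 {
    let (mut result, mut shift) = (0u32, 0);
    loop {
        let b = bytes[*pos];
        *pos += 1;
        result |= ((b & 0x7f) as u32) << shift;
        if b & 0x80 == 0 {
            return result;
        }
        shift += 7;
    }
}

/// Return (section id, payload length) for every section in a wasm binary.
/// Sketch only: trusts the input after checking the 8-byte header.
fn section_lengths(wasm: &[u8]) -> Vec<(u8, u32)> {
    assert_eq!(&wasm[0..8], &b"\0asm\x01\x00\x00\x00"[..], "not a wasm v1 binary");
    let mut pos = 8;
    let mut out = Vec::new();
    while pos < wasm.len() {
        let id = wasm[pos]; // section id (1 = types, 3 = functions, ...)
        pos += 1;
        let size = read_leb_u32(wasm, &mut pos);
        out.push((id, size));
        pos += size as usize; // skip the payload
    }
    out
}

fn main() {
    // Minimal module with a single type section: one function type () -> ().
    let wasm = [
        0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm" + version 1
        0x01, 0x04, // section id 1 (types), payload size 4
        0x01, 0x60, 0x00, 0x00, // vec of 1 type: functype, 0 params, 0 results
    ];
    println!("{:?}", section_lengths(&wasm));
}
```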

@Longarithm
Member Author

I like the idea of complexity factor as a compromise:

  • it should not depend on exact wasmer implementation
  • it includes number of functions
  • this may work in short-term, and exact value can be clarified later

Though intuitively the sections are not equally weighted, and we need to keep in mind that we would just be covering the problem we have with a fee that has an ambiguous name.

Regarding the number of functions or imports: instead of limiting them, we could apply different fee schemes based on their values, like

  • n_functions < 2000 => charge optimistic fee
  • n_functions >= 2000 => charge pessimistic fee

@bowenwang1996
Collaborator

we could apply different fee schemes based on their values

I think this is an interesting idea! It also makes it easier for developers to reason about the cost of function call transactions.

@Longarithm
Member Author

Longarithm commented Sep 23, 2021

Reducing the aurora code leads to some insights.

I left only the no-op state_migration and deploy_code exported methods in the contract (attached). This results in 618 KB of code with 781 functions, for which the no-op function call spends 6.7 Tgas.

This itself implies that logically either cost_per_code_byte >= 10 Ggas / KB or cost_per_function >= 10 Ggas, neither of which is the case for the model trained on synthetic contracts.

At this point, as a short-term solution, I suggest leaving only one parameter in the model and putting:

base_cost = 4.1 Tgas
if n_functions <= 2000, cost_per_code_byte = 20 Ggas / KB 
if n_functions > 2000, cost_per_code_byte = 200 Ggas / KB 

This serves the following purposes:

  • for the worst abuse scenario we know (a contract in which all functions are no-ops), the cost is fair: for n_functions <= 2000 the worst case is when equality is attained, and then the fee is exactly 4.1 Tgas; for n_functions > 2000, cost_per_code_byte is exactly 200 Ggas / KB for such worst-case contracts, as computed here;
  • the aurora fee becomes 22 Tgas (size = 890 KB) because it has fewer than 2000 functions, which is close to the actual fee of 10 Tgas;
  • for an average contract, the fee should not change catastrophically (yet to be checked on existing contracts)
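The proposed short-term scheme can be sketched as follows, with the constants from this comment expressed in Ggas. The function name and the example function count are illustrative, not actual protocol code:

```rust
/// Short-term one-parameter fee model from this comment, in Ggas.
/// Illustrative sketch, not actual protocol parameters.
fn function_call_fee_ggas(code_len_kb: u64, n_functions: u64) -> u64 {
    let base_cost = 4_100; // 4.1 Tgas
    // Pessimistic per-KB rate kicks in above the 2000-function threshold.
    let cost_per_kb = if n_functions <= 2_000 { 20 } else { 200 };
    base_cost + cost_per_kb * code_len_kb
}

fn main() {
    // Aurora-like contract: 890 KB; the 1_500 function count is an
    // assumption, standing in for "fewer than 2000 functions".
    // 4_100 + 20 * 890 = 21_900 Ggas, i.e. the ~22 Tgas quoted above.
    println!("{} Ggas", function_call_fee_ggas(890, 1_500));
}
```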

The drawback is that we increase the base fee by 2x. On the other hand, the upgrade to the new wasmer should compensate for it.

BTW the raw complexity idea doesn't work here: n_functions is the significantly larger term in all examples, and one import increases the function count in our wasmer implementation as well.

aurora_engine_with_deploy.wasm.zip

@MaksymZavershynskyi
Contributor

There are issues with the "complexity" metric:

  • It will be part of the protocol, so we need to make sure it is not extremely complex;
  • I've had bad experiences with trying to aggregate a bunch of signals into a packaged metric; there will likely be corner cases where it behaves awfully, and it will be hard to reason about. E.g. wasm can have very disproportional sections.

I like the idea of a simple cost function with n_functions == 2000 as a threshold that cuts off "weird" contracts. Though unfortunately, it adds a non-linear function to our fees.

@matklad
Contributor

matklad commented Sep 23, 2021

n_functions == 2000 as a threshold that cuts off "weird" contracts

I am not yet convinced that it actually cuts off "weird" contracts; it cuts off the one weird case we know about, so it's overfit to it. In other words, the reasoning behind "for the worst abuse scenario we know" is not entirely sound. The worst case is relative to the metric. A metric with a cutoff will have a different worst case, which we don't know yet.

To feel more confident about metrics which count functions, I would like to understand this:

What does the slowest 1 MB contract with at most 1k functions look like?

@matklad
Contributor

matklad commented Sep 23, 2021

It's also the case that, no matter which metric we choose, there will always be a gap between the worst case for the metric and the average/best case on the network.

This puts us in an uncomfortable situation: there is an incentive to make the metrics more complicated to reduce the worst/best case overhead. But more complex metrics are easier to abuse.

That said, for each specific contract, we can always get the approximate real cost. We can take the contract, run parameter estimator, and say "loading this contract takes about X gas".

@olonho suggested an interesting way out of this systematic problem: let the validators agree, using whatever mechanism, on which contracts are actually "cheap". That is, after particular code is deployed, validators can verify that it is actually cheap to instantiate. This allows us to keep estimated costs as a safe approximation to prevent abuse by pathological contracts, while making important normal contracts cheap on a case-by-case basis.

Strictly from a contract runtime perspective, I would be much more comfortable saying that a particular contract with <2000 functions is cheap, than saying that any contract with <2000 functions is cheap.

@bowenwang1996
Collaborator

That is, after a particular code is deployed, validators can verify that it is actually cheap to instantiate.

How does this work in practice? Do you mean that validators would need to agree on the exact cost of each contract? I am not sure how we would do this, and it feels like a lot of overhead. In addition, what if the contract is invoked in the same block in which it is deployed? If we use the estimated cost there and validators could later change the cost, then transactions that invoke the same method on the same contract could end up having very different costs, which is not optimal.

near-bulldozer bot pushed a commit that referenced this issue Nov 20, 2021
Stabilize limiting the number of functions in a contract to 10_000: #4954

While discussing #4826, we found that both contract compilation and function calls take a lot of time if the number of wasm functions in the contract is huge. Even if we fix the contract size, the function count still makes a big difference.

To mitigate this issue, we limit the number of functions in a contract. This shouldn't affect current mainnet contracts because the maximal number of functions is ~5k.

## Test plan

* http://nayduck.near.org/#/run/2223
* existing tests
@stale

stale bot commented Dec 22, 2021

This issue has been automatically marked as stale because it has not had recent activity in the last 2 months.
It will be closed in 7 days if no further activity occurs.
Thank you for your contributions.

@stale stale bot added the S-stale label Dec 22, 2021
@bowenwang1996 bowenwang1996 added the T-contract-runtime Team: issues relevant to the contract runtime team label Dec 28, 2021
@stale stale bot removed the S-stale label Dec 28, 2021
@stale

stale bot commented Mar 29, 2022

This issue has been automatically marked as stale because it has not had recent activity in the last 2 months.
It will be closed in 7 days if no further activity occurs.
Thank you for your contributions.

@stale

stale bot commented Jul 6, 2022

This issue has been automatically marked as stale because it has not had recent activity in the last 2 months.
It will be closed in 7 days if no further activity occurs.
Thank you for your contributions.

@stale stale bot added the S-stale label Jul 6, 2022
@akhi3030 akhi3030 removed the S-stale label Jul 8, 2022
@jakmeier
Contributor

The discussion in this issue seems mostly outdated; the last comment is now 1 year old.

In the meantime, we have improved the estimations in slightly different directions with PR #6362, and we have come up with a "true function call cost" in #7227. This should address the main concerns raised in this issue.

The further-reaching discussion about changing the gas parameter model has also been evaluated in #6992, which is now also done. Currently we don't see a pressing reason to change the set of parameters anymore. But we do want to reshuffle them, see #7741.

@Longarithm do you think there is more to be done on this particular issue? Or can we close it by now?

@Longarithm
Member Author

Superseded by #7741.
