Improve function call fee estimation #4826
Take 2 contracts of the same length but a different number of functions. Estimate […].
The second contract contains […]. I considered the following values of […]. There are several outliers with […]. Another outlier is […]. Anyway, I'm still confident in safety-multiplying the cost by 2: […]
@Longarithm thanks for the investigation and detailed analysis! This provides really valuable insights into what the problem is and how we could address it. Some thoughts: […]
@bowenwang1996 Hopefully there won't be a lot of other contracts and I can analyze them manually. On the other hand, I found that a call of a no-op function […]. @nearmax says that it actually shouldn't break Aurora, because the costs of their calls are much higher. But this may not be the case for other contracts. It has several non-trivial imports, which could affect the cost. Need to sync with Contract Runtime on this.
When I add a bunch of imports, the same as in the Aurora engine (see https://hackmd.io/EvpgULvtSqGjyLMPicRIYg), some coefficients in the models change significantly: […]
Though it was reasonable to expect only the base cost to change.
That would be great!
Does that indicate the model itself may not be the best?
Yes
We should consider rolling out the best approximation now, and later, if we find out that the model depends on more coefficients, we can add new fees, e.g. a cost per import. We can also add a limit on the max number of imports, e.g. 100, to make sure it is not abusable. We are currently in a situation where our base function call fee does not reflect the actual time it takes to load the contract, and we cannot stay in this state for too long while we are trying to find the best model for loading the contract.
🤔 I don't think "just add more parameters later" is a safe solution -- the estimation error might be in the wrong direction. Imagine that we didn't find the imports issue and rolled out a cost based on number of fns + len. Then someone could submit a contract with zero functions and a tonne of imports. We'd say that the contract is cheap, but it would be costly in practice.

We know about the imports now, of course, but what if there's some other thing (globals, exports, particular control flow) which exhibits a similar effect? Estimation based on a single parameter (code length) is more robust to abuse. We are not 100% sure that "many functions" captures the worst use case, but that seems rather probable. If we add the number of functions as a model parameter, we'd have no idea what the new worst case is. That being said, our current cost is just wrong, so anything would be better than it.

If we want to go the multidimensional-cost way, I think we should look not only at the number of functions, but at the overall complexity of the contract. That is, we should add together the lengths of all the sections in https://webassembly.github.io/spec/core/binary/modules.html.
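A rough sketch of what that "sum of section lengths" metric could look like, walking the raw binary layout from the linked spec (this is my own illustration, not nearcore code; `read_leb128_u32`, `section_lengths`, and `complexity` are hypothetical helpers):

```rust
// Each wasm section is a 1-byte id, a LEB128 size, then `size` bytes of
// content, after the 4-byte magic "\0asm" and 4-byte version header.

fn read_leb128_u32(bytes: &[u8], pos: &mut usize) -> Option<u32> {
    let mut result: u32 = 0;
    let mut shift = 0;
    loop {
        let byte = *bytes.get(*pos)?;
        *pos += 1;
        result |= ((byte & 0x7f) as u32) << shift;
        if byte & 0x80 == 0 {
            return Some(result);
        }
        shift += 7;
        if shift >= 32 {
            return None; // malformed: too many continuation bytes
        }
    }
}

/// (section id, section length) for every section of the module.
fn section_lengths(wasm: &[u8]) -> Option<Vec<(u8, u32)>> {
    if wasm.len() < 8 || &wasm[0..4] != b"\0asm" {
        return None;
    }
    let mut pos = 8;
    let mut sections = Vec::new();
    while pos < wasm.len() {
        let id = wasm[pos];
        pos += 1;
        let size = read_leb128_u32(wasm, &mut pos)?;
        sections.push((id, size));
        pos += size as usize;
    }
    Some(sections)
}

/// The proposed single "complexity" number: total bytes across all sections.
fn complexity(wasm: &[u8]) -> Option<u64> {
    Some(section_lengths(wasm)?.iter().map(|&(_, len)| len as u64).sum())
}
```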
I like the idea of […].
Though intuitively the sections are not equally weighted, and we need to keep in mind that we would just be covering the problem we have with some fee with an ambiguous name. Regarding […]
I think this is an interesting idea! It also makes it easier for developers to reason about the cost of function call transactions.
Reducing the Aurora code leads to some insights. I left only a no-op […]. This itself implies that logically either […]. At this point, as a short-term solution, I suggest leaving only one parameter in the model and putting:
This serves the following purposes: […]
The drawback is that we increase the base fee by 2x. On the other hand, the upgrade to the new wasmer should compensate for it. BTW, the raw complexity idea doesn't work here - […]
There are issues with the "complexity" metric: […]
I like the idea of a simple cost function with […].
I am not yet convinced that it actually does cut off "weird" contracts -- it cuts off the one weird case we know about, so it's overfit to it. In other words, the reasoning behind "for the worst abuse scenario we know" is not entirely sound. The worst case is relative to the metric; a metric with a cutoff will have a different worst case, which we don't know yet. To feel more confident about metrics which count functions, I would like to understand this: […]
It's also the case that, no matter which metric we choose, there will always be a gap between the worst case for the metric and the average/best case on the network. This puts us in an uncomfortable situation -- there is an incentive to make the metrics more complicated to reduce the worst/best case overhead. But more complex metrics are easier to abuse.

That said, for each specific contract, we can always get the approximate real cost. We can take the contract, run the parameter estimator, and say "loading this contract takes about X gas". @olonho suggested an interesting way out of this systematic problem -- let the validators agree, using whatever mechanism, which contracts are actually "cheap". That is, after a particular code is deployed, validators can verify that it is actually cheap to instantiate. This allows us to keep estimated costs as a safe approximation to prevent abuse by pathological contracts, while making important normal contracts cheap on a case-by-case basis.

Strictly from a contract runtime perspective, I would be much more comfortable saying that a particular contract with <2000 functions is cheap, than saying that any contract with <2000 functions is cheap.
How does this work in practice? Do you mean that validators would need to agree on the exact cost of each contract? I am not sure how we want to do this, and it feels to me like a lot of overhead. In addition, what if the contract is invoked in the same block in which it is deployed? If we use the estimated cost at first and validators could later change the cost, then transactions that invoke the same method on the same contract could end up having very different costs, which is not optimal.
Stabilize limiting the number of functions in a contract to 10_000: #4954

While discussing #4826, we found that both contract compilation and function calls take a lot of time if the number of wasm functions in a contract is huge. Even if we fix the contract size, the number of functions still makes a big difference. To mitigate this issue, we limit the number of functions in a contract. This shouldn't affect current mainnet contracts, because the maximal number of functions is ~5k.

## Test plan

* http://nayduck.near.org/#/run/2223
* existing tests
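For a sense of what such a limit checks, here is a hypothetical sketch (not the actual nearcore validation; `count_functions` and `check_function_limit` are my own names, and `read_leb128_u32` comes from the section-walk sketch earlier in this thread). The function section (id 3) of a wasm binary begins with a LEB128 count of locally defined functions:

```rust
const MAX_FUNCTIONS: u32 = 10_000;

/// Number of locally defined functions, read from the function section (id 3).
fn count_functions(wasm: &[u8]) -> Option<u32> {
    if wasm.len() < 8 || &wasm[0..4] != b"\0asm" {
        return None;
    }
    let mut pos = 8;
    while pos < wasm.len() {
        let id = wasm[pos];
        pos += 1;
        let size = read_leb128_u32(wasm, &mut pos)?;
        if id == 3 {
            // The section body begins with the vector length.
            return read_leb128_u32(wasm, &mut pos);
        }
        pos += size as usize;
    }
    Some(0) // no function section means no locally defined functions
}

/// Reject modules that exceed the limit (or are malformed).
fn check_function_limit(wasm: &[u8]) -> bool {
    matches!(count_functions(wasm), Some(n) if n <= MAX_FUNCTIONS)
}
```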
This issue has been automatically marked as stale because it has not had recent activity in the last 2 months.
The discussion in this issue seems mostly outdated; the last comment is now 1 year old. In the meantime, we have improved estimations in slightly different directions with PR #6362, and we have come up with a "true function call cost" in #7227. This should address the main concerns raised here in this issue. The further-reaching discussion about changing the gas parameter model has also been evaluated in #6992, which is now also done. Currently we don't see a pressing reason anymore to change the set of parameters, but we do want to reshuffle them; see #7741. @Longarithm do you think we will have more things done on this particular issue? Or can we perhaps close this issue by now?
Superseded by #7741. |
The current function call fee estimation must be improved, because it doesn't take into account contract deserialization and thus the complexity of the contract. This already leads to significant undercharging for huge contracts like Aurora.
The intention of the fee is to charge users for all operations not accounted for by the wasm gas counter, so the estimation is made by counting the instructions spent on a no-op function call.
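For reference, a no-op method in such a test contract can be as simple as the following (a minimal sketch of the kind of method the estimator would call, not the estimator's actual test contract):

```rust
// Minimal no-op contract method, exported under the wasm name `noop`.
// Calling it exercises only the per-call overhead (loading, deserialization,
// instantiation), which is exactly what the fee is meant to cover.
#[no_mangle]
pub extern "C" fn noop() {}
```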
Simple approach
We can't charge a constant fee, because the gas spent varies with the contract's complexity. But the algorithm for parsing raw wasm code should be linear, so the fee should be bounded by `C * code.len()`.
The problem here is that when there are too many functions in a contract, the fee becomes very inaccurate; see #4660. We absolutely can't afford to charge 302 Tgas for each Aurora engine call, because 1) it is a ~10x overestimation, and 2) we could fit only 3 such transactions in a block.
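As a sketch, the linear bound amounts to nothing more than this (`c_per_byte` is a placeholder coefficient, not a real protocol value):

```rust
// The simple approach: fee bounded linearly by code size.
fn simple_fee(code: &[u8], c_per_byte: u64) -> u64 {
    c_per_byte * code.len() as u64
}
```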
New approach
Let's make the number of functions a separate factor to make the estimation more accurate.
Take it as `func_assoc` here: https://github.com/near/wasmer/blob/0.18.1/lib/runtime-core/src/module.rs#L48. Experiments show that the effect shows up for all functions, not only exported ones.
So, from the above, the following constants should exist: […]
Then we estimate these coefficients by the least squares method, using the function `least_squares_method_2`: master...2c1e4fe
Current results are: […]
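For illustration, an ordinary least squares fit of a three-coefficient model `cost ≈ base + k_len * code_len + k_fn * n_fns` could look like the sketch below (my own minimal version, not the estimator's `least_squares_method_2`; it solves the normal equations Xᵀ X β = Xᵀ y directly):

```rust
/// Solve the 3x3 system `a * x = b` by Gaussian elimination with partial
/// pivoting. Returns None if the system is (near-)singular.
fn solve3(mut a: [[f64; 3]; 3], mut b: [f64; 3]) -> Option<[f64; 3]> {
    for col in 0..3 {
        // Pick the row with the largest pivot for numerical stability.
        let pivot = (col..3).max_by(|&i, &j| {
            a[i][col].abs().partial_cmp(&a[j][col].abs()).unwrap()
        })?;
        if a[pivot][col].abs() < 1e-12 {
            return None;
        }
        a.swap(col, pivot);
        b.swap(col, pivot);
        for row in col + 1..3 {
            let f = a[row][col] / a[col][col];
            for k in col..3 {
                a[row][k] -= f * a[col][k];
            }
            b[row] -= f * b[col];
        }
    }
    // Back substitution.
    let mut x = [0.0; 3];
    for row in (0..3).rev() {
        let mut s = b[row];
        for k in row + 1..3 {
            s -= a[row][k] * x[k];
        }
        x[row] = s / a[row][row];
    }
    Some(x)
}

/// Fit (base, k_len, k_fn) from samples of (code_len, n_fns, measured_cost).
fn fit(samples: &[(f64, f64, f64)]) -> Option<[f64; 3]> {
    let mut xtx = [[0.0; 3]; 3];
    let mut xty = [0.0; 3];
    for &(len, fns, cost) in samples {
        let row = [1.0, len, fns];
        for i in 0..3 {
            for j in 0..3 {
                xtx[i][j] += row[i] * row[j];
            }
            xty[i] += row[i] * cost;
        }
    }
    solve3(xtx, xty)
}
```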
It gives an upper estimate on a simple test set if we multiply the coefficients by 2. The Aurora fee should be ~4 Tgas after this.
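Applying the fitted coefficients with that safety multiplier would then be (hypothetical function name, shown only to make the formula concrete):

```rust
// fee = 2 * (base + k_len * code_len + k_fn * n_fns), rounded up.
fn function_call_fee(beta: [f64; 3], code_len: u64, n_fns: u64) -> u64 {
    let [base, k_len, k_fn] = beta;
    let raw = base + k_len * code_len as f64 + k_fn * n_fns as f64;
    (2.0 * raw).ceil() as u64 // safety factor of 2 from the discussion above
}
```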
Plan
* Add `args.len()` to the estimation

Other ideas
[…]