Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Compute Costs calculation when reaching gas limit #8908

Closed
Tracked by #8032
aborg-dev opened this issue Apr 11, 2023 · 5 comments
Closed
Tracked by #8032

Fix Compute Costs calculation when reaching gas limit #8908

aborg-dev opened this issue Apr 11, 2023 · 5 comments
Assignees
Labels
A-contract-runtime Area: contract compilation and execution, virtual machines, etc

Comments

@aborg-dev
Copy link
Contributor

We needed to revert Compute Costs calculation during receipts processing because it was inconsistent with a burnt gas estimate. Incorrect accounting happens when as a part of the chunk we have a function call that runs out of gas. See this test for a concrete example: dfc0bce.

This happens because the compute costs are currently based on the gas profiling information which deviates from the burnt gas when function call reaches one of the gas limits (prepaid_gas or max_gas_burnt) and we need to cap burnt gas to honor these limits.

To resolve this we either need to make profiles account for that limit, or we need to calculate compute cost directly in the gas counter.

@aborg-dev aborg-dev added the A-contract-runtime Area: contract compilation and execution, virtual machines, etc label Apr 11, 2023
@aborg-dev aborg-dev self-assigned this Apr 11, 2023
@aborg-dev
Copy link
Contributor Author

If we decide to keep using profiles, another implication of this corner case would be that #8795 would require fractional counters for the calls that overflow the gas limit, which sounds error-prone.
CC @jakmeier

@Longarithm
Copy link
Member

I'm interested in stabilizing it against 1.34 (happens in 2 weeks?), so I can help if needed.

@aborg-dev
Copy link
Contributor Author

I think it will be cleaner to implement compute costs directly in the gas counter. This would mean modifying places that call burn_gas and deduct_gas methods to also update compute costs. Will give it a try.

@jakmeier
Copy link
Contributor

I think it will be cleaner to implement compute costs directly in the gas counter. This would mean modifying places that call burn_gas and deduct_gas methods to also update compute costs. Will give it a try.

Sounds like potentially a lot of work but yeah if it's easy enough let's go for this.

If we decide to keep using profiles, another implication of this corner case would be that #8795 would require fractional counters for the calls that overflow the gas limit, which sounds error-prone.

Thinking about this for a while, I feel like fractional counters would be rather messy. Mostly because the profile data becomes less intuitive and profile-using code would become more complicated.
But if we want to account it in the profile somehow, maybe we could extend profiles to have an extra field to account for the overflowed gas. Either by not updating the last gas counter if we run out of gas and having a field like gas_burnt_while_failing: Compute on the profile which we populate when we run out of gas. Or perhaps even a field gas_not_burnt: Compute that represents negative gas. In the second case we would count the last gas cost in the profile but subtract the overflowed gas when we sum up the compute usage. (Note that we need to measure this field in Compute rather than Gas because this field would not contain the information for which gas cost it was added.)
This new field in the profile seems somewhat natural to me. It doesn't make things more complicated to understand, it just makes it more obvious that there can be corner cases around exceeded gas limits and those cases should be considered anyway.

Finally, we can also consider making a small protocol change (presumably without NEP) and change how we account for overflowed gas. Since the code follows the principle "burn gas before taking action" it would seem fine to not burn the gas of that last step at all. (Refund it instead.) This would generally simplify the gas logic and help us in this instance.

I have no strong opinion on which solution is the best here... I wish gas wasn't as complicated as it is :)

aborg-dev added a commit that referenced this issue Apr 17, 2023
This is a solution to #8908.

Now when we try to prepay the gas for some operation and run into the gas limit, both the gas and compute usage for that operation will be paid partially, where compute cost is calculated as `gas_burnt / gas_cost * compute_cost`.
@aborg-dev
Copy link
Contributor Author

This is fixed with #8914

near-bulldozer bot pushed a commit that referenced this issue Apr 18, 2023
This PR re-introduces changes from #8805 together with #8892 after fixing #8908:

The PR addresses #8265 by introducing aggregation of compute usage across all operations performed during chunk application and limiting this compute usage to 1s.

This should not change the behavior of nodes in the short run because compute costs match gas costs which is validated by the assert, so any discrepancy should be caught on canary nodes.
nikurt pushed a commit that referenced this issue Apr 18, 2023
This is a solution to #8908.

Now when we try to prepay the gas for some operation and run into the gas limit, both the gas and compute usage for that operation will be paid partially, where compute cost is calculated as `gas_burnt / gas_cost * compute_cost`.
nikurt pushed a commit that referenced this issue Apr 18, 2023
This PR re-introduces changes from #8805 together with #8892 after fixing #8908:

The PR addresses #8265 by introducing aggregation of compute usage across all operations performed during chunk application and limiting this compute usage to 1s.

This should not change the behavior of nodes in the short run because compute costs match gas costs which is validated by the assert, so any discrepancy should be caught on canary nodes.
nikurt pushed a commit that referenced this issue Apr 18, 2023
This PR re-introduces changes from #8805 together with #8892 after fixing #8908:

The PR addresses #8265 by introducing aggregation of compute usage across all operations performed during chunk application and limiting this compute usage to 1s.

This should not change the behavior of nodes in the short run because compute costs match gas costs which is validated by the assert, so any discrepancy should be caught on canary nodes.
nikurt pushed a commit that referenced this issue Apr 25, 2023
This PR re-introduces changes from #8805 together with #8892 after fixing #8908:

The PR addresses #8265 by introducing aggregation of compute usage across all operations performed during chunk application and limiting this compute usage to 1s.

This should not change the behavior of nodes in the short run because compute costs match gas costs which is validated by the assert, so any discrepancy should be caught on canary nodes.
nikurt pushed a commit that referenced this issue Apr 25, 2023
This PR re-introduces changes from #8805 together with #8892 after fixing #8908:

The PR addresses #8265 by introducing aggregation of compute usage across all operations performed during chunk application and limiting this compute usage to 1s.

This should not change the behavior of nodes in the short run because compute costs match gas costs which is validated by the assert, so any discrepancy should be caught on canary nodes.
nikurt pushed a commit that referenced this issue Apr 28, 2023
This is a solution to #8908.

Now when we try to prepay the gas for some operation and run into the gas limit, both the gas and compute usage for that operation will be paid partially, where compute cost is calculated as `gas_burnt / gas_cost * compute_cost`.
nikurt pushed a commit that referenced this issue Apr 28, 2023
This PR re-introduces changes from #8805 together with #8892 after fixing #8908:

The PR addresses #8265 by introducing aggregation of compute usage across all operations performed during chunk application and limiting this compute usage to 1s.

This should not change the behavior of nodes in the short run because compute costs match gas costs which is validated by the assert, so any discrepancy should be caught on canary nodes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-contract-runtime Area: contract compilation and execution, virtual machines, etc
Projects
None yet
Development

No branches or pull requests

3 participants