Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forge gas reporting is misleading due to obfuscation of transaction overhead #6578

Closed
1 of 2 tasks
emo-eth opened this issue Dec 12, 2023 · 4 comments
Closed
1 of 2 tasks
Labels
T-bug Type: bug
Milestone

Comments

@emo-eth
Copy link
Contributor

emo-eth commented Dec 12, 2023

Component

Forge

Have you ensured that all of these are up to date?

  • Foundry
  • Foundryup

What version of Foundry are you on?

forge 0.2.0 (0ae39ea 2023-12-11T00:27:32.487222000Z)

What command(s) is the bug in?

forge test

Operating System

macOS (Apple Silicon)

Describe the bug

Forge incorrectly (or, depending on your point of view, inconsistently) reports gas usage whenever the EVM incurs a gas refund.

This is due to the fact that gas refunds are capped at 1/5 of total transaction gas usage and credited at the end of the transaction. However, Forge simply subtracts the 21000+calldata fee overhead from the total gas usage (post-refund) when reporting.

For complicated tests that incur gas refunds (ie, tests that use a lot of gas compared to the tx overhead), reported numbers should be ~accurate. Simpler tests, however, can see very "inaccurate" reported numbers. See the comments in the included examples.

My recommendation: Forge should subtract the tx overhead from the Gas context before calculating the total refund, or else completely rethink how gas is reported. IMHO, using the tx-level gas usage for refund calculations, when Forge tests may contain many logical "transactions" is extremely confusing. Quietly including tx-overhead in refund calculations also makes manual gas accounting much more confusing.

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.17;

import {Test} from "forge-std/Test.sol";

contract RefundTest is Test {
    uint256 i;
    // use low-level calls to avoid solc-injected EXTCODESIZE checks
    uint256 constant pauseSelector = 0xd1a5b36f;
    uint256 constant resumeSelector = 0x2bcd50e0;

    function testNoRefundWithPause() public {
        address VM = VM_ADDRESS;
        assembly {
            mstore(0, pauseSelector)
            pop(staticcall(gas(), VM, 0x1c, 4, 0, 0))
            sstore(i.slot, 5)
            mstore(0, resumeSelector)
            pop(staticcall(gas(), VM, 0x1c, 4, 0, 0))
        }
        burnGas(10000);
        // reported: ~13000 (VM_ADDRESS is not warm, and two calls are made)
    }

    function testBadRefund() public {
        address VM = VM_ADDRESS;
        assembly {
            mstore(0, pauseSelector)
            pop(staticcall(gas(), VM, 0x1c, 4, 0, 0))
            sstore(i.slot, 5)
            mstore(0, resumeSelector)
            pop(staticcall(gas(), VM, 0x1c, 4, 0, 0))
        }
        burnGas(10000);
        assembly {
            sstore(i.slot, 0)
        }
        // business logic: ~13000
        // tx overhead: ~21000
        // total: ~34000
        // gas refund: 19900 > (34000 / 5 ) ? 34000 / 5 : 19900
        // reported: ~6200 = ~13000 - ~34000 / 5
    }

    function testNormalRefund() public {
        address VM = VM_ADDRESS;
        assembly {
            pop(staticcall(0, VM, 0, 0, 0, 0))
            sstore(i.slot, 5)
            pop(staticcall(0, VM, 0, 0, 0, 0))
        }
        burnGas(10000);
        assembly {
            sstore(i.slot, 0)
        }
        // business logic: ~35000
        // tx overhead: ~21000
        // total: ~56000
        // gas refund: 19900 > (56000 / 5 ) ? 56000 / 5 : 19900
        // reported: ~23800 = 35000 - (56000) / 5
    }

    function testNoRefund() public {
        address VM = VM_ADDRESS;
        assembly {
            pop(staticcall(0, VM, 0, 0, 0, 0))
            sstore(i.slot, 5)
            pop(staticcall(0, VM, 0, 0, 0, 0))
        }
        burnGas(10000);
        // reported: ~35000
    }

    function testOverhead() public {}

    function burnGas(int256 amount) internal view {
        unchecked {
            uint256 start = gasleft();
            uint256 x;
            while (gasleft() > start - uint256(amount)) {
                x += 1;
            }
        }
    }
}

foundry.toml:

[profile.default]
src = 'src'
out = 'out'
libs = ['lib']
optimizer = true
optimizer_runs = 9_999_999
# via_ir = true
solc_version = '0.8.22'
bytecode_hash = 'none'
evm_version = 'shanghai'
@emo-eth emo-eth added the T-bug Type: bug label Dec 12, 2023
@gakonst gakonst added this to Foundry Dec 12, 2023
@github-project-automation github-project-automation bot moved this to Todo in Foundry Dec 12, 2023
@Vectorized
Copy link

Suggestion:

vm.startTxGasMetering(optionalAccessList);

uint txGasUsed = vm.stopTxGasMetering();

@plotchy
Copy link
Contributor

plotchy commented Dec 12, 2023

Im not a big user of gas snapshots so don't have much skin in the game, but I do have some thoughts.

For user actions that become transactions, things like cold slots + refunds + accessLists are important to have correct to understand true cost.

  • using forge script style tx publishing is accurate, but afaik not possible to incorporate directly from your test files.

For developer iteration cycles, convenience and compartmentalized execution is important.

  • current gas metering within test functions is useful to iterate on snippets and 'pure' logic and may not make sense as txs. Ideally i wouldn't want any dispatch logic + tx overhead costs to increase the magnitude of my costs when iterating on this.

It seems like using forge script in all cases would be most accurate but applied universally could muddy the testing UX since not all tests make sense to perform as transactions. On the other hand, getting gas metering correct within tests seems extremely difficult when accounting for cold/warm + refunds + access lists.

TBH, it's difficult for me to intuit exactly how forge meters gas when things like cold/warm + setUp() + VM calls + test-fn-dispatch exist. I think the proposition here by @emo-eth has the same intention as the simplifying of dispatch logic + tx overhead costs when acting on pure logic. I like the direction, yet think this would add another layer that brings complexity to maintaining forge and muddies the intuition for what gas number is being output in tests. In reality thats probably a worthy tradeoff for better relative comparison.

@Evalir
Copy link
Member

Evalir commented Apr 9, 2024

Wonder if we can consider this closed with the addition of the --isolate flag?

@zerosnacks
Copy link
Member

Tentatively marking as resolved now that we have the --isolate flag that is enabled by default with --gas-report

Feel free to re-open if there are still actionable items here (cc @Vectorized / @emo-eth)

@zerosnacks zerosnacks closed this as not planned Won't fix, can't repro, duplicate, stale Jul 5, 2024
@jenpaff jenpaff moved this from Todo to Completed in Foundry Sep 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-bug Type: bug
Projects
Archived in project
Development

No branches or pull requests

5 participants