Forge gas reporting is misleading due to obfuscation of transaction overhead #6578

emo-eth · 2023-12-12T07:15:44Z

Component

Forge

Have you ensured that all of these are up to date?

Foundry
Foundryup

What version of Foundry are you on?

forge 0.2.0 (0ae39ea 2023-12-11T00:27:32.487222000Z)

What command(s) is the bug in?

forge test

Operating System

macOS (Apple Silicon)

Describe the bug

Forge incorrectly (or, depending on your point of view, inconsistently) reports gas usage whenever the EVM incurs a gas refund.

This is due to the fact that gas refunds are capped at 1/5 of total transaction gas usage and credited at the end of the transaction. However, Forge simply subtracts the 21000+calldata fee overhead from the total gas usage (post-refund) when reporting.

For complicated tests that incur gas refunds (ie, tests that use a lot of gas compared to the tx overhead), reported numbers should be ~accurate. Simpler tests, however, can see very "inaccurate" reported numbers. See the comments in the included examples.

My recommendation: Forge should subtract the tx overhead from the Gas context before calculating the total refund, or else completely rethink how gas is reported. IMHO, using the tx-level gas usage for refund calculations, when Forge tests may contain many logical "transactions" is extremely confusing. Quietly including tx-overhead in refund calculations also makes manual gas accounting much more confusing.

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.17;

import {Test} from "forge-std/Test.sol";

contract RefundTest is Test {
    uint256 i;
    // use low-level calls to avoid solc-injected EXTCODESIZE checks
    uint256 constant pauseSelector = 0xd1a5b36f;
    uint256 constant resumeSelector = 0x2bcd50e0;

    function testNoRefundWithPause() public {
        address VM = VM_ADDRESS;
        assembly {
            mstore(0, pauseSelector)
            pop(staticcall(gas(), VM, 0x1c, 4, 0, 0))
            sstore(i.slot, 5)
            mstore(0, resumeSelector)
            pop(staticcall(gas(), VM, 0x1c, 4, 0, 0))
        }
        burnGas(10000);
        // reported: ~13000 (VM_ADDRESS is not warm, and two calls are made)
    }

    function testBadRefund() public {
        address VM = VM_ADDRESS;
        assembly {
            mstore(0, pauseSelector)
            pop(staticcall(gas(), VM, 0x1c, 4, 0, 0))
            sstore(i.slot, 5)
            mstore(0, resumeSelector)
            pop(staticcall(gas(), VM, 0x1c, 4, 0, 0))
        }
        burnGas(10000);
        assembly {
            sstore(i.slot, 0)
        }
        // business logic: ~13000
        // tx overhead: ~21000
        // total: ~34000
        // gas refund: 19900 > (34000 / 5 ) ? 34000 / 5 : 19900
        // reported: ~6200 = ~13000 - ~34000 / 5
    }

    function testNormalRefund() public {
        address VM = VM_ADDRESS;
        assembly {
            pop(staticcall(0, VM, 0, 0, 0, 0))
            sstore(i.slot, 5)
            pop(staticcall(0, VM, 0, 0, 0, 0))
        }
        burnGas(10000);
        assembly {
            sstore(i.slot, 0)
        }
        // business logic: ~35000
        // tx overhead: ~21000
        // total: ~56000
        // gas refund: 19900 > (56000 / 5 ) ? 56000 / 5 : 19900
        // reported: ~23800 = 35000 - (56000) / 5
    }

    function testNoRefund() public {
        address VM = VM_ADDRESS;
        assembly {
            pop(staticcall(0, VM, 0, 0, 0, 0))
            sstore(i.slot, 5)
            pop(staticcall(0, VM, 0, 0, 0, 0))
        }
        burnGas(10000);
        // reported: ~35000
    }

    function testOverhead() public {}

    function burnGas(int256 amount) internal view {
        unchecked {
            uint256 start = gasleft();
            uint256 x;
            while (gasleft() > start - uint256(amount)) {
                x += 1;
            }
        }
    }
}

foundry.toml:

[profile.default]
src = 'src'
out = 'out'
libs = ['lib']
optimizer = true
optimizer_runs = 9_999_999
# via_ir = true
solc_version = '0.8.22'
bytecode_hash = 'none'
evm_version = 'shanghai'

The text was updated successfully, but these errors were encountered:

Vectorized · 2023-12-12T07:27:24Z

Suggestion:

vm.startTxGasMetering(optionalAccessList);

uint txGasUsed = vm.stopTxGasMetering();

plotchy · 2023-12-12T21:37:40Z

Im not a big user of gas snapshots so don't have much skin in the game, but I do have some thoughts.

For user actions that become transactions, things like cold slots + refunds + accessLists are important to have correct to understand true cost.

using forge script style tx publishing is accurate, but afaik not possible to incorporate directly from your test files.

For developer iteration cycles, convenience and compartmentalized execution is important.

current gas metering within test functions is useful to iterate on snippets and 'pure' logic and may not make sense as txs. Ideally i wouldn't want any dispatch logic + tx overhead costs to increase the magnitude of my costs when iterating on this.

It seems like using forge script in all cases would be most accurate but applied universally could muddy the testing UX since not all tests make sense to perform as transactions. On the other hand, getting gas metering correct within tests seems extremely difficult when accounting for cold/warm + refunds + access lists.

TBH, it's difficult for me to intuit exactly how forge meters gas when things like cold/warm + setUp() + VM calls + test-fn-dispatch exist. I think the proposition here by @emo-eth has the same intention as the simplifying of dispatch logic + tx overhead costs when acting on pure logic. I like the direction, yet think this would add another layer that brings complexity to maintaining forge and muddies the intuition for what gas number is being output in tests. In reality thats probably a worthy tradeoff for better relative comparison.

Evalir · 2024-04-09T13:47:39Z

Wonder if we can consider this closed with the addition of the --isolate flag?

zerosnacks · 2024-07-05T13:09:26Z

Tentatively marking as resolved now that we have the --isolate flag that is enabled by default with --gas-report

Feel free to re-open if there are still actionable items here (cc @Vectorized / @emo-eth)

emo-eth added the T-bug Type: bug label Dec 12, 2023

gakonst added this to Foundry Dec 12, 2023

github-project-automation bot moved this to Todo in Foundry Dec 12, 2023

emo-eth mentioned this issue Dec 12, 2023

feat(cheatcodes): add vm.meterTxGas cheatcode #6590

Closed

Philogy mentioned this issue Jan 26, 2024

Add an "Authentic" execution mode for tests & scripts #6910

Closed

Evalir added this to the v1.0.0 milestone Feb 25, 2024

zerosnacks closed this as not planned Won't fix, can't repro, duplicate, stale Jul 5, 2024

zerosnacks closed this as completed Jul 5, 2024

jenpaff moved this from Todo to Completed in Foundry Sep 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Forge gas reporting is misleading due to obfuscation of transaction overhead #6578

Forge gas reporting is misleading due to obfuscation of transaction overhead #6578

emo-eth commented Dec 12, 2023

Vectorized commented Dec 12, 2023

plotchy commented Dec 12, 2023

Evalir commented Apr 9, 2024

zerosnacks commented Jul 5, 2024

Forge gas reporting is misleading due to obfuscation of transaction overhead #6578

Forge gas reporting is misleading due to obfuscation of transaction overhead #6578

Comments

emo-eth commented Dec 12, 2023

Component

Have you ensured that all of these are up to date?

What version of Foundry are you on?

What command(s) is the bug in?

Operating System

Describe the bug

Vectorized commented Dec 12, 2023

plotchy commented Dec 12, 2023

Evalir commented Apr 9, 2024

zerosnacks commented Jul 5, 2024