Change the constant optimizer to make use of `PUSH0` #14117

axic · 2023-04-13T00:25:33Z

Part of #14073.

This change only affects large copies using codecopy (which are supposedly rare).

(It will be fun changing this code for EOF, i.e. use DATACOPY or replace it entirely with a single DATALOADN.)

axic · 2023-04-13T00:26:28Z

libevmasm/ConstantOptimiser.cpp

+		AssemblyItems static copyRoutine{
+			// back up memory
+			// mload(0)
+			Instruction::PUSH0,


Could use u256(0) in these places, but the explicit instruction felt better for readability and ensuring the actualCopyRoutine[3] position is fixed.

We probably should use u256(0) in these places. The constructor of AssemblyItem should technically really assert against the instruction being any push or forward to the Push constructor.
Currently, the convention is "never use push as instruction in AssemblyItem", which was kind of hard to violate since all pushes had push data - but now we have AssemblyItem(u256(0)) != AssemblyItem(Instruction::PUSH0) - and should only ever use the former for consistency and for the other optimization steps to work properly.
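
The distinction above can be made concrete with a small model. This is only a sketch: `Item`, `ItemType` and the factory functions are hypothetical, heavily simplified stand-ins for `AssemblyItem`, not the libevmasm API. It illustrates why a value push of zero and a raw `PUSH0` instruction compare unequal, and what an assertion against raw push opcodes could look like.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-ins for the libevmasm types (assumption).
enum class ItemType { Operation, Push };
enum class Instruction : std::uint8_t { MLOAD = 0x51, PUSH0 = 0x5f, PUSH1 = 0x60 };

struct Item
{
	ItemType type;
	std::uint64_t data; // opcode for Operation, pushed value for Push

	// Value push: the single canonical way to express "push 0".
	static Item push(std::uint64_t _value) { return Item{ItemType::Push, _value}; }

	// Raw instruction: rejects push opcodes so that only one representation exists.
	static Item operation(Instruction _instruction)
	{
		assert(!isPush(_instruction) && "use Item::push(value) instead");
		return Item{ItemType::Operation, static_cast<std::uint64_t>(_instruction)};
	}

	// PUSH0 (0x5f) through PUSH32 (0x7f).
	static bool isPush(Instruction _instruction)
	{
		auto const opcode = static_cast<std::uint8_t>(_instruction);
		return opcode >= 0x5f && opcode <= 0x7f;
	}

	bool operator==(Item const& _other) const { return type == _other.type && data == _other.data; }
};
```

In this model `Item::push(0)` and a hand-built `Item{ItemType::Operation, 0x5f}` are different items, so any optimizer step that matches on value pushes would miss the latter; `Item::operation(Instruction::PUSH0)` trips the assertion instead of silently creating it.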

Fair enough, changed it.

After I changed to u256(0), a few test cases fail because codecopy does not seem to be selected. That suggests the cost is higher than expected, i.e. u256(0) generates larger code than push0. Need to investigate.

@ekpyron I won't have time to debug this for the next 2 weeks. Feel free to take it over.

Alright - even though it'd be nicer to have it, we don't absolutely need to change this in the immediate release with Shanghai support anyways, so we'll just postpone it for now (we aim to release tomorrow or Wednesday).

If I understood this right, the codecopy cost is higher than expected, and so it is not selected, because PUSH0 is not taken into account when computing the byte size.
In AssemblyItem::bytesRequired, the return value for any PUSHN is at least 2, but for PUSH0 it should always be 1:

solidity/libevmasm/AssemblyItem.cpp

Lines 126 to 127 in 374a6fd

case Push:

return 1 + max<size_t>(1, numberEncodingSize(data()));
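
A minimal sketch of a size computation that accounts for PUSH0 might look as follows. It assumes 64-bit values instead of `u256`, and `pushBytesRequired` and its `_hasPush0` flag are hypothetical names chosen for illustration; the actual fix in the PR may be structured differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Number of bytes needed to encode _value; 0 for the value zero.
// Simplified to 64-bit values (assumption); the real code works on u256.
std::size_t numberEncodingSize(std::uint64_t _value)
{
	std::size_t size = 0;
	for (; _value != 0; _value >>= 8)
		++size;
	return size;
}

// Hypothetical helper: byte size of a push of _value.
std::size_t pushBytesRequired(std::uint64_t _value, bool _hasPush0)
{
	if (_value == 0 && _hasPush0)
		return 1; // PUSH0 is a single opcode without immediate data.
	std::size_t const dataSize = numberEncodingSize(_value);
	assert(dataSize <= 32); // PUSH32 is the largest push.
	// PUSH1..PUSH32: one opcode byte plus at least one data byte.
	return 1 + std::max<std::size_t>(1, dataSize);
}
```

With this helper a push of zero is 2 bytes without PUSH0 and 1 byte with it, while non-zero values are unaffected by the flag.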

libevmasm/ConstantOptimiser.cpp

axic · 2023-04-17T15:51:19Z

libevmasm/ConstantOptimiser.cpp

+	if (m_params.evmVersion.hasPush0())
+	{
+		// This costs ~29 gas.
+		AssemblyItems static copyRoutine{


When I had a bug in this version some tests failed, so the CodeCopyMethod does get triggered, but someone should double-check this before merging.

ekpyron

The fix does look like the right direction to go here. AssemblyItem::bytesRequired should generally be fixed to account for PUSH0 and we should make sure it gets a correct (or reasonable) EVM version passed in at all times.

libevmasm/AssemblyItem.cpp

libevmasm/ConstantOptimiser.cpp

libevmasm/AssemblyItem.h

cameel · 2023-06-19T15:42:48Z

libevmasm/ConstantOptimiser.cpp

+	if (m_params.evmVersion.hasPush0())
+		actualCopyRoutine[3] = _assembly.newData(data);
+	else
+		actualCopyRoutine[4] = _assembly.newData(data);


It looks like it would be relatively easy for someone to change copyRoutine() and either forget the number or use a wrong one. I'd add an assert that the item we're replacing is really PushData just to be safe.

Actually, it looks like it would be even better to pass the number via an argument (with the u256(1) << 16 placeholder being the default) and avoid this brittle construct altogether. I wonder if there's any good reason it was done this way.

I added a parameter to copyRoutine. Since AssemblyItem is forward declared in ConstantOptimiser.h, we need to use a pointer.
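
The idea of passing the data item in, rather than patching a fixed index afterwards, can be sketched like this. `Item` is a hypothetical stand-in for `AssemblyItem`, and the item sequence is an abbreviated illustration, not the routine from the PR; the pointer parameter with a placeholder default follows the description above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for libevmasm's AssemblyItem (assumption).
struct Item
{
	std::string kind;
	std::uint64_t value;
};

// Builds an illustrative copy routine. The data item is injected by the caller;
// when it is omitted, the placeholder used for cost estimation is inserted.
std::vector<Item> copyRoutine(Item const* _pushData = nullptr)
{
	Item const placeholder{"PushData", std::uint64_t(1) << 16};
	Item const pushData = _pushData ? *_pushData : placeholder;
	assert(pushData.kind == "PushData" && "expected a PushData item");
	return {
		// back up memory: mload(0)
		Item{"Push", 0}, Item{"MLOAD", 0},
		// codecopy(0, <data offset>, 32)
		Item{"Push", 32}, pushData, Item{"Push", 0}, Item{"CODECOPY", 0},
		// load the constant: mload(0)
		Item{"Push", 0}, Item{"MLOAD", 0},
		// restore original memory: mstore(0, <backup>)
		Item{"SWAP1", 0}, Item{"Push", 0}, Item{"MSTORE", 0},
	};
}
```

Callers no longer need to know the position of the data item, so reordering the routine cannot silently overwrite the wrong element.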

libevmasm/ConstantOptimiser.cpp

ekpyron · 2023-09-18T14:10:50Z

test/cmdlineTests/optimize_full_storage_write/output

+      0x00
+      dup1
+      revert


So one thing is that this is no longer de-duplicated with tag_2 below for some reason (may be that the 0x00 is a PUSH0 in one case and a PUSH1 0x00 in the other?). Still a bit strange how the changes in this PR are causing this.

But another thing is: I'd have assumed that the libevmasm/CommonSubexpressionEliminator should actually be clever enough to transform 0x00 dup1 revert to 0x00 0x00 revert, if it gets the correct push0 pricing.

So, here's what happens in this case. The inliner sees that the cost of inlining is less than that of the pushtag 2, which was not the case before push0, so the code of tag_2 gets inlined. After that, the peephole optimizer finds that it can replace the code with push0 dup1 revert, because its DoublePush method is not prepared to handle push0. Another problem is that the peephole optimizer used to be able to remove a pushtag 2, jump located right before tag 2, but it cannot do that now because of the inlining.
Finally, the CSE is not able to transform push0 dup1 to push0 push0. I guess it needs a rule for that in SimplificationRules.
I'm not sure what to do about the inliner situation.
I will try to make the CSE and peephole optimizer consider push0, although for the latter that depends on getting the EVM version in a static function of the optimizer (DoublePush::applySimple).

Ah, right, that explains things - I hadn't considered that inlining would kick in since assuming a tag size of two bytes PUSH0 PUSH0 REVERT actually now is already the same size as PUSH<tag> JUMP! I mean, letting the inlining happen is just fine then...

The peephole optimizer should definitely consider push0 - and it also looks like some deduplication still doesn't happen correctly (but in cases in which it also didn't happen before), so we don't necessarily need to fix it here directly.
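
As an illustration of what "considering push0" could mean for a DoublePush-style rule, here is a minimal sketch. The `Item` and `Kind` types and the `doublePush` function are hypothetical and much simpler than the real PeepholeOptimiser. The rule relies on PUSH0 and DUP1 both being one byte while PUSH0 costs 2 gas and DUP1 costs 3 gas, so for zero a second PUSH0 is preferable when the EVM version supports it.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified item model (assumption).
enum class Kind { Push, Dup1, Other };

struct Item
{
	Kind kind;
	std::uint64_t value; // only meaningful for Kind::Push
};

// Rewrites adjacent pairs in place:
//  - without PUSH0 (or for non-zero values): PUSH x PUSH x -> PUSH x DUP1
//  - with PUSH0 and x == 0:                  PUSH0 DUP1    -> PUSH0 PUSH0
void doublePush(std::vector<Item>& _items, bool _hasPush0)
{
	for (std::size_t i = 0; i + 1 < _items.size(); ++i)
	{
		Item const& first = _items[i];
		Item& second = _items[i + 1];
		if (first.kind != Kind::Push)
			continue;
		bool const zeroWithPush0 = _hasPush0 && first.value == 0;
		if (zeroWithPush0 && second.kind == Kind::Dup1)
			second = first;
		else if (!zeroWithPush0 && second.kind == Kind::Push && second.value == first.value)
			second = Item{Kind::Dup1, 0};
	}
}
```

The flag has to come from the selected EVM version, which is the difficulty mentioned above for a static function such as DoublePush::applySimple.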

matheusaaguiar · 2024-06-27T20:04:29Z

I investigated the last 6 problematic cases of the hardhat tests, and all of them are related to the introduction of the DeduplicateNextTagSizeX methods in 9b1792f.
These methods open up further optimizations, and the resulting code differs enough to make the hardhat stack trace tests fail, since those tests rely on certain patterns being produced.
The first one that appears in the logs, for example, checks function test with b=false in the following Solidity code:

pragma solidity ^0.8.0;

contract C {

  modifier m2(bool b)  {
    require(b);
    _;
  }

  function test(bool b) m1(b) m2(b) public {
    revert();
  }

  modifier m1(bool b)  {
    _;
  }

}

The stack trace expectation shows that the test expects the revert error to be raised from the require(b); at line 6:

{
  "transactions": [
    {
      "file": "c.sol",
      "contract": "C"
    },
    {
      "to": 0,
      "params": [false],
      "function": "test",
      "stackTrace": [
        {
          "type": "CALLSTACK_ENTRY",
          "sourceReference": {
            "contract": "C",
            "file": "c.sol",
            "function": "test",
            "line": 10
          }
        },
        {
          "type": "REVERT_ERROR",
          "sourceReference": {
            "contract": "C",
            "file": "c.sol",
            "function": "m2",
            "line": 6
          }
        }
      ]
    }
  ]
}

The code generated by the compiler before this PR contains the part related to that:

    tag_7:
        /* "c.sol":119:120  b */
      dup1
        /* "c.sol":125:126  b */
      dup2
        /* "c.sol":76:77  b */
      dup1
        /* "c.sol":68:78  require(b) */
      tag_10
      jumpi
      0x00
      dup1
      revert
    tag_10:
        /* "c.sol":141:149  revert() */
      0x00
      dup1
      revert

With this PR, however, the code is optimized further and the asm related to the require is removed, in the following steps
(already assuming that every 0x00 dup1 revert has been transformed to 0x00 0x00 revert, because push0 is cheaper than dup1):

1. Remove the 0x00 0x00 revert before tag_10, since the code at tag_10 does exactly the same (PeepholeOptimiser::DeduplicateNextTagSize3).
2. Remove tag_10 jumpi, because it is now right before tag_10. As a consequence, the dup1 is also removed, since it is the condition of the jumpi (PeepholeOptimiser::JumpToNext).
3. Remove tag_10 (only the tag itself, not the instructions following it), because its only reference was removed in step 2 (JumpdestRemover::optimise).
4. Remove the remaining dup1 dup2 before 0x00 0x00 revert (PeepholeOptimiser::OpReturnRevert).

The final result contains only the code corresponding to the revert in line 11:

    tag_7:
        /* "c.sol":141:149  revert() */
      revert(0x00, 0x00)

The require was completely removed from the asm, which makes sense from an optimization point of view, and that causes a mismatch with the expected stack trace.
The other failing tests are similar cases where a specific snippet of code is optimized out while that was not the case before.
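
The deduplication step described above (a terminating snippet that is immediately followed by a tag and an identical snippet) can be sketched on a list of mnemonics. This is a simplified illustration under stated assumptions: items are plain strings, tag definitions end in a colon, and `deduplicateNextTag`, `isTagDefinition` and `terminates` are hypothetical helpers, not the PeepholeOptimiser code.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

using Items = std::vector<std::string>;

// A tag definition such as "tag_10:" (a push of the tag is written "tag_10").
bool isTagDefinition(std::string const& _item)
{
	return _item.rfind("tag_", 0) == 0 && _item.back() == ':';
}

// Instructions after which control flow never continues to the next item.
bool terminates(std::string const& _item)
{
	return _item == "revert" || _item == "return" || _item == "stop" || _item == "invalid";
}

// Removes the first block of _blockSize items that ends in a terminating
// instruction and is directly followed by a tag definition and an identical
// block. Returns true if something was removed.
bool deduplicateNextTag(Items& _items, std::size_t _blockSize)
{
	if (_blockSize == 0)
		return false;
	for (std::size_t i = 0; i + 2 * _blockSize + 1 <= _items.size(); ++i)
	{
		std::size_t const tagPos = i + _blockSize;
		if (!isTagDefinition(_items[tagPos]) || !terminates(_items[tagPos - 1]))
			continue;
		auto const first = _items.begin() + static_cast<std::ptrdiff_t>(i);
		auto const tag = _items.begin() + static_cast<std::ptrdiff_t>(tagPos);
		if (std::equal(first, tag, tag + 1))
		{
			_items.erase(first, tag);
			return true;
		}
	}
	return false;
}
```

Applied with a block size of 3 to the items of tag_7 from the listing (with 0x00 0x00 revert in place of 0x00 dup1 revert), this removes the first revert sequence and leaves tag_10 jumpi directly in front of the tag_10 definition, which is what enables the subsequent steps.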

.circleci/config.yml

ekpyron

Consider this comment a soft-approval. The logic looks sound now, and all testing issues seem to be resolved. The only things missing are changelog entries (for the constant optimiser taking into account push0 and for the peephole optimizer optimizing identical code snippets that terminate if they occur one after the other), and potentially a bit of commit cleanup.

matheusaaguiar · 2024-07-11T16:31:34Z

Gas cost benchmarks

`ir-no-optimize`

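| project | bytecode_size | deployment_gas | method_gas |
|:---|---:|---:|---:|
| brink | `+0%` | | |
| colony | `+0.04% ❌` | | |
| elementfi | `+0%` | | |
| ens | `+0%` | | |
| euler | | | |
| gnosis | | | |
| gp2 | `-0.02% ✅` | | |
| pool-together | `0%` | | |
| uniswap | `0%` | | |
| yield_liquidator | `+0%` | `+0%` | `0%` |
| zeppelin | | | |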

`ir-optimize-evm+yul`

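| project | bytecode_size | deployment_gas | method_gas |
|:---|---:|---:|---:|
| brink | `0%` | | |
| colony | `+0.02% ❌` | | |
| elementfi | `-0%` | | |
| ens | `-0%` | `0%` | `-0%` |
| euler | `-0%` | | |
| gnosis | | | |
| gp2 | `-0.03% ✅` | | |
| pool-together | `+0%` | | |
| uniswap | `-0%` | | |
| yield_liquidator | `+0.27% ❌` | `+0.3% ❌` | `-0.01% ✅` |
| zeppelin | `-0.01% ✅` | `-0.01% ✅` | `+0.13% ❌` |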

`ir-optimize-evm-only`

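| project | bytecode_size | deployment_gas | method_gas |
|:---|---:|---:|---:|
| brink | `+0.01% ❌` | | |
| colony | `+0.01% ❌` | | |
| elementfi | `-0.01% ✅` | | |
| ens | `-0%` | `+0.01% ❌` | `0%` |
| euler | | | |
| gnosis | | | |
| gp2 | `+0.01% ❌` | | |
| pool-together | `+0.01% ❌` | | |
| uniswap | `-0.01% ✅` | | |
| yield_liquidator | `+0%` | `+0%` | `0%` |
| zeppelin | `-0.01% ✅` | | |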

`legacy-no-optimize`

| project | bytecode_size | deployment_gas | method_gas |
|:---|---:|---:|---:|
| brink | `+0.03% ❌` | | |
| colony | `+0.04% ❌` | | |
| elementfi | `+0.1% ❌` | | |
| ens | `+0.04% ❌` | | |
| euler | `+0.04% ❌` | | |
| gnosis | `+0.05% ❌` | | |
| gp2 | `-0%` | | |
| pool-together | `+0.06% ❌` | | |
| uniswap | `+0.02% ❌` | | |
| yield_liquidator | `+0.04% ❌` | `+0.04% ❌` | `-0.01% ✅` |
| zeppelin | `+0.06% ❌` | `+3.36% ❌` | `+0.01% ❌` |

`legacy-optimize-evm+yul`

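| project | bytecode_size | deployment_gas | method_gas |
|:---|---:|---:|---:|
| brink | `0%` | | |
| colony | `+0.02% ❌` | | |
| elementfi | `+0.01% ❌` | | |
| ens | `+0.02% ❌` | `+0%` | `-0%` |
| euler | `+0.02% ❌` | | |
| gnosis | `0%` | | |
| gp2 | `-0.06% ✅` | | |
| pool-together | `+0.01% ❌` | | |
| uniswap | `-0%` | | |
| yield_liquidator | `0%` | `+0%` | `-0%` |
| zeppelin | `-0.02% ✅` | `-0%` | `+0.07% ❌` |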

`legacy-optimize-evm-only`

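| project | bytecode_size | deployment_gas | method_gas |
|:---|---:|---:|---:|
| brink | `0%` | | |
| colony | `+0.01% ❌` | | |
| elementfi | `+0.01% ❌` | | |
| ens | `+0.02% ❌` | `-0%` | `-0%` |
| euler | `+0.04% ❌` | | |
| gnosis | `0%` | | |
| gp2 | `-0.05% ✅` | | |
| pool-together | `-0%` | | |
| uniswap | `-0%` | | |
| yield_liquidator | `0%` | `-0%` | `-0%` |
| zeppelin | `-0.01% ✅` | `-0%` | `-0%` |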

!V = version mismatch
!B = no value in the "before" version
!A = no value in the "after" version
!T = one or both values were not numeric and could not be compared
-0 = very small negative value rounded to zero
+0 = very small positive value rounded to zero

github-actions · 2024-07-26T12:04:53Z

This pull request is stale because it has been open for 14 days with no activity.
It will be closed in 7 days unless the stale label is removed.

axic force-pushed the push0-optimisations branch from b62a72e to 2c1d785 Compare April 13, 2023 00:25

axic commented Apr 13, 2023

NunoFilipeSantos assigned axic Apr 13, 2023

axic requested review from cameel and ekpyron April 15, 2023 11:20

axic commented Apr 17, 2023

axic force-pushed the push0-optimisations branch 2 times, most recently from 25ac63f to 6c756d2 Compare April 17, 2023 16:44

axic mentioned this pull request Apr 17, 2023

Shanghai Support #14073

Closed

12 tasks

NunoFilipeSantos added this to the 0.8.20 milestone Apr 26, 2023

ekpyron modified the milestones: 0.8.20, 0.8.21 May 8, 2023

NunoFilipeSantos requested a review from matheusaaguiar May 15, 2023 13:13

matheusaaguiar force-pushed the push0-optimisations branch from bacab83 to ca1394e Compare June 19, 2023 04:11

ekpyron reviewed Jun 19, 2023

cameel reviewed Jun 19, 2023

matheusaaguiar self-assigned this Jun 22, 2023

cameel added the optimizer label Jun 23, 2023

matheusaaguiar force-pushed the push0-optimisations branch from 443f226 to 43ddf2d Compare July 5, 2023 14:54

r0qs force-pushed the push0-optimisations branch from 3d2fb1f to d6792cf Compare July 17, 2023 08:47

NunoFilipeSantos modified the milestones: 0.8.21, 0.8.22 Jul 17, 2023

matheusaaguiar force-pushed the push0-optimisations branch 2 times, most recently from 9df26c1 to 0831f1b Compare August 24, 2023 18:34

ekpyron reviewed Sep 18, 2023

ekpyron modified the milestones: 0.8.22, 0.8.23 Oct 16, 2023

ekpyron removed this from the 0.8.24 milestone Dec 20, 2023

ekpyron mentioned this pull request Dec 20, 2023

Support for MCOPY #14741

Closed

matheusaaguiar force-pushed the push0-optimisations branch from c2d8b49 to 7178bce Compare June 26, 2024 12:49

matheusaaguiar removed the stale label Jun 26, 2024

matheusaaguiar force-pushed the push0-optimisations branch from 0de5d5e to 8cdc70f Compare June 27, 2024 17:32

matheusaaguiar mentioned this pull request Jun 27, 2024

Stack trace tests expectations failing upon introducing new optimization in the Solidity compiler NomicFoundation/hardhat#5443

Closed

matheusaaguiar force-pushed the push0-optimisations branch 2 times, most recently from 1e39984 to c5e2877 Compare June 30, 2024 14:35

r0qs reviewed Jul 1, 2024

ekpyron reviewed Jul 1, 2024

ekpyron mentioned this pull request Jul 1, 2024

BlockHasher: Do not hash literal kind #15231

Merged

matheusaaguiar force-pushed the push0-optimisations branch 2 times, most recently from 7924c39 to d7fb2a3 Compare July 3, 2024 23:24

ekpyron previously approved these changes Jul 4, 2024

github-actions bot added the stale label Jul 26, 2024

matheusaaguiar removed the stale label Jul 26, 2024

matheusaaguiar dismissed ekpyron’s stale review via 0aa28ff August 1, 2024 18:10

matheusaaguiar force-pushed the push0-optimisations branch from d7fb2a3 to 0aa28ff Compare August 1, 2024 18:10

matheusaaguiar previously approved these changes Aug 1, 2024

matheusaaguiar dismissed their stale review via 7431bcb August 1, 2024 19:22

matheusaaguiar force-pushed the push0-optimisations branch from 0aa28ff to 7431bcb Compare August 1, 2024 19:22

axic and others added 4 commits August 2, 2024 13:41

Change the constant optimizer to make use of PUSH0

860f0d4

Add new Peephole Optimiser method (deduplicateNextTag)

6e7d94e

Update tests

7e67811

workaround to skip failing hardhat tests

a500a63

matheusaaguiar force-pushed the push0-optimisations branch from 7431bcb to a500a63 Compare August 2, 2024 16:43

ekpyron approved these changes Aug 5, 2024

ekpyron merged commit 5dbaa13 into develop Aug 5, 2024
72 checks passed

ekpyron deleted the push0-optimisations branch August 5, 2024 15:22
