Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Keep the vyper compiler version signature in the runtime code #3549

Open
mds1 opened this issue Aug 3, 2023 · 8 comments
Open

Keep the vyper compiler version signature in the runtime code #3549

mds1 opened this issue Aug 3, 2023 · 8 comments

Comments

@mds1
Copy link

mds1 commented Aug 3, 2023

Simple Summary

Per @banteg on twitter I learned that vyper 0.3.10 plans to move the version metdata from the initcode to creation code (ref #3471)

I suggest keeping this in the initcode.

Motivation

  1. Finding initcode is much harder than finding runtime code. Given an address, it's trivial to query for it's runteime code using eth_getCode and parse the vyper version out of it. It's much harder to binary search across all blocks to find when a contract was deployed, then trace each transaction in that block to find creation code (You could of course rely on etherscan or other data providers, to provide this, but IMO minimizing reliance on other services and ensuring data is easy to find is a better approach)

  2. My understanding is that the rationale was that moving it to the creation code was either (1) cheaper, and/or (2) more robust since there's no initcode size limit.

    1. On (1), the cost difference here is negligible. In both cases the total bytecode being deployed is the same, and the only difference is a few less bytes being returned from the constructor. Compared to the cost of a contract deployment, this will be an unnoticeable savings
    2. On (2), EIP-3680 is now live which limits initcode size to 2x the runtime size, so if bytecode size was a limit moving it to creation code does help since 2x is a lot bigger. I'd argue the few bytes savings is likely to not be the deciding factor in many cases (there are of course exceptions)

My opinion is that the gas savings from (2) don't outweigh the benefits of having the version in the runtime code, as easy access to contract version numbers can be critical and time-sensitive during exploits. Additionally often times you might fetch code from an RPC and use tools like heimdall-rs or whatsabi to analyze it (there are many tools that operate on runtime code), so having more info in runtime code avoids introducing reliace on e.g. etherscan

From talking with @big-tech-sux, some arguments on the other side are that currently the version is at the end of the runtime code, but followed by immutables. So you still need the source, immutables layout, or a regex to parse the version. One idea is to move the version to the end after immutables, but it makes the implementation a bit more complicated and complexity compounds

Ultimately I just wanted to raise the other side of the decision here since I didn't see it much discussed in the PR, and IME often times you have runtime code from a node + local tools but don't want to introduce indexers to provide e.g. creation code.

Specification

I think just revert #3471? Basically, ensure the CBOR encoded vyper version (and other other metadata) is present in deployed code, not creation code.

Backwards Compatibility

My understanding is that 0.3.9 has the behavior being requested here, so there are no backwards compatibility issues unless 0.3.10 is released with the behavior from #3471

Dependencies

N/A

References

Twitter convos (nothing new here, all the same info as above, just linking the threads for completeness)

Copyright

Copyright and related rights waived via CC0

@banteg
Copy link

banteg commented Aug 4, 2023

i can appreciate that vyper optimizes for runtime over off-chain tooling, so just sharing some perspective.

i don't think the version being at a random position before immutables vs at the end poses a big problem, we can just regex.

but it not being in runtime code is more serious because you need to spent significantly more effort to find it. without using an explorer like etherscan to just look it up, you need to perform several steps:

  • binary search a block where the contract code appeared
  • then trace all txs from that block
  • finally find the initcode matching on the created address.

this would require an archive node with tracing capabilities which is not practical or even available on some networks. this would probably turn people to centralized services or indexers which is not optimal for decentralization.

quickly finding all contracts of a certain version could be crucial in certain situations like one that has just unfolded.

@charles-cooper
Copy link
Member

Another issue which might be relevant (although it can be considered orthogonally as an entirely separate issue) is whether it could be worth it to include length of the data + immutables sections in the runtime and/or initcode. This would make disassembling contracts easier, and for the issue noted above it would make it not matter whether the signature or immutable section comes first.

@charles-cooper
Copy link
Member

by the way, (per @banteg) otterscan ots_getContractCreator (which erigon supports and reth is adding support for) seems to be a fairly easy way to get contract creation trace. so it seems to me it increases overhead for offchain indexers and analysis, but not by a whole lot.

@charles-cooper
Copy link
Member

i was able to modify @pcaversaccio 's dune query (https://dune.com/pcaversaccio/vyper-deployment-statistics) to output all vyper contracts >= 0.3.4 based on creation trace, and the query took 60 seconds. so yes, this is a bit of a centralization risk, but also keeping track of all relevant chains is by its very nature somewhat of a centralized task, so i don't see it as too big of an issue.

@kuzdogan
Copy link

kuzdogan commented Dec 11, 2024

@charles-cooper to my knowledge Dune's "creation traces" does not have the creation code (initcode) but the runtime code (docs). So these results will not have Vyper contracts 0.3.10 and onwards

@pcaversaccio
Copy link
Collaborator

@charles-cooper to my knowledge Dune's "creation traces" does not have the creation code (initcode) but the runtime code (docs). So these results will not have Vyper contracts 0.3.10 and onwards

You're right - I pinged Dune with the feature request. Let's see.

@kuzdogan
Copy link

Now I have a better grasp of things, I'd argue another reason to keep the cbor metadata in runtime bytecode is because the information there is actually referencing the runtime bytecode. But it is only stored in the creation (init) bytecode.

E.g. this CBOR metadata (playground)

0x8418a7810a1860a1657679706572830004000014

is decoded as

[167, [10], 96, {"vyper": [0, 4, 0]}]

From what I can see 167 gives the length of the runtime bytecode. Besides 96, which is the length of the immutables, is also only valid in runtime bytecode. Initcode does not have any immutables in the code, only runtime bytecode has them.

@charles-cooper
Copy link
Member

Now I have a better grasp of things, I'd argue another reason to keep the cbor metadata in runtime bytecode is because the information there is actually referencing the runtime bytecode. But it is only stored in the creation (init) bytecode.

E.g. this CBOR metadata (playground)

0x8418a7810a1860a1657679706572830004000014

is decoded as

[167, [10], 96, {"vyper": [0, 4, 0]}]

From what I can see 167 gives the length of the runtime bytecode. Besides 96, which is the length of the immutables, is also only valid in runtime bytecode. Initcode does not have any immutables in the code, only runtime bytecode has them.

yes, that's on purpose. it saves on runtime bytecode (which is expensive and subject to a hard limit), and also makes the data fully unreachable from the runtime code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants