Specify Client Versions on Engine API #517

Merged
merged 11 commits into from
Feb 13, 2024

Conversation

ethDreamer
Contributor

@ethDreamer ethDreamer commented Jan 26, 2024

By analyzing the structure of beacon blocks on the network, we are able to obtain fairly accurate data on consensus layer client diversity. Unfortunately, because the overwhelming majority of validators use mev-boost, their execution clients do not leave any fingerprint behind in block proposals. Thus we are forced to rely on limited self-reporting data from staking pools. Many pools do not participate, and we often have outdated statistics for the pools that do. Worse yet, we have no data on client diversity for home stakers.

This PR can change that by allowing consensus clients to learn which execution client they are connected with.

Consensus clients can then embed this in their graffiti field by default when the user doesn't bother to set it. A quick survey of recent proposal graffiti reveals that:

already embed their client and version by default. It would be great to add the execution client to this. Perhaps prysm could be convinced to join as well.

An analysis of ~2000 recent blocks indicated that nearly half of all validators don't bother to change their graffiti from the default so the potential to gather data here is huge.

Contributor

@dapplion dapplion left a comment

While this can be faked, it's a strict improvement over status quo with no downsides, so I support


@jflo jflo left a comment

This would be easy to implement, and would help with a clear and present problem on the network.

Member

@lightclient lightclient left a comment

Generally support this. I don't think we should expose under the provided name though. I would rather expose under engine_*. For one, the web3_* namespace isn't defined anywhere in this repo. Second, if we account for the possibilities of specialized engine server (thinking like a client multiplexer) then this response is extremely engine oriented.

If that sounds reasonable, can you update this PR with the name and add the schema for the method in both the openrpc spec and in engine common spec? It should look similar to engine_exchangeCapabilities I think.

@garyschulte

garyschulte commented Jan 26, 2024

Adding this to the engine api is definitely an improvement. In order to fit within the graffiti, we should specify a field size limit or a strategy to encode the version info.

Otherwise we might end up with responses like: Lighthouse/v4.5.0-441fc16 :besu/v24.1.2-dev-8407b9e7/linux-x86_64/openjdk-java-21

@fjl
Collaborator

fjl commented Jan 27, 2024

I'm personally more in favor of standardizing web3_clientVersion, because it already exists in clients.

@ethDreamer
Contributor Author

ethDreamer commented Jan 29, 2024

In light of comments received so far I've pushed an alternative specification called engine_clientVersionV1 which is more comprehensive. There are a couple things to be decided:

  1. Do we reuse web3_clientVersion or choose the more comprehensive engine_clientVersionV1?
  2. Should we require this method be supported instead of recommending its support? (SHOULD vs MUST)
  3. Do we agree on the abbreviations for the ClientCode?
  4. Do we accept using the first 4 bytes of the commit hash as a short-hand for version?

Personally I lean towards taking engine_clientVersionV1 and making it mandatory if there aren't objections. If we're not going to take the easy route and just reuse web3_clientVersion then we might as well make a method that accomplishes what we're really trying to do here (get better measurements of EL client diversity). To that end, we require a standardized shorthand for specifying both clients in the limited space of block graffiti (32 bytes).

The definitions of ClientCode specified here allow us to standardize how we encode any client pairing and specify both execution and consensus client versions inside the block graffiti within just 20 bytes. For example:

LH1be52536BU0f91a674

If desired, the space could be further reduced so that the bytes of the commit hash are embedded directly into the graffiti bytes (allowing the full consensus and execution client versions to be specified in 12 bytes).

Standardizing the version specifications this way makes graffiti analysis easy regardless of what client pairs are used. Based on my testing, geth, nethermind, and besu already have the commit hash embedded in their binaries when they list the client version so it shouldn't be difficult for them to build it in here. I can't speak for the other clients though.

Side note: by design, none of the proposed client codes are valid hex, so they won't be confused with the commit hash.
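
The encoding described above can be sketched in a few lines of Python. This is only an illustration, not the spec text: the helper names (`client_tag`, `default_graffiti`) are hypothetical, the right-padding with zero bytes is an assumption about how the unused graffiti bytes are filled, and the inputs are taken from the `LH1be52536BU0f91a674` example.

```python
# Illustrative sketch only; helper names and zero-padding are assumptions.
GRAFFITI_LEN = 32  # block graffiti is a fixed 32-byte field
HEX_CHARS = set("0123456789abcdefABCDEF")


def client_tag(code: str, commit: bytes) -> str:
    """Two-letter client code followed by the first 4 commit bytes as 8 hex chars."""
    if len(code) != 2 or not code.isalpha() or not code.isupper():
        raise ValueError("client code must be two uppercase letters")
    if all(c in HEX_CHARS for c in code):
        # Mirrors the side note: a code must not itself parse as hex.
        raise ValueError("client code must not be valid hex")
    if len(commit) < 4:
        raise ValueError("need at least 4 bytes of commit hash")
    return code + commit[:4].hex()


def default_graffiti(cl_code: str, cl_commit: bytes,
                     el_code: str, el_commit: bytes) -> bytes:
    """Consensus client tag first, then execution client tag, padded to 32 bytes."""
    text = client_tag(cl_code, cl_commit) + client_tag(el_code, el_commit)
    return text.encode("ascii").ljust(GRAFFITI_LEN, b"\x00")


graffiti = default_graffiti("LH", bytes.fromhex("1be52536"),
                            "BU", bytes.fromhex("0f91a674"))
print(graffiti[:20].decode("ascii"))
```

Both tags together use 20 of the 32 bytes, which leaves 12 bytes unused in this default form.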

Contributor

@dapplion dapplion left a comment

Standardizing the returned version format is a great improvement. I would switch the name to "Client Version" (engine_clientVersion) instead of using the term identification. Same meaning but better memetics.

Contributor

@michaelsproul michaelsproul left a comment

I like the new dedicated method, and agree with Lion that it should be called engine_clientVersion.

I'm in favour of making it mandatory after an appropriate adoption period.

Also specify Grandine abbreviation and accommodate other versioning
systems.
@ethDreamer ethDreamer changed the title Expose web3_clientVersion on Engine API Specify Client Versions on Engine API Jan 29, 2024
@rkrasiuk
Contributor

Very supportive of this change. Agree with previous comments around naming. Unsure if we should introduce versioning for identification. we already have unversioned engine_exchangeCapabilities, might as well drop v1 and simply have engine_clientVersion

@ethDreamer
Contributor Author

ethDreamer commented Jan 29, 2024

Unsure if we should introduce versioning for identification. we already have unversioned engine_exchangeCapabilities, might as well drop v1 and simply have engine_clientVersion

I believe engine_exchangeCapabilities is the only unversioned method because we won't allow it to ever change. If execution clients began supporting a new version of exchangeCapabilities, then consensus clients would have to do trial and error to determine which version to call, which defeats the purpose of having a "what methods do you support" method.

This doesn't necessarily mean that we couldn't also agree to never allow the engine_clientVersion method to change.

@StefanBratanov

Supportive of this change. Maybe the method could be renamed to engine_exchangeClientVersion similar to engine_exchangeCapabilities since we are essentially exchanging the versions.


@rubo rubo left a comment

I support this proposal.

Not as a part of this one, but when it comes to versions, I'd also like to have a standardized version format for clients. Something like what browsers have for their user agent string defined in RFC 9110. That would be helpful for the network stats handling as well. Currently, we mostly have name/version/platform/lang, but it slightly varies from client to client.

@lightclient
Member

Is there an advantage of being prescriptive about the format the client returns its version in? I was imagining something much closer to web3_clientVersion where any value would be accepted by CL and incorporated into the graffiti.

I guess there isn't much downside as it is mainly just upfront cost of spec'ing things out.

@garyschulte

Is there an advantage of being prescriptive about the format the client returns its version in? I was imagining something much closer to web3_clientVersion where any value would be accepted by CL and incorporated into the graffiti.

Having a conformed format will only help in identification, and in 'economy of graffiti'. Ensuring there is a predictable portion of graffiti consumed by client identification makes it more palatable IMO. The primary downside is just the gatekeeping required to maintain the list.

I like the human readable and more verbose bits, but I think that might only be useful in CL logs, since it is behind JWT secured endpoint.

@rolfyone

It seems like the other missing part that would be nice in graffiti would be the builder and version if used (mostly used, but not always)
The encoding makes sense to fit inside graffiti bytes. not sure what % of blocks use default graffiti and whether this will be useful ultimately...

@michaelsproul
Contributor

@rolfyone We already have data on the builders because they fill the execution payload's extraData field. E.g. this block has rsync-builder.xyz in the extra data: https://beaconcha.in/slot/8312808. The relays also provide data APIs that let us map blocks & builders to relays. Often there are multiple relays that will produce/publish each builder payload. Some of these affinities are displayed on sites like https://mevboost.pics/.

The only thing the local BN could read would be a list of relays from mev-boost or its own config. This would be 1) too long to include in graffiti and 2) redundant, given the above.

@ethDreamer
Contributor Author

Okay I just want to take the temperature of the room. Please react to this to vote:

❤️ - vote for reusing web3_clientVersion
🎉 - vote for engine_clientVersionV1
👍 - vote for adopting engine_clientVersionV1 but renaming to engine_exchangeClientVersionV1

@lightclient
Member

image

@kasey

kasey commented Jan 30, 2024

Prysm supports this proposal.

I filed this issue to start recording our user-agent info in graffiti by default: prysmaticlabs/prysm#13558

@ethDreamer
Contributor Author

ethDreamer commented Feb 2, 2024

If we sacrifice human readability we could have the first byte representing clients (4-bit CL, 4-bit EL) + 4 bytes (2 CL, 2 EL) for commits, so we end up with 5 bytes of versioning + 27 bytes of user message

That's a nice option if we want to always provide version information while taking up minimal space. But it does limit us to only 16 execution / consensus clients, and based on my conversations a lot of people seem to prefer readability.

In practice, the versioning information is really just nice to have for some debugging cases. But it is not strictly necessary. It is much less important than knowing the implementation itself for the purposes of measuring EL client diversity. That's why I like the flexible standard because it allows users 28 characters if they really want it, while preserving readability, without limiting EL/CL client implementations, and it preserves the version information for debugging in the vast majority of cases.
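
For comparison, the compact alternative quoted at the top of this comment (one byte holding two 4-bit client identifiers, followed by 2 commit bytes per client) could look roughly like the sketch below. The numeric client IDs, the choice of putting the CL in the high nibble, and the function names are all assumptions made for illustration; no such ID table was proposed in this thread.

```python
# Illustrative sketch of the 5-byte packed form; IDs and nibble order are assumptions.
def pack_versions(cl_id: int, el_id: int,
                  cl_commit: bytes, el_commit: bytes) -> bytes:
    """Pack two 4-bit client IDs and 2 commit bytes per client into 5 bytes."""
    if not (0 <= cl_id < 16 and 0 <= el_id < 16):
        raise ValueError("only 16 clients per layer fit in 4 bits")
    if len(cl_commit) < 2 or len(el_commit) < 2:
        raise ValueError("need at least 2 bytes of each commit hash")
    header = (cl_id << 4) | el_id  # high nibble: CL, low nibble: EL
    return bytes([header]) + cl_commit[:2] + el_commit[:2]


def unpack_versions(packed: bytes):
    """Inverse of pack_versions: returns (cl_id, el_id, cl_commit, el_commit)."""
    if len(packed) != 5:
        raise ValueError("packed version info must be exactly 5 bytes")
    return packed[0] >> 4, packed[0] & 0x0F, packed[1:3], packed[3:5]


packed = pack_versions(1, 2, bytes.fromhex("1be52536"), bytes.fromhex("0f91a674"))
print(packed.hex(), 32 - len(packed))
```

The 4-bit fields are what impose the limit of 16 clients per layer, and the packed bytes are not readable without a lookup table, which is the trade-off discussed above.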

@ethDreamer
Contributor Author

I've renamed the method to getClientVersionV1 in accordance with @mkalinin's suggestion. I've also decided to accommodate multiplexers by returning an array of ClientVersionV1 objects for two reasons:

  1. We require some way to indicate to the consensus client that a multiplexer is being used so that the data can be excluded from graffiti. So we may as well indicate this by receiving more than one ClientVersionV1 object
  2. Knowing each client version in a multiplexer scenario may be useful in the future for other methods of measuring client diversity which do not use graffiti.
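
A minimal sketch of how a consensus client might act on that array is shown below. The function name and the field names in the sample entries (`code`, `commit`) are assumptions for illustration; the only behaviour taken from this comment is that more than one entry signals a multiplexer and the data is then left out of the graffiti.

```python
# Illustrative sketch; function and field names are assumptions.
from typing import Dict, List, Optional


def el_version_for_graffiti(result: List[Dict[str, str]]) -> Optional[Dict[str, str]]:
    """Return the single EL version entry, or None when graffiti should omit EL data."""
    if len(result) != 1:
        # Zero entries: nothing to report. Several entries: a multiplexer is in use.
        return None
    return result[0]


single = [{"code": "BU", "commit": "0x0f91a674"}]
multiplexed = single + [{"code": "GE", "commit": "0x12345678"}]

print(el_version_for_graffiti(single))
print(el_version_for_graffiti(multiplexed))
```

The full list stays available to the caller, so other diversity measurements that do not rely on graffiti can still use every entry.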

Contributor

@smartprogrammer93 smartprogrammer93 left a comment

Thank you 🙏

@ethDreamer
Contributor Author

LAST CALL FOR CONCERNS

I believe we've addressed all concerns that have been raised at this point. The core idea of this PR has widespread support and most disagreement is minor & related to small implementation details. There's no reason to spend weeks bikeshedding about this optional feature.

CURRENT PLAN

  1. wait a few more days for small fixes
  2. merge if there are no major objections
  3. see if issues arise during implementation and make changes as needed
  4. after widespread implementation, consider making this method mandatory

If you agree with this plan, please give a 👍, otherwise comment your objection

@lightclient
Member

Please add spell check errors to wordlist.txt.

@kasey

kasey commented Feb 2, 2024

How much do we want to enforce this?

If user specifies a graffiti taking all 32 bytes, client shouldn't start anymore?
Should we give an opt-out option to regain full 32 bytes?

This is absolutely not enforced. Most of the data will come from people who don't bother to set their graffiti. Users will always have the ability to choose whatever they want for their graffiti or to not provide this data at all. But some will want to set custom graffiti while also providing data for client diversity. For those users, we want to give them more bytes so they are more likely to participate.

How does everyone plan to represent this in flags? Are we going to artificially limit the size of user-specified graffiti flags? Otherwise, what would users expect the behavior to be if they specify 20-32 bytes of graffiti? For instance do we truncate the version string from right to left? Doing that would assume precedence in importance of the information: CL impl > CL git hash > EL impl > EL git hash. It would also be hard for software parsing this field to differentiate user-specified graffiti that flows into this section and looks like an ident/hash from the real thing.

If this is required then IMO we should limit the size of user-provided graffiti to 20 bytes and be very explicit about the fact that 1) this will be a breaking config change for users and 2) there is no opting out by "overriding" the default with a flag.

@lightclient
Member

One thing you could do is continue dropping client version data up until you can no longer store the EL+CL code combo. I think version / commit is nice to have but not as critical.

@kasey

kasey commented Feb 2, 2024

One thing you could do is continue dropping client version data up until you can no longer store the EL+CL code combo. I think version / commit is nice to have but not as critical.

Yeah the EL is most important (hardest to determine through other means), then CL, then versions. So if the plan is to truncate, I think we should just order it that way: (EL|CL|el-hash|cl-hash). You could get fancy and interleave the EL/CL hash bytes one-by-one but maybe that's overkill :)

@ethDreamer
Contributor Author

One thing you could do is continue dropping client version data up until you can no longer store the EL+CL code combo. I think version / commit is nice to have but not as critical.

Exactly. This is what I've been referring to as a flexible standard. The version information is nice to have but not critical.

@rolfyone

rolfyone commented Feb 4, 2024

I'm not sure I'd interleave, but shortening does make sense. If the version of a client is present, it being in the same logical chunk will be easier to search for. eg. searching for LH1b from the example below..

user graffiti takes up 0 characters: LH1be52536BU0f91a674
user graffiti takes up 20 characters: LH1be5BU0f91
user graffiti takes up 24 characters: LH1bBU0f
user graffiti takes up 28 characters: LHBU

I think this flexible standard achieves all 3 goals (human readable, max space for the user, collision resistance). But anyone else is welcome to weigh in.

I think this is a sensible approach, and if the user graffiti is beyond 28 characters just not having the data...

Do we know what percentage of blocks have more than 28 bytes of graffiti? seems like we could get a fairly good estimation of how useful this would be...
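
The flexible standard quoted above can be expressed as a small selection function. This is a sketch under stated assumptions: the function name is hypothetical, the thresholds between the four quoted data points are interpolated (the thread only gives examples at 0, 20, 24 and 28 characters), and no separator between the version prefix and the user's text is modelled.

```python
# Illustrative sketch; function name and in-between thresholds are assumptions.
GRAFFITI_LEN = 32  # block graffiti is a fixed 32-byte field


def version_prefix(cl_code: str, cl_commit_hex: str,
                   el_code: str, el_commit_hex: str, user_len: int) -> str:
    """Pick the longest version prefix that still fits beside the user graffiti."""
    free = GRAFFITI_LEN - user_len
    if free >= 20:
        hex_chars = 8  # codes plus 4 commit bytes each
    elif free >= 12:
        hex_chars = 4  # codes plus 2 commit bytes each
    elif free >= 8:
        hex_chars = 2  # codes plus 1 commit byte each
    elif free >= 4:
        hex_chars = 0  # client codes only
    else:
        return ""      # more than 28 user characters: no client data at all
    return cl_code + cl_commit_hex[:hex_chars] + el_code + el_commit_hex[:hex_chars]


for used in (0, 20, 24, 28, 30):
    print(used, repr(version_prefix("LH", "1be52536", "BU", "0f91a674", used)))
```

Because each client's code and commit stay adjacent at every length, a search for a prefix such as `LH1b` still matches the shortened forms.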

@ethDreamer
Contributor Author

I've compiled the discussion around choosing a graffiti standard into a single document:

https://hackmd.io/@wmoBhF17RAOH2NZ5bNXJVg/BJX2c9gja

I welcome any comments. Also, there haven't been any requests for changes in days. Seems like we can merge this?

@ethDreamer
Contributor Author

@lightclient it's been about a week since people agreed we should merge this and there haven't been any objections. Is that enough time to merge?
