Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Additioinal information about API endpoint versions, data and metadata. #195

Merged
merged 4 commits into from
Mar 28, 2022

Conversation

mcdee
Copy link
Contributor

@mcdee mcdee commented Mar 1, 2022

As per #190 (comment) this provides some explanation about the difference between data and metadata, endpoint versions, and when an endpoint version should change.

These rules have been implicit, and as such poorly defined, until now. I have inferred the rules from actions we have carried out in the past, however as they are now being made explicit it would be good for someone from each client team to take a look at them and confirm they are happy with them.

@mpetrunic
Copy link
Collaborator

@paulhauner @rolfyone @dapplion @ajsutton @arnetheduck @prestonvanloon please approve if you are happy with this!

dapplion
dapplion previously approved these changes Mar 2, 2022
Copy link
Collaborator

@dapplion dapplion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines 22 to 25
All JSON responses return the requested data under a "data" key in the top level of their response. Additional metadata may or
may not be present in other keys at the top level of the response. Metadata's presence or absence can be dependent on the type
and version of the beacon node, the state of the chain, and other conditions. Metadata is not versioned, and as such the same
version of an API endpoint may return different metadata at different times.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If metadata is used for anything and we allow removal or renaming metadata fields without a version bump then it would make for an unreliable API schema and probably introduce breaking changes.

Personally, I would like to see metadata be subject to the same versioning rules as data fields.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say we should be able to add metadata fields, but not remove. Having to bump the version on every single change just breaks all clients unnecessarily.

Personally I'd apply the same policy to data - ignore unknown fields is the bedrock of internet compatibility - but it doesn't seem like I'm going to get my way on that...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer 'safe to add' type mantra, as otherwise we're going to potentially have a lot of versioning in our future...

If a client doesn't find a field they need, then they should handle it or fail depending on case, and if we do have to add something that makes things not compatible, then we could version at that point in time...

@ajsutton
Copy link
Contributor

ajsutton commented Mar 3, 2022

There's also the interesting case of events from the event stream. Currently the event stream endpoint has a version but the individual events don't. So any change to one event would require new version for all events which is unfortunate. We could add new events under a different name as a versioning strategy.

Being able to add fields to events without having to bump the version number would again minimise disruption for users.

@mpetrunic
Copy link
Collaborator

@mcdee Would you mind updating PR to reflect that removing/renaming metadata fields also require bumping endpoint version?

beacon-node-oapi.yaml Outdated Show resolved Hide resolved
beacon-node-oapi.yaml Outdated Show resolved Hide resolved
@ajsutton
Copy link
Contributor

ajsutton commented Mar 4, 2022

@mcdee Would you mind updating PR to reflect that removing/renaming metadata fields also require bumping endpoint version?

Given we've already added meta-data fields without bumping the version number (the dependent roots fields), it really doesn't make sense to me to now reverse course and not allow new fields there. The change from v1 to v2 for block APIs with Altair is still causing a fair bit of confusion for users so needing to change the version constantly is really not in our best interests or that of users.

@mpetrunic
Copy link
Collaborator

@mcdee Would you mind updating PR to reflect that removing/renaming metadata fields also require bumping endpoint version?

Given we've already added meta-data fields without bumping the version number (the dependent roots fields), it really doesn't make sense to me to now reverse course and not allow new fields there. The change from v1 to v2 for block APIs with Altair is still causing a fair bit of confusion for users so needing to change the version constantly is really not in our best interests or that of users.

I think adding fields without bumping endpoint is fine but renaming/removing is effectively breaking change and it should require version bump.

@rolfyone
Copy link
Collaborator

rolfyone commented Mar 4, 2022

I think adding fields without bumping endpoint is fine but renaming/removing is effectively breaking change and it should require version bump.

Yes breaking changes like changing a field or removing it would still need a version bump, but typically adding a new field should be compatible. The caveat is if the new field is going to render the rest of it useless (version field comes to mind as 'added' but also 'breaking')

@ajsutton
Copy link
Contributor

ajsutton commented Mar 7, 2022

Just discovered that for the /eth/v1/node/identity the additional syncnets field was added as part of Altair without bumping the version number even though that's in the data section.

And we designed the block and state apis so that they could return new milestones without bumping the version number which could include adding and/or removing fields, differentiated by the version field that shows the milestone for the block or state. With this policy for values under data we'd break that.

@rolfyone
Copy link
Collaborator

rolfyone commented Mar 7, 2022

We had previously decided that being additive wasn't a problem, so that is why some of the altair changes were made in the way that they were.
The block endpoint was changed because it was going to be more extensible having the version field to cover exactly the scenario I think we're talking about - new data types needing to be covered.
I'm not really sure generic 'rules' being documented is super helpful, because we tend to make a decision based on the best information at the time, and rules are made in the absence of the information, which means they'll just be broken.
The whole concept of having the 'version' field in metadata and the 'data' field is based on the fact that things change and we don't want to bump version numbers constantly for new endpoints because it can be very disruptive...

@mcdee
Copy link
Contributor Author

mcdee commented Mar 7, 2022

Okay so we seem to have two points of contention.

The first is if fields can be added to existing endpoints without bumping the version number. As we have already done this in the past without causing issues I think that it's reasonable to continue along these lines (changes or removals of existing fields will always warrant a version increment)

The second is if the same rules should apply to data and metadata. I would argue that they should not, because the original idea for metadata was to provide supplementary information that was useful but not required (e.g. version information for blocks), not supplied by all clients (e.g. pagination) and the like. Metadata gives providers the ability to enhance API results without it being part of the spec, which has a number of use cases. As such, I think that metadata should remain unversioned in all instances.

How can we come to agreement on these points?

@mcdee
Copy link
Contributor Author

mcdee commented Mar 7, 2022

I'm not really sure generic 'rules' being documented is super helpful, because we tend to make a decision based on the best information at the time, and rules are made in the absence of the information, which means they'll just be broken.

@rolfyone although they may constrain producers of the API, they significantly help consumers of the API and ultimately if it's a choice between being producer-friendly or consumer-friendly I'd want to aim for the latter. That said, I don't think that the rules are particularly onerous and more to help clarify to all what has happened to date (and, as far as I'm aware, found to work) as a guide to what may happen in the future. If a situation occurs when the rules has to change then fair enough, but I'd rather have the rules explicit and able to be changed in a documented fashion than undefined.

@rolfyone
Copy link
Collaborator

rolfyone commented Mar 7, 2022

The second is if the same rules should apply to data and metadata. I would argue that they should not, because the original idea for metadata was to provide supplementary information that was useful but not required (e.g. version information for blocks), not supplied by all clients (e.g. pagination) and the like. Metadata gives providers the ability to enhance API results without it being part of the spec, which has a number of use cases. As such, I think that metadata should remain unversioned in all instances.

I think this is what I was getting at in terms of "things aren't hard and fast"
If we are adding metadata fields that doesn't materially mean that functionality will break for clients, then that is OK in my opinion... If we are adding metadata fields that mean a client can't reasonably function without that context, then that implies that we MUST update the version number so that clients can pickup the change on the new api and implement the required changes...
I know this is a case-by-case kind of argument, but that's generally what people adding functionality to these kinds of API's need to be considering - if a client can't function without that new information, then its a 'breaking change', and the version number needs to be bumped...
The get identity change is a really good example of a non breaking change...

@arnetheduck
Copy link
Contributor

The first is if fields can be added to existing endpoints without bumping the version number. As we have already done this in the past without causing issues

I've seen several interim compatibility issues between clients because of issues like this, and because of other changes like adding and removing constants to .../config (or variations thereof where the client supports it but infura for example does not, yet). What happens is some user runs into it, reports it, it gets fixed and the next version of that client works, so developers following the latest versions tend not to see the bug and think it's all fine. Since it's happening between mainnet client teams, it's surely happening with the wider audience of general consumers of the API.

When it happens, users running older versions of clients start having mysterious and hard-to-track issues, and get confused - I find it's better not to force such users to upgrade outside of the hard-fork cycle, so it's better to agree ahead of time on a minimum policy, or at least a guideline, for how consumers should be programmed: this gives us, the producers of the spec, an easier target to shoot at when making upgrades because we can reason about how existing clients behave in terms of strictness.

Regarding case-by-case, I think these rules are a good start, though I would would consider them bare-minimum recommendations - it's also worth documenting within requests what the expectations are: for example, in any full block requests it's expected that if the user asks for a phase0 block, they get the exact fields of the phase0 block and not the additions in later forks with some bogus values - this is why we did v2 after all.


All JSON responses return the requested data under a `data` key in the top level of their response. Additional metadata may or
may not be present in other keys at the top level of the response. Metadata's presence or absence can be dependent on the type
and version of the beacon node, the state of the chain, and other conditions. Metadata is not versioned, and as such the same
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean we can remove fields from metadata as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If so, I'd add here that for specific requests, more strict or relaxed rules may apply

@ajsutton
Copy link
Contributor

I've seen several interim compatibility issues between clients because of issues like this, and because of other changes like adding and removing constants to .../config (or variations thereof where the client supports it but infura for example does not, yet). What happens is some user runs into it, reports it, it gets fixed and the next version of that client works, so developers following the latest versions tend not to see the bug and think it's all fine. Since it's happening between mainnet client teams, it's surely happening with the wider audience of general consumers of the API.

The issues with the .../config API have definitely been painful and noticed with both latest and older versions as fields have been added, removed and renamed there because it's not particularly well defined what should be in that response and what definition it does provide is based on the spec which keeps changing in incompatible ways.

I don't think that's representative of the potential problems that have or would happen if fields were added without bumping the version number.

But I do think the .../config experience is a reason to not be too unrestricted with metadata fields. Personally I'd apply the same "add but not remove" approach to metadata as well.

@mcdee
Copy link
Contributor Author

mcdee commented Mar 14, 2022

Personally I think that having the same rules for data and metadata means that there is little point in separating the two, but I'm in the minority here so I have updated the PR accordingly.

@mpetrunic mpetrunic merged commit 4aeeb16 into ethereum:master Mar 28, 2022
@arnetheduck
Copy link
Contributor

With this approach to compatibility, one more addition is needed: whenever a field is added, it must be documented in the spec as a "later addition" or optional: the spec now says that fields may be added but not removed - it means that the validity of a message with a "missing" field is path-dependent: it depends on whether the field was added to the message at a later stage or was there from the beginning.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants