Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Proposal] Resource schema translation #3944

Closed
wants to merge 18 commits into from

Conversation

MrAlias
Copy link
Contributor

@MrAlias MrAlias commented Mar 28, 2023

Add resource schema translation via a WithSchemaURL method to the Resource or the MergeAt function.

Related to ...

As stated in the merge section of the specification:

The resulting resource will have the Schema URL calculated as follows:

  • If the old resource's Schema URL is empty then the resulting resource's Schema
    URL will be set to the Schema URL of the updating resource,
  • Else if the updating resource's Schema URL is empty then the resulting
    resource's Schema URL will be set to the Schema URL of the old resource,
  • Else if the Schema URLs of the old and updating resources are the same then
    that will be the Schema URL of the resulting resource,
  • Else this is a merging error (this is the case when the Schema URL of the old
    and updating resources are not empty and are different). The resulting resource is
    undefined, and its contents are implementation-specific.

Given we still support Merge which follows this behavior and MergeAt is additional functionality, it seems like we wouldn't violate the specification. That said, it may be the case the specification needs to be updated before this is added.

@MrAlias MrAlias added area:resources Part of OpenTelemetry resources proposal labels Mar 28, 2023
@MrAlias MrAlias changed the title PoC: Resource schema translation [Proposal] Resource schema translation Mar 29, 2023
@MrAlias MrAlias force-pushed the resource-schema-translate branch from 18ee2b2 to a8bc6f8 Compare March 29, 2023 20:47
@MrAlias MrAlias marked this pull request as ready for review March 30, 2023 16:23
Copy link
Member

@Aneurysm9 Aneurysm9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Should examples be updated to demonstrate this usage?

@MrAlias
Copy link
Contributor Author

MrAlias commented Mar 30, 2023

LGTM! Should examples be updated to demonstrate this usage?

SGTM 👍

@codecov
Copy link

codecov bot commented Mar 30, 2023

Codecov Report

Merging #3944 (5174518) into main (1b97d78) will decrease coverage by 0.5%.
The diff coverage is 57.9%.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##            main   #3944     +/-   ##
=======================================
- Coverage   82.1%   81.7%   -0.5%     
=======================================
  Files        174     177      +3     
  Lines      12975   13196    +221     
=======================================
+ Hits       10655   10783    +128     
- Misses      2100    2183     +83     
- Partials     220     230     +10     
Impacted Files Coverage Δ
sdk/resource/resource.go 53.1% <0.0%> (-23.4%) ⬇️
sdk/resource/internal/schema/registry.go 55.8% <55.8%> (ø)
sdk/resource/internal/schema/compare.go 71.4% <71.4%> (ø)
sdk/resource/internal/schema/convert.go 88.1% <88.1%> (ø)

@MadVikingGod
Copy link
Contributor

I have no objections to the underlying mechanics, but I do have some questions that might affect this PR:

  1. With the feedback from the maintainer's meeting, should we return a merged resource and error when the URLs don't match?
  2. Considering that use of WithSchemaURL and MergeAt both cause network fetches, would the contrib repo be a better home for this? I think that adding an API that does arbitrary network requests to a tool that had none before is risky.
  3. Do you see a path where the application developer (or library author or both) can precache the schemas for restricted environments? This doesn't need to be solved here but is often an ask when network resources are fetched like this.

@MrAlias
Copy link
Contributor Author

MrAlias commented Apr 4, 2023

  1. With the feedback from the maintainer's meeting, should we return a merged resource and error when the URLs don't match?

Like should we continue to provide Merge?

  1. Considering that use of WithSchemaURL and MergeAt both cause network fetches, would the contrib repo be a better home for this? I think that adding an API that does arbitrary network requests to a tool that had none before is risky.

Having to require users to import another package from contrib just so they can merge resources from this package does not seem appropriate. We have added clear documentation that these will make external requests. I see that as sufficient to inform users of the added functions behavior.

  1. Do you see a path where the application developer (or library author or both) can precache the schemas for restricted environments? This doesn't need to be solved here but is often an ask when network resources are fetched like this.

I can see an environment that want this caching it over their network. I.e. they have their DNS server point to a local HTTP cache they run.

@jufemaiz
Copy link

jufemaiz commented Apr 5, 2023

I have no objections to the underlying mechanics, but I do have some questions that might affect this PR:

  1. With the feedback from the maintainer's meeting, should we return a merged resource and error when the URLs don't match?
  2. Considering that use of WithSchemaURL and MergeAt both cause network fetches, would the contrib repo be a better home for this? I think that adding an API that does arbitrary network requests to a tool that had none before is risky.
  3. Do you see a path where the application developer (or library author or both) can precache the schemas for restricted environments? This doesn't need to be solved here but is often an ask when network resources are fetched like this.

Some thoughts from someone grappling with otel implementation and management atm.

  1. It's definitely interesting. At least with a defined error it could be mitigated / managed more effectively while still allowing for reporting. Would the returned resource have no schema URL set as a result of the conflict? (That is, the key-values wouldn't be bound to any schema at all, and it would be at the behest of the service to manage the subsequent use/migration).

  2. The network fetch is interesting and not something that I considered in my originating thread (can you point me to where that occurs?), but does seem to highlight even more that there's some serious issues with the current opentelemetry-go hard coding schema URLs into the responses they're providing (and to be clear, I understand why that is in order to return a resource etc, but it is an issue as it sets a fixed schema incompatible with any others). This also implies that the existing implementation already undertakes network requests. I would assume then that the opentelemetry-go would then seek to remove explicit uses of any given schema in returning resources? Examples would include:

    As noted, the need to import a separate package to unwind embedded schema URL definitions of the core package because it, or others, do not match the desired schema URL of a service seems at odds with the goals of the package.

  3. As per question in (2), does the current opentelemetry-go package precache the v1.17.0 schema?

@MadVikingGod
Copy link
Contributor

  1. With the feedback from the maintainer's meeting, should we return a merged resource and error when the URLs don't match?

Like should we continue to provide Merge?

Yes, but do Merge like other languages where it returns a merged resource and an error. This would allow users to merge without having to change spec levels, and in cases where that wouldn't help, like merging with a custom URL.

  1. Considering that use of WithSchemaURL and MergeAt both cause network fetches, would the contrib repo be a better home for this? I think that adding an API that does arbitrary network requests to a tool that had none before is risky.

Having to require users to import another package from contrib just so they can merge resources from this package does not seem appropriate. We have added clear documentation that these will make external requests. I see that as sufficient to inform users of the added functions behavior.

Requiring a network request to do merges also does not seem appropriate. If this logic was upgrade/downgrade by passing in an already parsed Schema then I have no objection, but mandating a network lookup when this information can be available in other ways goes too far. Couldn't these schemas be already compiled into the semver, and available in binaries at startup without the network?


The network fetch is interesting and not something that I considered in my originating thread (can you point me to where that occurs?)

The Registry.Get() makes a call to the URL. This is called by WithSchemaURL if there is an upgrade or downgrade, and MergeAt if there is any difference between SchemaURLs. This is only in the new code.

As per question in (2), does the current opentelemetry-go package precache the v1.17.0 schema?

No, there isn't anything that I know of that uses the schema, this would be the first.

@MrAlias
Copy link
Contributor Author

MrAlias commented Apr 6, 2023

Yes, but do Merge like other languages where it returns a merged resource and an error. This would allow users to merge without having to change spec levels, and in cases where that wouldn't help, like merging with a custom URL.

I don't see a reason to change our behavior for Merge. The specification is clear the behavior is undefined, and there is not consistency among language implementations to follow.

Requiring a network request to do merges also does not seem appropriate. If this logic was upgrade/downgrade by passing in an already parsed Schema then I have no objection, but mandating a network lookup when this information can be available in other ways goes too far. Couldn't these schemas be already compiled into the semver, and available in binaries at startup without the network?

We could cache the OTel schemas in code, but it would become stale if there are fixes pushed to their source.

How would 3rd party schemas be handled without a network request to fetch them?

@jufemaiz
Copy link

@MrAlias + @MadVikingGod I'm still confused as to the requirement here for a schema URL HTTP request. My understanding is that this go package, at the point of publication, has the schemas embedded within the package, so as to ensure that it can be used. Hence the references I've raised above about hard coding schemas to tagged releases of this package. Why wouldn't that be the approach adopted?

Further, with semver, I'm not sure I understand the following assertion:

…it would become stale if there are fixes pushed to their source…

Is this a reference to new schemas rather than fixes to existing schemas?

@MrAlias
Copy link
Contributor Author

MrAlias commented May 4, 2023

SIG feedback: The spec should be updated to require some sort of translation like this.

@MrAlias
Copy link
Contributor Author

MrAlias commented May 4, 2023

Might want to include a digest of downloaded schemas

@MrAlias
Copy link
Contributor Author

MrAlias commented May 4, 2023

Might want to include a digest of downloaded schemas

Can the specification provide these?

@MrAlias
Copy link
Contributor Author

MrAlias commented May 13, 2023

Closing in favor of https://github.com/MrAlias/otel-schema-utils

@MrAlias MrAlias closed this May 13, 2023
@MrAlias MrAlias deleted the resource-schema-translate branch May 13, 2023 19:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:resources Part of OpenTelemetry resources proposal
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants