Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

resource.Default()/etc is a footgun #3769

Closed
ash2k opened this issue Feb 24, 2023 · 5 comments
Closed

resource.Default()/etc is a footgun #3769

ash2k opened this issue Feb 24, 2023 · 5 comments
Labels
enhancement New feature or request

Comments

@ash2k
Copy link
Contributor

ash2k commented Feb 24, 2023

Problem Statement

r, _ := resource.Merge(
resource.Default(),
resource.NewWithAttributes(
semconv.SchemaURL,
semconv.ServiceName("fib"),
semconv.ServiceVersion("v0.1.0"),
attribute.String("environment", "demo"),
),
)

Code above gives an example of how to use resource.Default(). There is a problem with this code though - if you update otel dependencies you may get a cannot merge resource due to conflicting Schema URL error when running the code. This happens if otel SDK starts using a different version of the schema to what it used to use and what is used in the example/user code. I.e. schemas used to match but do not after SDK change. From user's perspective it all worked, but suddenly starts failing at runtime.

This is especially problematic if resource merging is conditional e.g. when it's only done when tracing is enabled. I've just had a situation where I bumped the otel dependencies, all tests passed, I cut a release and then it blew up when deployed to a canary environment where tracing is actually enabled.

It's not only resource.Default(), a few other methods in the resource package have the same issue.

Proposed Solution

Not sure. Approaches that are better than failing at runtime:

  • Merging "just works" (somehow).
  • Code fails to compile (somehow).

Alternatives

Prior Art

Additional Context

@ash2k ash2k added the enhancement New feature or request label Feb 24, 2023
@ash2k
Copy link
Contributor Author

ash2k commented Feb 24, 2023

Just added a unit test to my codebase and it fails:

import (
	"go.opentelemetry.io/otel/sdk/resource"
	semconv "go.opentelemetry.io/otel/semconv/v1.12.0"
)

func constructResource() (*resource.Resource, error) {
	return resource.Merge(
		resource.Default(),
		resource.NewWithAttributes(
			semconv.SchemaURL,
			semconv.ServiceNameKey.String("some name"),
			semconv.ServiceVersionKey.String("some ver"),
		),
	)
}

func TestConstructResource(t *testing.T) {
	_, err := constructResource()
	require.NoError(t, err)
}
        	Error:      	Received unexpected error:
        	            	cannot merge resource due to conflicting Schema URL
        	Test:       	TestConstructResource
--- FAIL: TestConstructResource (0.00s)

Fix is to change import to semconv "go.opentelemetry.io/otel/semconv/v1.17.0". This is not a good UX :)

@ash2k
Copy link
Contributor Author

ash2k commented Feb 24, 2023

Perhaps those methods can be moved into the "versioned" packages? They can use the same underlying types, but a matching schema URL from that particular package. In the future more attributes can be added to new scema versions and having e.g. Default() in a versioned package expose that attribute only for a certain version would make sense. Put differently, today I cannot use those helper methods with older schema versions.

@Aneurysm9
Copy link
Member

resource.Merge() does not have the schema translation capabilities it needs to make this work as intended. The schema module provides the schema parsing capabilities required to implement this. There is a collector schemaprocessor that is under construction that can demonstrate the use of that module. That said, implementing translation in resource.Merge() may be in violation of the resource specification.

@MrAlias
Copy link
Contributor

MrAlias commented May 13, 2023

Here's a new repo I created to handle schema translations: https://github.com/MrAlias/otel-schema-utils

Only thing supported currently is resource translations, but out of this it does provide a resource merge functionality that is schema URL aware: https://pkg.go.dev/github.com/MrAlias/[email protected]/schema#Converter.MergeResources

JamieDanielson pushed a commit to honeycombio/otel-config-go that referenced this issue Aug 14, 2023
## Which problem is this PR solving?

When using detectors with mis-matched Schema URLs, resource.New will
return an error. Previously, this was dropped and consequently it looked
like configuration worked but many resource attributes would be missing
in the resulting telemetry.

With the recent addition of
#48, this error is
much easier to hit. For example, the detectors built into
opentelemetry-go
[import](https://github.com/open-telemetry/opentelemetry-go/blob/main/sdk/resource/builtin.go#L25)
semconv `v1.17.0` but this project [pulls
in](https://github.com/honeycombio/otel-config-go/blob/main/otelconfig/otelconfig.go#L21)
`v1.18.0`. This means that adding something like
`otelconfig.WithResourceOption(resource.WithHost())` results in a
mis-configured OTel and no error to warn you what happened.

This is all … very bad. This PR is really the bare minimum that needs to
be done here. That is, it at least causes a runtime error so that you
can tell what's going on — but it doesn't solve the problem that you
fundamentally cannot use this library with _any_ other library that
doesn't happen to use semconv `v1.18.0`. There are [a
number](open-telemetry/opentelemetry-go#2341)
of
[open](open-telemetry/opentelemetry-go#3769)
[issues](open-telemetry/opentelemetry-go-contrib#3657)
in OTel and adjacent repositories regarding this problem, which probably
warrant further investigation into a long-term sustainable solution.

## Short description of the changes

Return the `err` from `resource.New` up through the stack. Update tests
to expect this.

⚠️ I guess technically this could be considered a backwards incompatible
change, in that folks may have configurations that are "working" today
but will throw an error after this change. I put that in scare quotes
though because it's not _really_ working — it's just not throwing an
error. I feel like actually returning an appropriate error here and
letting folks know their configuration is likely not working as expected
is the right choice.

## How to verify that this has the expected result

There's a new test that specifically uses a detector with a different
schema URL.

Co-authored-by: Vera Reynolds <[email protected]>
@pellared
Copy link
Member

pellared commented Nov 2, 2023

Closing as duplicate of #2341

@pellared pellared closed this as not planned Won't fix, can't repro, duplicate, stale Nov 2, 2023
aarongable pushed a commit to letsencrypt/boulder that referenced this issue Sep 24, 2024
OpenTelemetry has "semantic conventions" which are versioned
independently of the software package, as it describes the semantics of
the resources being produced. Previously, we'd combined
`resource.Default()` using the `Merge` function with our own resources.

Merge, however, doesn't handle merging resources with different semantic
conventions. This means that every dependabot PR that bumps otel will
break when the `resources.Default` has a new version.

That doesn't seem worth it for the default resources, so just provide
our own resources which have everything we care about. I've added the
PID which we didn't have before but will be interesting. We will lose
the SDK's version, but I don't think that matters.

For more discussion on this topic, see
open-telemetry/opentelemetry-go#3769
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
4 participants