-
Notifications
You must be signed in to change notification settings - Fork 896
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Specify default metric view attributes in the API (hints?) #3061
Comments
I need help understanding your request. Can you provide a real-world example? |
Same.
Did you mean SDK? I could see how an implementation might be dropping recorded attributes before they make it to the aggregator, but that seems like a problem unique to Java (maybe others, but here Java) and not the spec itself. A View should be how a user can specify the attributes to keep, nothing in the spec exists to limit the attributes recorded by the instrumentation from getting to the filtering by the SDK. |
@mateuszrzeszutek do you think #1753 (comment) this makes sense? based on comment: #1753 (comment) |
I responded there to keep context. If I understand the comment you referenced better than I would suggest rephrasing the description to convey the intent as it is said there. |
@asafm I read the other comment. It is related to aggregations which yes does make sense to define at the source of meter. But the concern here is also about attributes here which has increasingly good fan following. And yes as we rightly stated this problem I guess is mostly pertaining to java instrumentation. In java since there is concept of automatic instrumentation. The instrumentation author declares a meter and records metrics. But since the cardinality of certain dimensions can be high, those needs to be filtered while authoring the instrumentation. An approach however can be to filter only high cardinality dimensions and let the others pass, but with every attribute user will add they will start popping up as metric dimension which is again not a good solution. Hence these attributes are filtered before recording the metrics but this poses another problem for the end user that if he/she wants to use some custom attributes then since we have already filtered to the list of attributes instrumentation author knows of it is not possible. The remaining things like aggregation etc is still changeable by the user by registering custom views but it is attributes which is currently not possible for the user to change since the attributes themselves got filtered before recording the metrics. I am not sure how this works for other languages. But if a library owner has instrumented using otel even for other languages, owner would include certain labels by default which again will lead to the above problem and I think this problem will happen for every language. So the idea was to provide a custom set of attributes by default that can be extended by the user. The aggregation types and other things can still be changed with the current view kind API. And also if you see since an instrument author can provide a defaults for histogram buckets, unit, attributes, if we see this is more or less like a view only. Hence the suggestion was to either use individual methods in meter builders to mention these defaults or to use a default view kind of a concept only |
Ok, I need help understanding your intent as I manged to pick up only fractions of the ideas.
I suppose you mean many people would like to solve the problem of attribute explosion per a given instrument. In that case, I want to paste a link here that discussed a circuit breaker idea - limiting, in general, the number of attributes / metric data points reported per instrument: #1891
Can you please explain why pertaining mostly to Java instrumentation and not other languages SDKs?
I want to make sure I understand this correctly. I believe automatic instrumentation was created to allow instrumenting of existing open-source libraries because they still need to adapt to OTel. If you are using, for example using, Apache Commons HTTP client, the OTel Java agent will instrument it - meaning there will be instruments created at run time, and the measurements reporting will be injected mainly through aspects (weaving byte code to include before and after code to report). If I'm correct in my description, I don't understand "the instrumentation author declares a meter and records metrics". When you say instrumentation author you mean the person the contributed that specific code that defines the aspects and adds the instruments for that library? I also don't understand "But since the cardinality of certain dimensions can be high, those needs to be filtered while authoring the instrumentation.". From my understanding, the author of a library knows the attributes they will use to report the measurements. The same applies to the person writing that Contrib library which instruments an existing library. The only thing they don't know is the cardinality of each attribute. The end-user utilizing those metrics needs some way to control that, right? So we can have the following: Automatic circuit breakers, removing offending high-cardinality attributes and issuing warnings. Users can self-correct by changing the view and removing that attribute, which will not make them lose data, only lose precision - since the SDK will roll up the measurements to the specified attributes.
But we need to distinguish the people who defined those instruments and would be well aware of the attributes they will use and hence their cardinality. If they notice it's too high, they can define to simply alter the code and not report them, no? I guess the only case I see that makes sense is that the instrument's author is someone who wrote it for their library. They report 5 attributes but decide that by default, only 2 will be defined; hence measurements are rolled up to those 2. If the end-user wishes to have more precision, so like in logging when they switch to debug mode, they can alter the attributes using views. I hope these clarifications will help me understand your intent better. |
Let me first explain a use-case and then I will explain with an example. So let us say I am a b2b business, and I have some 50 business which connects to me via an API call. Now if I just take Now the span can itself contain a lot of other atttirbutes like, user-agent, host address etc. But since these can be high cardinality dimension, the intrumentation author would like to put some defaults value to aggregate upon. Here he can follow 2 approaches:
Now let us say my use case is that I want to get this If the instrumentation author goes via Approach 1: If the instrumentation author goes via Approach 2: If the user mentions the same view name like instead of For making things more simple and efficient, if in Approach 2 itself we have this functionality where we can identify what is a default view that instrumentation author specified and the one that user specified, we can replace the view supplied by the author with the user view and everything will work seamlessly. We can extend this argument for other things like histogram buckets, aggregations etc. Now this might sound like an implementation problem for a particular language and how they treat similar views and partially it is true as well. But if there is a well defined distinction between the two, it will be language agnostic, where the author will define the defaults and the users will override the defaults. This can either be done in 2 ways:
This is also a good approach, something similar you have mentioned in the other comment. Also this is more synonymous of how micrometer and prometheus clients does. Approach 2:
Now if the user view is available we will use that instead of this defaultView. This is just reusing the existing concept, not sure how cleaner solution can this be. I hope the above helps. I see that some of the other languages also support automatic instrumentation. I haven't had a chance to work with them, but these concepts and problems may still apply there. |
Your examples don't seem to address this concern. Was this just a side note or are the approaches mentioned supposed to be related? |
Yes this is related so when we have auto instrumentation, we don't have a direct access to the spans created by the auto instrumentation agent. Hence to add custom attributes in spans generated by the auto instrumentation, the only approach as far as I know is to use a span processor which gets all user spans and the auto instrumented spans. From my experience of using open telemetry and identifying multiple issues in multiple repositories, I can identify that there are few important limitations which a lot of people are facing today. As I stated in the above example that I would like to add this The above thing might be unrelated to this issue, I was just highlighting how a lot of individuals see these drop in instrumentation agents and what they would want to get out of them. |
Ok, I read through this twice. As I was reading it, I had many questions, so I try to describe the problem based on your description. There's a good chance I might have made some mistakes since I had many unanswered questions from your description. A person is using OTel Java SDK. This person uses, in our example, an HTTP client. This person also uses OTel Java Auto Instrumentation, and this library now has metrics emitted for it. The instrumentation author, in our example, created an instrument named // response sent code
httpServerDuration.record(httpRequestDuration, httpResponse.code(). httpRequest.method()) This person's usage of this HTTP server is located in our example in some micro-service, and this service receives requests which certain business contexts - in our example, a customer ID. During the request processing, the shared context in OTel Java SDK is filled by the business logic code with the The person's goal is to have that customerId used as an additional attribute when If I understand correctly, the default behavior when using the default view is to ignore any attribute recording in the shared context object. .registerView(
InstrumentSelector.builder()
.setName("http.server.*")
.build(),
addBaggageAttribute(View.builder(), "customerId")
.build()
// helper method to complete code ergonomic
private static ViewBuilder addBaggageAttribute(ViewBuilder viewBuilder, String attributeName) {
SdkMeterProviderUtil.appendFilteredBaggageAttributes(viewBuilder, baggageKeyName -> baggageKeyName.equals(attributeName));
return viewBuilder;
} That view, upon record using attributes Do you think that would accomplish your goal? If not, I'll try to address and write all my questions about your comment. |
@asafm you have rightly understood the problem, let me just try and go through this example and revert back with my doubts if any. |
@asafm I checked that this is working, I guess this is not well documented and hence I and lot of other folks have missed this, as a result there are so many issues and discussions open to address this. I checked the commit history of the specs earlier and somewhere I saw this was documented earlier but later removed. @mateuszrzeszutek this should solve the problem of attributes. @asafm but still I feel there is still a gap in specifying some metric related defaults especially for histogram buckets. For example Currently by default the histogram bucket used here is the default bucket of histogram but these are 2 different metrics with very wide range of difference in their values. It will be better if in SemanticConvention we have some means of providing standard bucket sizes and let it be used in the auto instrumentation libraries. Users today have the option to change the buckets using custom views, but not all users will make try to make a custom extension repo, publish their own version of the auto instrumentation jar and use it. Safe defaults for them will be very helpful. |
I am not a maintainer, so I'm writing what I think:
In general OTel's biggest caveat, today is its documentation, but it's also the most easily fixed caveat.
That's why there is this issue: #2229
I was wondering the same thing. I advise you to open an issue related to adding buckets and perhaps units to semantic conventions. |
Let me recap what I think this issue is asking for:
I think the hint API solves #1 and #2. Looking at @anuragagarwal561994's comments I'm not sure #3 is fully solved today. For the purpose of this issue, I'd like to just focus on solving #1 + #2 and we can open new feature requests for #3 as we recognize deficiencies. |
@jsuereth 3rd point we can actually solve as suggested by @asafm by using a baggage and creating a custom view to include attributes from baggage but as stated this is not clearly documented hence I could see few issues addressing the same problem. A view is not mergeable, if there are 2 views existing both will start recording. But overall from a user perspective we can state something as follows:
The 1st problem exists even today and I guess in specification itself we can't mention buckets in histogram, it is only applicable via views and I said above views are not mergeable thus it becomes difficult to mention defaults by the instrumentation author. 2nd, users can change the bucket via the views API 3rd, users can use the views API to delete something. Users can use baggage API and view combination to their own attributes but in Java library if something like |
For solving the 1st problem, since merging / replacing or choosing a view is another complicated problem to solve. We can include default histogram buckets separate from views API and override if a view is present. |
@anuragagarwal561994 The "hint" API is meant to be a "default view" for any instrument that the instrumentation can provide. If the user provides one it overrides it. I'm not sure we need to invent different mechanisms than the one we were planning to add. Hopefully this solves the first issue you list. Effectively, in instrumentation, you'd have something like: meter.histogramBuilder("http.latency)
....other things
.hints(defaultViewBuilder ->
defaultViewBuilder
.setAggregation(....)
); I was putting together a proposal API for this where effectively any instrument can take a "view" configuration as a "hint". Naming - I'm bad at that, but likely something like ".hintDefaultView" or ".defaultViewHint" some such, to denote that it can be overriden by the SDK. cc @reyang to make sure this aligns with his thoughts on hint API. |
+100 with everything @jsuereth said in #3061 (comment) Some additional thinking
In this way if the user decided to configure a different aggregation type (e.g. base-2 exponential bucket histogram), they can still benefit from the "default/recommended dimensions". |
@jsuereth my original suggestion was as well to build it using view API hence I started code changes and discussions on open-telemetry/opentelemetry-java-instrumentation#7474 But sooner diving deep in the implementation of view API I realised and as I have stated earlier that it is not mergeable. Hence if two views are used for the same meter then 2 times it will be recorded. I am assuming the hint API would be like either use the author specified default view (provided by hint API) or use user specified view where user explicitly mentions everything like attributes, buckets etc. if overriding is needed. I think merging can be possible in this case just between the hint API and the user provided view, if given more thought. But @reyang I guess for mentioning the above things and using the View API itself for hinting, one might not require list of Views, one view can cover all the three requirements. |
Not sure if I understand, I think View is only provided by the SDK and is intended to be used by the application developers; Hint is provided by the API is intended to be used by library developers (e.g. instrumentation library or instrumented library). For each instrument, View should be specified only once (or maybe the last one wins if certain implementations chose to support multiple). |
I think an example or an initial design proposal would help us ideate and discuss more on how the hints API will look like and what problems it will solve. |
Is there an issue covering the Hints API so we can discuss this in context, as this issue was only focused on setting default attributes, and #2229 is focused on setting default aggregation and its configuration?
@reyang can you please expand on the vendors adding hints? How did you envision it? Also, I am still trying to figure out why these are called hints. Is there a possibility the instrument author specified Explicit Bucket aggregation and the Metric Reader would just ignore it and decide on the aggregation by itself (say the user didn't specify views)? I looked at it as setting defaults for any view created, hence meter.histogramBuilder("http.latency)
//....other things
.setViewDefaults(ViewDefaultsBuilder.builder()
.setAttributes(...)
// Example 1
.setAggregation(ExplicitBucketAggregation.builder()
.setBuckets(10, 20, 300)
.build())
// Example 2
.setAggregation(MyCustomSummaryAggregation.builder()))
) In my opinion, we should use a typed way of specifying those defaults, as users will be confused if they need to go online to find out how to define buckets.
We can use a type system that will also support adding custom aggregations. Library owners won't need ever use that, so only to be used when defining a view in the SDK or creating the instrument in their code. @anuragagarwal561994 - regarding what you wrote about the ability to merge views. The way I see it, the new methods we will add will allow setting the defaults for a view. If the user explicitly created a view setting the attributes, it will override the default attributes set by the instrument author. Yet, if the user only sets the aggregation, they will get the rest of the defaults set on their view: attributes in this example. I think the defaults should be constrained to what makes sense: aggregation and its configuration and attributes. You don't need to specify a name or a unit because the instrument API already covers it. @reyang you wrote
What did you mean by "Use histogram be default"? Aren't they doing it by writing |
Fixes #2229. Related to #3061 (lays groundwork but does not resolve). Related to #2977, which may use this new API to have `http.server.duration` report in seconds instead of ms without changing / breaking default bucket boundaries. Summary of the change: - Proposes a new parameter to optionally include when creating instruments, called "advice". - For the moment, advice only has one parameter for specifying the bucket boundaries of explicit bucket histogram. - Advice can be expanded with additional parameters in the future (e.g. default attributes to retain). The parameters may be general (aka applicable to all instruments) or specific to a particular instrument kind, like bucket boundaries. - Advice parameters can influence the [default aggregation](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#default-aggregation), which is used if there is no matching view and if the reader does not specify a preferred aggregation. - Not clear that all advice will be oriented towards configuring aggregation, so I've intentionally left the scope of what they can influence open ended. I've prototyped this in java [here](open-telemetry/opentelemetry-java#5217). Example usage: ``` DoubleHistogram doubleHistogram = meterProvider .get("meter") .histogramBuilder("histogram") .setUnit("foo") .setDescription("bar") .setAdvice( advice -> advice.setBoundaries(Arrays.asList(10.0, 20.0, 30.0))) .build(); ``` Advice could easily be changed to "hint" with everything else being equal. I thought "advice" clearly described what we're trying to accomplish, which is advice / recommend the implementation in providing useful output with minimal configuration. --------- Co-authored-by: Reiley Yang <[email protected]>
Hey, Now that we have the advice API for histograms (and also because there were some people asking for this recently in the Java instrumentation SIG) I wanted to bump this issue and clarify the original ask a bit. I'd like to add an "attributes" advice to every instrument that specifies which of the observed attributes should be kept in a metrics observation. E.g. I'd like to define the duration = meter
.histogramBuilder("http.client.duration")
.setUnit("s")
.setDescription("The duration of the outbound HTTP request")
.setAdvice(advice ->
advice
// default buckets as defined in semconv
.setExplicitBucketBoundaries(List.of(0.0, 0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1.0, 2.5, 5.0, 7.5, 10.0))
// default attributes as defined in semconv
.setAttributes(Set.of(
"http.request.method",
"http.response.status_code",
"network.protocol.name",
"network.protocol.version",
"server.address",
"server.port",
"server.socket.address"))
.build(); In the instrumentation code, we'd pass the same The View API seems to be somewhat ready for this addition, it even explicitly mentions the "hint API":
I could work on a PR if there's no opposition towards this feature. |
This is a great addition to advice, IMO. Its actually simpler that the explicit bucket boundaries parameter: Explicit bucket boundaries can be defined by advice, by metrics readers (via aggregation preference), or by views. Attribute keys would only be definable via advice and views. There's no metric reader preference to contend with. |
Just adding a +1 on this issue from the .NET team to show we expect it to have value outside of Java. Imagine we have a library shipped in .NET 8 that contains a new metric. Sometime after that library ships users give us feedback that adding a "Foo" attribute on this metric would be really useful. So naively we add the new Foo attribute to the instrument, ship it in .NET 9 and now a different set of users show up very angry. As soon as they updated to .NET 9 and got the first bill from their telemetry backend they owe a lot of money because there was 100x growth in the amount of data being saved when the new attribute sub-divided all the data. Another set of users show up angry because even though their bill only increased a little, they hit built-in limits on data upload or number of timeseries which caused other important telemetry data to be lost. For now the only way I can see libraries adding new attributes and maintaining a high compat bar is if they add new library-specific public APIs that allow library users to explicitly opt into each new attribute. While I expect it would work, it feels like it diminishes the value of having a View API if library users are required to configure attributes using library-specific APIs instead. If library authors could add the new "Foo" attribute but mark it disabled by default, that preserves compatibility for existing library users while allowing those who want it to opt-in with a View. |
…telemetry#3216) Fixes open-telemetry#2229. Related to open-telemetry#3061 (lays groundwork but does not resolve). Related to open-telemetry#2977, which may use this new API to have `http.server.duration` report in seconds instead of ms without changing / breaking default bucket boundaries. Summary of the change: - Proposes a new parameter to optionally include when creating instruments, called "advice". - For the moment, advice only has one parameter for specifying the bucket boundaries of explicit bucket histogram. - Advice can be expanded with additional parameters in the future (e.g. default attributes to retain). The parameters may be general (aka applicable to all instruments) or specific to a particular instrument kind, like bucket boundaries. - Advice parameters can influence the [default aggregation](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#default-aggregation), which is used if there is no matching view and if the reader does not specify a preferred aggregation. - Not clear that all advice will be oriented towards configuring aggregation, so I've intentionally left the scope of what they can influence open ended. I've prototyped this in java [here](open-telemetry/opentelemetry-java#5217). Example usage: ``` DoubleHistogram doubleHistogram = meterProvider .get("meter") .histogramBuilder("histogram") .setUnit("foo") .setDescription("bar") .setAdvice( advice -> advice.setBoundaries(Arrays.asList(10.0, 20.0, 30.0))) .build(); ``` Advice could easily be changed to "hint" with everything else being equal. I thought "advice" clearly described what we're trying to accomplish, which is advice / recommend the implementation in providing useful output with minimal configuration. --------- Co-authored-by: Reiley Yang <[email protected]>
What are you trying to achieve?
As an instrumentation author, I want to be able to be able to pass all the attributes that I have on hand and specify which ones should be used in the default view; the user should be able to override these with a view.
In the Java instrumentation repo we have added several "temporary view" classes that limit the attributes to the ones that are defined in the metric spec; for example, we remove the
http.target
attribute from HTTP server metrics because it has too high cardinality (even though it is available at the moment).Unfortunately, this makes it really hard (if not impossible) for our users to add custom attributes to the metrics created by our instrumentations - any custom stuff is discarded before it hits the API.
I think this is somewhat similar to the histogram hint API issue that I created a while ago (#2229) - I want to use the API to provide some hints for the SDK, so that it can incorporate them in the default view. Any user-provided view would override these.
Additional context.
The text was updated successfully, but these errors were encountered: