Add metric advice to capture only a specific set of values for a given attribute, and normalize the rest to a value which represents "other" #3545
Comments
I actually think the opposite - you can correlate things in an even better way. Here are two examples:

Example 1: Correlation with exemplars - exemplars will give the exact value that we originally observed, rather than something like "OTHER".
We could get metrics saying:
Now one could go to their logging system and find all the

Example 2: Correlation with logs and spans -
All these steps can be automated and eventually baked into the systems, and the semantics/intention are very clear. I think the key here is that we don't collapse logs/spans to artificially "align/correlate" with metrics, e.g. by having "OTHER" in all logs/traces/metrics. If we have a strong reason to collapse the data (e.g. dimension capping for metrics to avoid cardinality explosion), we should certainly do it, but not in a way that forces logs and traces to "align" by collapsing their data (I think once the data is collapsed, the lost information is hard to restore). Instead, we should find a better way to correlate them while keeping information loss low.
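To make Example 1 concrete, here is a minimal sketch. It assumes the SDK keeps the raw attribute value on the exemplar when it caps the value on the metric point; whether it would do so is a design choice, not something this thread settles. `Exemplar` and `MetricPoint` are simplified stand-ins rather than the SDK classes, `raw_values` is a hypothetical helper, and the trace and span IDs are made-up placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Exemplar:
    """Simplified stand-in for an exemplar, not the SDK class."""

    value: float
    trace_id: str
    span_id: str
    # Attributes recorded with the measurement but not kept on the metric point.
    filtered_attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class MetricPoint:
    """Simplified stand-in for an aggregated metric data point."""

    attributes: Dict[str, str]
    exemplars: List[Exemplar] = field(default_factory=list)


def raw_values(point: MetricPoint, key: str) -> Dict[str, str]:
    """Map trace_id -> raw value of `key` for every exemplar that kept it."""
    return {
        ex.trace_id: ex.filtered_attributes[key]
        for ex in point.exemplars
        if key in ex.filtered_attributes
    }


# The metric point carries the capped value; one exemplar still has the raw one.
point = MetricPoint(
    attributes={"http.request.method": "OTHER"},
    exemplars=[
        Exemplar(12.5, "trace-a", "span-1", {"http.request.method": "PROPFIND"}),
        Exemplar(3.0, "trace-b", "span-2"),
    ],
)
```

Calling `raw_values(point, "http.request.method")` then yields the trace IDs to look up on the tracing side, together with the uncapped value each one carried.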
I think this is the same as what I tried to explain above ("this feels very complex and brittle"). I think the one complexity left out in your example is that different resources could have different lists of preserved values (not saying this is a great practice), which I think makes that approach a good bit more difficult. To be clear, I think a valid decision here would be to say that the benefit of simplicity (single attribute) outweighs this use case of generically and easily correlating capped metric attributes with span attributes, given that there are still other options for this kind of correlation:
So a few points:
The main benefit I see here is that if users really want high-cardinality metrics, they could override the "hint" API with a view preserving the raw values. That would need to be a company-wide decision given point #2 above. I still prefer Option A, but wanted to add my thoughts on this one.
More importantly, this pattern would apply to all semantic conventions, rather than having to deal with specific domains one at a time. Today it is HTTP method vs. "strict" canonical HTTP method (restricted to 9 values) vs. "loose" canonical HTTP method (restricted to 37 values); later it will be SQL command vs. canonical SQL command. And as HTTP/SQL/etc. evolve, we'll keep inventing new field names and end up with a category of names that confuse anyone who doesn't know the full history.
This is irrelevant - we need a default set and configurability with both options. They differ only in the configuration mechanism (where Option C/E is superior).
This would be needed if we go with "Limit HTTP request method cardinality: use one attribute" (option C/E) as the solution for #3470.
The metric advice could take something like the following arguments:
- `string`
- array of `<value-type>`
- `<value-type>`
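As a rough illustration of how an SDK might apply such advice, the sketch below caps one attribute before a measurement is aggregated. `ValueCapAdvice` and `apply_advice` are hypothetical names, not part of any OpenTelemetry API. The three field names are assumptions chosen to match the three argument types listed above, and `__OTEL_CAPPED` is only the placeholder floated in this issue.

```python
from dataclasses import dataclass
from typing import Mapping, Union

AttributeValue = Union[str, int, float, bool]


@dataclass(frozen=True)
class ValueCapAdvice:
    """Hypothetical advice object; not an existing OpenTelemetry API."""

    attribute_name: str          # attribute the advice applies to (assumed name)
    preserved_values: frozenset  # values recorded as-is (assumed name)
    other_value: AttributeValue  # replacement for everything else (assumed name)


def apply_advice(
    attributes: Mapping[str, AttributeValue], advice: ValueCapAdvice
) -> dict:
    """Return a copy of `attributes` with the advised attribute capped."""
    capped = dict(attributes)
    value = capped.get(advice.attribute_name)
    # Leave the attribute alone if it is absent or already a preserved value.
    if value is not None and value not in advice.preserved_values:
        capped[advice.attribute_name] = advice.other_value
    return capped


# Illustrative configuration; the preserved list here is an arbitrary example.
method_advice = ValueCapAdvice(
    attribute_name="http.request.method",
    preserved_values=frozenset({"GET", "POST", "PUT", "DELETE"}),
    other_value="__OTEL_CAPPED",
)
```

With this configuration, a measurement recorded with `http.request.method` set to `PROPFIND` would be aggregated under `__OTEL_CAPPED`, while `GET` passes through unchanged and all other attributes are untouched.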
Open questions:
Should we have a dedicated "other" value?
For `string` attributes, the "other" value could be something obscure like `__OTEL_CAPPED`. A downside to this is that it will not render as nicely as something like `GET/POST/OTHER`, but it would allow backends to reliably detect when an attribute value was capped (we could use something less obscure like `OTHER`, but that seems like it could occur in instrumentation's natural habitat).

For `numeric` attributes, I'm not sure what could make sense as a dedicated "other" value across all use cases. One known numeric attribute which could possibly benefit from this feature is `http.response.status_code`.

It may be ok to have a dedicated "other" value only for `string` attributes.

How to address the correlation problem
The main argument against "Limit HTTP request method cardinality: use one attribute" (option C/E) is that you can't correlate `http.request.method` between metrics and spans for values that were capped (only) on the metric side.

But if you know the values that were preserved, and you know the "other" value, then you could still do this correlation (even if maybe less efficiently):

Potentially a backend could figure out the list of preserved values (at least those which would have correlated with any spans), by looking at the complete set of `http.request.method` values that were not capped in the given time period (and with given resource and instrumentation scope name, since the set of preserved values could vary across resources and instrumentation libraries).

But this feels very complex and brittle, so I'm not sure it's a great answer to the "correlation problem".
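The correlation described above can be sketched as follows. This is a minimal illustration, assuming span attributes are plain dicts and that metric points for one resource and instrumentation scope have already been grouped together; `span_matches_metric` and `infer_preserved_values` are hypothetical helper names, not backend or SDK APIs.

```python
from typing import Iterable, Mapping, Set

ATTR = "http.request.method"


def span_matches_metric(
    span_attrs: Mapping[str, str],
    metric_value: str,
    preserved: Set[str],
    other_value: str,
) -> bool:
    """Does a span's raw attribute value correspond to a metric value?"""
    raw = span_attrs.get(ATTR)
    if raw is None:
        return False
    if metric_value == other_value:
        # A capped metric point matches every span whose raw value was not preserved.
        return raw not in preserved
    return raw == metric_value


def infer_preserved_values(
    metric_values: Iterable[str], other_value: str
) -> Set[str]:
    """Guess the preserved set from the uncapped values seen on metric points.

    This only finds preserved values that actually occurred in the time window,
    and it has to be run separately per resource and instrumentation scope.
    """
    return {v for v in metric_values if v != other_value}
```

The matching step is simple once the preserved set is known. The fragile part is `infer_preserved_values`: it depends on a reliably detectable "other" value and on the window containing every preserved value that matters, which is the brittleness noted above.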