[spanmetricsprocessor] panic when using spanmetric with v0.56 #12644
Comments
Pinging code owners: @albertteoh |
@amoscatelli are you able to reproduce this error in 0.55.0 or before? |
@TylerHelmuth I can confirm 100% this issue is reproducible only with 0.56.0 |
It appears to be a race condition in
where the error I suspect this is because the metrics are computed in a separate goroutine from the trace "stream":
I'll see if I can reproduce the panic locally first to prove this hunch. |
Would making a deep copy of the trace data to use in aggregateMetrics solve this issue? |
I think so 😄 The following seems to reproduce the race condition via unit tests, with some hacks:
Applying the following fix passes the unit test:
|
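For readers following the thread, the deep-copy idea discussed above can be sketched in isolation. This is not the patch from this issue, and it does not use the collector's pdata API. The `span` type and the `deepCopy`, `aggregate`, and `process` functions are hypothetical stand-ins, assuming only what the thread states: metrics are computed in a separate goroutine while the original trace data continues down the pipeline, and custom dimensions are read from span attributes.

```go
package main

import (
	"fmt"
	"sync"
)

// span is a hypothetical stand-in for a span; the real processor works on pdata types.
type span struct {
	Name       string
	Attributes map[string]string
}

// deepCopy returns spans that share no attribute maps with the input.
func deepCopy(spans []span) []span {
	out := make([]span, len(spans))
	for i, s := range spans {
		attrs := make(map[string]string, len(s.Attributes))
		for k, v := range s.Attributes {
			attrs[k] = v
		}
		out[i] = span{Name: s.Name, Attributes: attrs}
	}
	return out
}

// aggregate counts spans per value of one dimension (an attribute key).
func aggregate(spans []span, dimension string) map[string]int {
	counts := make(map[string]int)
	for _, s := range spans {
		if v, ok := s.Attributes[dimension]; ok {
			counts[v]++
		}
	}
	return counts
}

// process gives the metrics goroutine a private snapshot, then lets the
// next pipeline stage work on (and possibly mutate) the original data.
func process(spans []span, dimension string, next func([]span)) map[string]int {
	snapshot := deepCopy(spans) // taken before any other goroutine touches the data

	var wg sync.WaitGroup
	var counts map[string]int
	wg.Add(1)
	go func() {
		defer wg.Done()
		counts = aggregate(snapshot, dimension) // reads only the snapshot
	}()

	next(spans) // runs concurrently with aggregate, on the original
	wg.Wait()
	return counts
}

func main() {
	spans := []span{
		{Name: "GET /a", Attributes: map[string]string{"http.method": "GET"}},
		{Name: "POST /b", Attributes: map[string]string{"http.method": "POST"}},
	}
	// A downstream stage that mutates attributes, as a later processor or exporter might.
	mutate := func(in []span) {
		for _, s := range in {
			delete(s.Attributes, "http.method")
		}
	}
	counts := process(spans, "http.method", mutate)
	fmt.Println(counts["GET"], counts["POST"]) // prints "1 1"
}
```

If `aggregate` were handed `spans` directly instead of `snapshot`, the attribute maps would be read and written concurrently, which `go run -race` reports as a data race. The trade-off of copying is extra allocation per batch.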
Do we know of any workaround that would prevent this? |
A possible workaround (I have not tried this) is to not configure additional spanmetricsprocessor dimensions on top of the defaults, if it's acceptable to the use case. This is because the panic comes from trying to fetch a span's attributes to populate the additional custom dimensions. For example, in this issue, the spanmetrics config contains:
So the possible workaround (again, if acceptable to the use case) is:
Otherwise, a code change is the only alternative I can think of to address this issue. |
Thank you, but sadly it's not a workaround for us. |
We tried it and it did work out for us, as this is still useful without any additional dimensions 👍🏻 ⭐ |
Describe the bug
Using v0.56 with the spanmetricsprocessor causes a panic that crashes otel collector contrib, which then becomes totally unusable and unresponsive.
Steps to reproduce
Not sure; probably just using the spanmetrics processor?
For me it usually happens a few minutes after starting the server.
What did you expect to see?
no crash, no panic, no errors
What did you see instead?
What version did you use?
otel collector contrib docker image v0.56
What config did you use?
Config: (e.g. the yaml config file)
Environment
AWS Elastic Beanstalk Linux v2
Additional context
This makes otel collector completely unusable