-
Notifications
You must be signed in to change notification settings - Fork 178
This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
active record: many spans created for internal operations #35
Comments
In the example above, it seems like perhaps there's some As far as the behavior around
This is an incredibly similar pattern to what we do in As far as a more general "changing this behavior to only include functions invoked directly by users.", I'm not against having a configuration option for something to this effect, but as i mentioned above, practically speaking i'm not sure how it would be implemented in a robust way. |
Thanks eric,
I agree that it's losing value. If taken to the extreme, we can imagine theoretical instrumentation of each function invocation in the application which will give an amazing performance profile but introduce a big cost and traces that will be harder to visualize. I think that noisy instrumentation should offer some "slim" or "non-verbose" mode which users can hop in to reduce their cost and visual load in the UI (which is important to some of our users).
Regarding instrumenting only top-level API calls, this can be achieved by adding a custom key into the context, and skipping instrumentation if this key is already present. we implemented this approch for nodeJS typeorm instrumentation and it's working good so far. I believe this can also be implemented in ruby if I'm not missing anything.
I imagined only instrumenting the top-level Thanks for taking the time to read this issue and respond |
@blumamir that sounds novel and reasonable to me. I think that's a permissible use of context according to a quick reading of the specification. I'd been keen to hear if other folks have concerns about this approach, @open-telemetry/ruby-approvers , if not i'm happy to assign this issue to you or your colleagues, and can happily review a PR. For posterity (and because i feel like writing excessively long messages 😄), a path I'd prefer not to go down with instrumentation is something you mentioned:
While I agree that certain instrumentations should be designed with different modes of verbosity (beside sidekiq, another example would be web frameworks which have excessively high numbers of low-value chained middleware spans), I think it's dangerous to begin to write instrumentation that is overly worried about "their cost and visual load in the UI", since UIs are explicitly not a concern of OpenTelemetry, afaik. Personally speaking that mindset is little silly to me because obviously instrumentation doesn't exist in isolation, it exists as part of an observability system, but this is a choice OpenTelemetry has been intentional about and I'm trying to approach these problems as a representative of OpenTelemetry. I'm not sure what the right mental model is when writing instrumentation for opentelemetry, maybe whether the amount of data generated (high volume of traces, high volumes of spans in a trace, and traces which continue ad infinitum, etc) has value exceeded by the cost of having to transform that data into a wire format and send it across the wire. Maybe just "as long as observer-effect isn't noticeable". I'm not sure. Anyway, tl;dr we're doing similar-ish stuff in other instrumentations already, and if it unblocks end users i'm in favor of it (as an off by default config option). If there are practical issues voiced by others with this approach instead of just theoretical ones, I'm open to hearing them (a part of me wonders if this isn't something a sampler should be handling and not the instrumentation itself), but imo i think this can move to implementation at this point. |
I think it's totally reasonable to use context in this way to suppress noisy instrumentation invoked recursively by the framework that's instrumented. From a support perspective, I think it's useful to default to suppressing the nested AR spans, since noisy traces are generally only useful for debugging a problem once it has been identified, and are generally just annoying (or worse) if things are 👌 . This suppression should be configured by an option on the AR instrumentation and should only affect the AR instrumentation. That means that if you have DB driver instrumentation enabled, its spans will still be present as direct children of the top-level AR span instead of under the now-elided nested AR spans. If the DB spans are noisy at that point, I'd recommend either a custom sampler or removing the DB instrumentation. |
Great iterating over this with you guys, like the brainstorming :)
Interesting 🤔 I wonder if we can find a solution for AR that better fits OpenTelemetry, is more generic and requires less code changes. Is it reasonable to use a custom sampler for that? |
@blumamir I would imagine a custom sampler could handle this (though i'm not 100% sure, i think you'd need to maintain the state that a context key otherwise would somewhere), or alternatively something like the approach you outlined above. Without diving into the code it's hard for me to say for sure, ActiveRecord is probably a bit more complex than whatever ORM this is being modeled after and there are probably some edge cases we're not considering. I think this is good discussion but at this point I'd encourage you to spike on an implementation that you think represents a good start to this work, and push up a PR or branch (can be a draft), and we can discuss this more concretely and work toward getting this behavior supported. If you dont have bandwidth let me know and i can see if i can carve out time. |
Thanks very much @ericmustin At first I planned on adding configuration per noisy instrumentation in order to reduce verbosity. This is what we've done so far in node's instrumentations. I'm working on other features in parallel but will update here once I have news. In the meantime, any thoughts and concerns are welcome on this thread |
This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
Description of the bug
This is not actually a bug. I investigate heavy traces with many many active record spans and found the following:
create
,save
,transaction
.create
is invoked, active record checks if the argument is an array, and if so, it invokescreate
for each element in the array:create
invokes thesave
function on each object, which is also instrumentedtransaction
method which is also instrumented. Could not reproduce it myself but I can see it generated in a some other environment.For example, the following function invocation:
Creates the following output in Jaeger:
I am thinking about whether it gives good value to instrument the public methods when they are invoked from code inside the active record module itself. This follows the actual code being executed in reality, but create large traces with many spans:
create
span withn
children of same operationcreate
with child ofsave
each. This, I suspect, is not intuitive to the user which was only invokingcreate
once. I was confues myself until I looked at the rails source code to make sense of what I am seeing.I post here this issue to hear opinions and consult if you are open to changing this behavior to only include functions invoked directly by users. This can also be implemented as instrumentation config option I guess.
The text was updated successfully, but these errors were encountered: