Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(instrumentation-pg): support custom SQL Commenter attributes #1454

Conversation

joelmukuthu
Copy link

Which problem is this PR solving?

Short description of the changes

@joelmukuthu joelmukuthu requested a review from a team April 3, 2023 09:00
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Apr 3, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

@github-actions github-actions bot requested a review from rauno56 April 3, 2023 09:01
@@ -47,6 +47,8 @@ function arrayStringifyHelper(arr: Array<unknown>): string {
return '[' + arr.toString() + ']';
}

export type SqlCommenterAttributes = Record<string, string>;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be preferred that I list all the known attributes instead of the index signature? Or both?

const customSqlCommenterAttributes = context
.active()
.getValue(
Symbol('customSqlCommenterAttributes')
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this bit, any suggestions are welcomed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work. This names the symbol but it is still unique each time it is called. You would need to use Symbol.for in order to get the same symbol each time, but I think this approach loses some of the advantage of using a symbol.

Using context seems like a reasonable way to go about this but I haven't seen any other instrumentations do this in any language I think. I think the idea of using context to inject things into instrumentations has merit, but maybe the mechanism can be generalized to other instrumentations as well.

@open-telemetry/javascript-maintainers what would you think?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You would need to use Symbol.for in order to get the same symbol each time, but I think this approach loses some of the advantage of using a symbol.

You're quite right, I was forced to use a Symbol here because that's what the API expects. I think if we came up with a standard way of using the context to pass data to instrumentations, then there'd be a different API here.

I'm not sure I can offer much insight into what the API could look like but I'm keen to hear what the others think and very happy to implement what is agreed upon 🙌

@pichlermarc
Copy link
Member

Hi @joelmukuthu, thank you for opening this PR 🙂
In order for us to proceed with the PR, the CLA needs to be signed first (see #1454 (comment) above). 🙂

@joelmukuthu
Copy link
Author

@pichlermarc my bad. thanks for the info, now signed!

@codecov
Copy link

codecov bot commented Apr 4, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.87%. Comparing base (9268716) to head (be38c3e).
Report is 464 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1454      +/-   ##
==========================================
- Coverage   96.13%   95.87%   -0.27%     
==========================================
  Files          14       18       +4     
  Lines         906     1235     +329     
  Branches      197      270      +73     
==========================================
+ Hits          871     1184     +313     
- Misses         35       51      +16     
Files Coverage Δ
...elemetry-instrumentation-pg/src/instrumentation.ts 91.21% <100.00%> (ø)
...node/opentelemetry-instrumentation-pg/src/utils.ts 98.33% <100.00%> (ø)

... and 2 files with indirect coverage changes

@github-actions
Copy link
Contributor

This PR is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.

@github-actions github-actions bot added the stale label Jul 24, 2023
@blumamir blumamir removed the stale label Aug 10, 2023
@blumamir
Copy link
Member

Hi @joelmukuthu ,
Sorry for this PR sitting here so long without reviews.

Are you still interested in this? if so, please rebase and resolve conflicts and I'll review it so we can merge soon.

Thanks and sorry again for the delay.

@joelmukuthu
Copy link
Author

joelmukuthu commented Aug 18, 2023

No worries @blumamir, I understand that this is a slow cooker. I'm happy to rebase it, but the main blocker here is that we don't yet have an agreed-upon way of passing custom attributes into the instrumentation. I suggested doing it via the context and as there hasn't been a lot of opposition to that, I'll give it a go as soon as I can (early September) at which point this will be ready for review.

@blumamir
Copy link
Member

No worries @blumamir, I understand that this is a slow cooker. I'm happy to rebase it, but the main blocker here is that we don't yet have an agreed-upon way of passing custom attributes into the instrumentation. I suggested doing it via the context and as there hasn't been a lot of opposition to that, I'll give it a go as soon as I can (early September) at which point this will be ready for review.

From reading your issue, I understand that you plan to use this feature to inject these attributes?

action, controller, framework and route.

If I understand the motivation correctly, the reason you choose to use context is that, for example, nestjs instrumentation can inject these values into the context to be picked up later on by downstream pg instrumentation?

Or do you imagine some middleware that the user will need to hook into the framework? or will it be completely manual from user code with no automatic support from OTEL packages?

I think in this case it does make sense to inject these values via context. Since the values are not static, and change from operation to operation (for example, the route), I don't see any other way which is more elegant. To my understanding, context api was designed for these cases exactly to propagate useful information downstream.

There are a few existing examples of the usage of context in the project:

  1. rpc-metadata
  2. suppress-tracing
  3. baggage which is an opentelemetry speced API.

What they all have in common is that they define the type and getters/setters in a neutral package which can later be consumed by: users or other instrumentation that want to set these values upstream, users and other instrumentations that want to get/read/use these values downstream.

The one thing that I am not comfortable with is that the way you currently access the context key is via "customSqlCommenterAttributes" which makes it specific to sqlCommenter. This would mean that we need to inject this value in many places and if other instrumentation will also need access to "framework" or "controller", they would have to read it from the SqlCommenter key which is not great, or will have to define there own type and key which means duplicating the data many times for each consumer.
I would even consider if you can extend the type RPCMetadata, which already claims to track the route, and add more info there? so that each instrumentation can additionally describe it's framework, controller etc and pg instrumentation will just read the values from there and inject them as SQL comments?

@blumamir
Copy link
Member

And one more question, what are these values use for?
I can understand how the traceparent and tracestate are used to emit telemetry such as new spans, but what happens with the other attributes such as route? is it only for logging the queries and gaining more context when examining a single statement in pg logs? or is it picked up by pg server somehow and indexed to be useful?

Since we already have the traceparent, the user can pick up any span and follow the complete context upstream to see the routes, frameworks, and the full story on what executed the query, and not just hand-picked attributes, no?

@joelmukuthu
Copy link
Author

@blumamir thank you so much for sharing the info you did! It's great to know that this kind of thing is what the context is designed for. I'll look into the examples you shared and suggest an approach (on this PR) to solving the problem the PR is trying to solve.

Wrt what problem this PR is actually trying to solve, yes, AFAIK these values are used to provide insights into query performance or just logged by other tools downstream. For instance, GCP's Cloud SQL captures the values and aggregates query performance:

image

Ref.

As for how I imagined this feature being used, it's exactly as you anticipated: the values can be populated by other instrumentation or manually by the user e.g. in some kind of middleware (e.g. express middleware).

The one thing that I am not comfortable with is that the way you currently access the context key is via "customSqlCommenterAttributes" which makes it specific to sqlCommenter.

Great feedback! My choice with the wording here was to try and minimise the impact or not ruffle too many feathers. But yes, there's nothing about these values (at least at the time of this writing) that is specific to SQL. I agree that a more generalised name is better.

It looks like extending RPCMetadata could solve the need here quite well. On that note, what do you think about using RPCMetadata itself? Else, if we go with extending it instead, do you have any suggestions on what name we could give it? I'm struggling to come up with a name, perhaps because I'm too close to the "SQL Commenter" problem.

@JamieDanielson
Copy link
Member

Thank you for working on this and sorry for the delay! We would need to do more research on this to be more confident in how this will work generally and be maintained. We're concerned that there may not be a common enough use case in the community for this feature, and we're hesitant to take it on right now. To effectively set expectations, we'd like to hold off on implementing this PR for now and if there is enough interest we can revisit in the future for a more generic option.

@joelmukuthu
Copy link
Author

@JamieDanielson thanks for the response. Just so I understand better, is it the "SQL Commenter" use-case that's unpopular or is there not enough justification to access attributes stored in the context from within instrumentations?

@JamieDanielson
Copy link
Member

@JamieDanielson thanks for the response. Just so I understand better, is it the "SQL Commenter" use-case that's unpopular or is there not enough justification to access attributes stored in the context from within instrumentations?

At the moment we maintainers have a limited bandwidth and as we are not well-versed in this feature, there is a lot more research we have to do to be comfortable adding and maintaining this change. We are currently trying to limit scope to focus on bug fixes and stabilization efforts before adding too many features to instrumentation libraries. I hope this explanation helps.

@joelmukuthu
Copy link
Author

Yes that helps, thanks for responding and for all the work you and your team put in!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add custom attributes to SQLCommenter comments in pg instrumentation
6 participants