-
Notifications
You must be signed in to change notification settings - Fork 894
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Should net.peer.* describe socket-level peer or "logical" peer (for HTTP and in general)? #2607
Comments
the best known remote endpoint attribute
If this call came through multiple proxies, service meshes, lbs, and whatnot on the way to the remote endpoint, the caller does not know. I.e. in a generic case, it's not guaranteed to be accurate, so that's why it's the best-known peer name. socket-level attribute Let's say another socket-level attribute describes the real destination- instrumentation can know about it, but not necessarily:
It's also not clear at all what such attribute means: proxy? DNS server? service-mesh side-car address? first router? Network issues can happen at any of them. Would this attribute be required? No - in a lot of cases it will be exactly the same as the "best-known remote endpoint" attribute. Summary I suggest we do Option 4+:
|
By "best" you probably mean the logical destination. And "endpoint" would seem a good name for it. net.endpoint.* for logical stuff, net.peer.* for low-level stuff?
I agree with that net.peer.* in that low-level sense should be made non-required everywhere. It is a low-level debugging attribute. But it is more often available than you think, e.g. you might be able to use reflection to access the underlying TCP socket and call getpeername(2) on it https://man7.org/linux/man-pages/man2/getpeername.2.html (that' actually where the "peer" name comes from).
JMS clients often only know the messaging.destination, which may not contain a hostname, e.g. in a messaging setup that is managed by an (EJB) application server. They could set neither a logical nor low-level net.peer.* attribute purely with arguments you would use for connecting.
I don't get that, I don't know Jaeger nor Prometheus query syntax, but if you want to know the remote service name (as in service.name), you would need the peer.service attribute, which is unrelated to net.* but cannot usually be set automatically https://github.com/open-telemetry/opentelemetry-specification/blob/v1.3.0/specification/trace/semantic_conventions/span-general.md#general-remote-service-attributes |
+1 |
I hear that peer.name is important and common for sockets, but then it should be clearly defined for this case only. Correct me if I'm wrong, but It seems socket-level instrumentation should also be clarified beyond address (at least the address family should be there to understand how to understand this address). Until it's clarified
I want to be able to search for (e.g.) http client request duration by the remote endpoint name across all languages and instrumentations. For this, I need one required attribute that all instrumentations set. And I want this to be available without a pipeline that normalizes my data (hence Jaeger or Prometheus, which honestly show you what instrumentations put on spans). Ideally, I want to use the same attribute name for my DB calls and other client communication. Such attribute collected automatically will improve user experience, simplify instrumentation, remove redundancy, and would also optimize most of the pipelines removing the need to support multiple variations.
Using reflection should not be the approach we recommend for general-purpose instrumentations. |
That's right, the result of getpeername(2) would be used as net.peer.ip. On the other hand, the result of gethostname(2) (https://man7.org/linux/man-pages/man2/gethostname.2.html) could be used as net.host.name (but usually it should be enough to set it as host.name on the resource). |
I'm not sure about that. If the library does not have built-in tracing, then this often will be useful. I think it would be a useful guideline in the context of required attributes, that they need to be available on the API-level, i.e. they should be something that the application passes down to a library that does the requested operation, as opposed to something that the library "calculates" internally (and may or may not expose with public getters). |
This is probably a policy decision that should be discussed more broadly. Ease-of-producing vs. ease-of-consuming, reduncancy, etc. |
So what you're saying is that there is no clear guidance on when and how
Update: back to the subj of this issue:
I'd propose renaming the namespace for the socket level things to |
so it's either an optional attribute or opt-in specialized instrumentation. I'm happy we're on the same page here. |
HTTP proxy can report target peer address by adding Edit: as you are changing existing spec for this in non-backward compatible way, please consider use cases which I reported recently:
|
You mean there should also be a net.peer.addr in addition to net.peer.sock.addr and net.peer.name? Could make sense... |
Yes, something like this - both proxy and target addresses may be useful. BTW, I have realized that |
A full proxy chain is something else. An app wants to contact some logical host, that's net.peer.name (+ maybe net.peer.addr which is not currently proposed). To do that it connects to some host on the socket level, that's net.peer.sock.name/net.peer.sock.addr. So the app does have some kind of direct relation to these two hosts. Any hops in-between are of much less relevance. Definitely something that could be added (arrays net.hop.addrs/net.peer.hop.names/net.hop.ports/maybe net.hop.addr_families?) but IMHO out of scope of this issue that mostly aims to clarify/improve how to provide information we already have in some form. |
(from discussion with @lmolkova in #2469 (comment))
In the HTTP protocol, the peer you connect to can be unclear if an HTTP-level proxy is used. In that case, an application opens a TCP connection to the configured proxy host and port, and sends an n HTTP request with a full URL as target to it (
GET http://hostname.example.com/some?thing=1 HTTP/1.1
).In this case, it is not clear whether net.peer.name should be the hostname from the requested URL (hostname.example.com) or of the HTTP proxy the application sends the HTTP request to.
One thing that seems clear is that neet.peer.name and net.peer.ip should be consistent which each other. If net.peer.name is set to a logical peer distinct from the socket-level peer, net.peer.ip would have to be left unset and there would currently be no way to store network-level connection information.
There is also an issue with gRPC which (usually?) goes over HTTP/2, but the RPC conventions require at least one of net.peer.ip or net.peer.name. Clearly, in absence of any other attribute to hold it, the destination hostname would be more useful than the hostname of an HTTP proxy.
An open question to me is, whether this problem is specific to HTTP (and HTTP-based protocols), or if this is common also in other protocols.
net.sock.*
andnet.app.*
namespaces to have unambiguous options.Option 1 would be the easiest. Option 3 would be the most complex, also for processing on the backend, but would be future-proof also in case some protocol defines multiple peers with different roles that you might connect to with the same span. That seems unlikely, so I don't think Option 3 is very useful.
Option 2 is similar to Option 1 and would be preferable if we find that net.peer.* is overwhelmingly used for either network-level or logical peer information, instead of being used for both in different instrumentations/languages. Choosing option 2 would need a survey of some sort to find how feasible it is and choose one or the other option.
The text was updated successfully, but these errors were encountered: