Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SDK SHOULD NOT provide access to a Span's attributes besides its SpanContext #1001

Closed
codeboten opened this issue Aug 17, 2020 · 16 comments · Fixed by #1560
Closed

SDK SHOULD NOT provide access to a Span's attributes besides its SpanContext #1001

codeboten opened this issue Aug 17, 2020 · 16 comments · Fixed by #1560
Assignees
Labels
good first issue Good first issue help wanted release:required-for-ga To be resolved before GA release sdk Affects the SDK package.

Comments

@codeboten
Copy link
Contributor

To be compliant with the tracing spec, the SDK SHOULD NOT provide access to a Span's attributes besides its SpanContext

https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/api.md#span

@codeboten codeboten added good first issue Good first issue sdk Affects the SDK package. help wanted release:required-for-ga To be resolved before GA release labels Aug 17, 2020
@shovnik
Copy link
Contributor

shovnik commented Sep 18, 2020

Is the intent to make the attributes someone sets for a span harder to access using name mangling, because usually in python, making a user set properties private and only accessible with getters/setters seems strange. In python's case, can we not just say not to access the attributes in the doc and trust the user?

@anuraaga
Copy link

I agree with @shovnik in that it's best not to go too far if something isn't idiomatic in the language, or would add overhead. The recommendation is SHOULD NOT to reflect this expectation I think.

@lzchen
Copy link
Contributor

lzchen commented Sep 20, 2020

This issue might be related. I think either approach of setting the properties to private (with a getter/setter) or trusting the user with how they interact with the API + include detailed documentation is fine. The important thing to ensure is that we are consistent with our approach, and not have a mixture of these two in our codebase.

@shovnik
Copy link
Contributor

shovnik commented Sep 21, 2020

If the spec states that the SDK should not provide access the a Span's attributes besides the SpanContext, wouldn't making the properties private (not sure if fully private is possible) and adding a getter would provide access to the attributes anyways? I might be misunderstanding the first approach you suggested @lzchen .

@lzchen
Copy link
Contributor

lzchen commented Sep 21, 2020

@shovnik
I was referring to the concept of "enforcing users to interact with our API" rather than doing whatever they want. There are several places in which this happens (calling Tracer() directly instead of trace.get_tracer(), accessing span.attributes directly instead of (potentially) span.get_attributes(), and in your case, simply calling span.attributes. It is difficult to prevent users from actually doing this, the question is whether or not we want to "make it difficult" for them to do this (with whatever mechanism we come up with i.e. name mangling, private attributes with gettors/setters, adding hidden arguments like here to prevent instantiation) or we are fine with just explicitly documenting our API and telling users you MUST interact with Opentelemetry ONLY using our APIs, and trust that they use it properly. Probably something to bring up during the weekly SIG.

@lzchen
Copy link
Contributor

lzchen commented Oct 29, 2020

srikanthccv pushed a commit to srikanthccv/opentelemetry-python that referenced this issue Nov 1, 2020
@lzchen lzchen assigned shovnik and unassigned aabmass Dec 10, 2020
@shovnik
Copy link
Contributor

shovnik commented Dec 11, 2020

The spec requires adding a ReadOnly and ReadWrite span interface for use cases where the sdk needs to read back span attributes as specified in: https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/sdk.md#additional-span-interfaces. However, right now, if I add an underscore in front of all span attributes to discourage accessing (default span being WriteOnly), there will be cases in exporters and tests where span._attributes, span._resource or other private properties is accessed. I have always understood that properties starting with an underscore should not be accessed outside a class. Is that fine or should I be adding ReadOnly and ReadWrite span classes in this PR (seems out of scope).

@Oberon00
Copy link
Member

I am not sure anything needs to be done here for Python. IMHO it already implements the spec correctly: If you look at the API Span interface, there are no getters/properties/attributes that shouldn't be there. And the SDK span is the ReadWrite and Readable Span at the same time (which is totally allowed).

But I agree with @lzchen in #1001 (comment) that it is currently maybe a bit too easy and intuitive to do the wrong thing. E.g. doing for attribute in span.attributes will probably work currently, and because this is Python, you don't need to cast span to anything. I think one way of improving this would be to return a span from start that only has a single attribute _data and that _data is the ReadWrite span. All methods inside the returned span could then just delegate to _data. At places where a Readable or ReadWrite span is required in the spec (namely the SpanProcessor callbacks), you don't pass the returned span but the _data member.

@shovnik
Copy link
Contributor

shovnik commented Dec 14, 2020

I might be misunderstanding your suggestion, but wouldn't delegating all the span methods to its _data attribute require a second Span class anyways since the current Span implementation which will be stored in _data does not delegate these methods (would be recursive). Instead of having a WriteOnly(default), ReadOnly and ReadWrite Span, we would have a ReadWrite Span stored inside of every WriteOnly span.

On another note, would it be fine to just make all the span properties "private" by starting their names with an underscore, but disregarding this for any usage within the SDK or is that bad practice. That would probably be the simplest solution to use the Span as ReadWrite internally but make it clear that it is meant to be WriteOnly when used through the API since it only has a setter.

@Oberon00
Copy link
Member

I might be misunderstanding your suggestion, but wouldn't delegating all the span methods to its _data attribute require a second Span class anyways since the current Span implementation which will be stored in _data does not delegate these methods (would be recursive). Instead of having a WriteOnly(default), ReadOnly and ReadWrite Span, we would have a ReadWrite Span stored inside of every WriteOnly span.

I think you understood it correctly. I would most likely need two Span classes. Though the ReadWrite span does not need to inherit from Span, it would be nice if it did.

would it be fine to just make all the span properties "private" by starting their names with an underscore, but disregarding this for any usage within the SDK or is that bad practice.

That would be fine, but these properties are also needed outside the SDK: They must be available in user-implemented SpanProcessors and in exporters (which are not part of the SDK but separate components using the SDK).

@shovnik
Copy link
Contributor

shovnik commented Dec 15, 2020

To give users outside the SDK access to the now nested ReadWrite Span, wouldn't we have to add a new generate_read_write_span method to the span interface (which just returns _data? And if the default Span should now be considered ReadWrite, wouldn't we have to add a Setter to it as well?

Also, on a side note, wouldn't this approach make it harder to implement a ReadOnly Span or is that completely optional given we have a ReadWrite one.

@Oberon00
Copy link
Member

Oberon00 commented Dec 17, 2020

To give users outside the SDK access to the now nested ReadWrite Span, wouldn't we have to add a new generate_read_write_span method to the span interface

No, actually that is the thing we MUST NOT do. In all the places where the spec requires a Readable span or a Readwrite span, it is as a parameter of a callback method that is invoked by the SDK. So e.g. in the SDK's def end(self): for Span you could do self._span_processor.on_end(self._data) (pseudo code). So the conversion happens inside the SDK.

@shovnik
Copy link
Contributor

shovnik commented Dec 17, 2020

Alright, I think I understand the idea, thanks. I will make a PR for this moment I have some time.

@toumorokoshi
Copy link
Member

Reading through this, I'm not clear we need the additional complexity of a wrapper span.

If the goal is to reduce the likelyhood of direct attribute access, and to encourage users to blessed methods on some ReadWrite span interface, why not:

  • introduce getters methods on the sdk.Span
  • move all attributes to protected

Regardless some getter methods will have to be added to the Readable span. Otherwise you have to modify code to unpack and access private values anyway: https://github.com/open-telemetry/opentelemetry-python/pull/1501/files#diff-7cc84d16e9b6ed7a1d76f166033c24123f01d0f5045d3b770d69ff5ff4cc70daR407.

@Oberon00
Copy link
Member

The readable span can just expose the attributes directly like now.
Code modification should only be required for unit tests that read attributes from a write-only span.
See my comment above #1001 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good first issue help wanted release:required-for-ga To be resolved before GA release sdk Affects the SDK package.
Projects
None yet
7 participants