Storing session data on Resource #2500

martinkuba · 2022-04-19T13:49:52Z

The client-side SIG is working on adding the concept of sessions. Sessions correlate signals from the same client / user over a certain period of time.

We have considered two options for attaching session data:

attributes on resource
attributes on individual signals

We are leaning towards resource attributes because the session values are mostly the same for every signal. It is important in client-side applications to minimize the size of the payload.
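As a rough illustration of the payload argument, the sketch below compares the encoded size of a batch when session data is written once on the resource and when it is repeated on every span. It uses JSON rather than OTLP, and the `session.start` key and all function names are assumptions made for this example, so the absolute numbers mean little; only the growth pattern is the point.

```python
import json

# Hypothetical session attributes; "session.start" is an illustrative key only.
SESSION = {"session.id": "a1b2c3d4e5f6", "session.start": "2022-04-19T13:49:52Z"}


def payload_with_resource(n_spans):
    # Session data is written once, in the envelope shared by every span.
    return {
        "resource": dict(SESSION),
        "spans": [{"name": f"span-{i}", "attributes": {}} for i in range(n_spans)],
    }


def payload_per_signal(n_spans):
    # Session data is repeated on every span.
    return {
        "resource": {},
        "spans": [
            {"name": f"span-{i}", "attributes": dict(SESSION)} for i in range(n_spans)
        ],
    }


def encoded_size(payload):
    # Size in bytes of the JSON encoding (a stand-in for the wire format).
    return len(json.dumps(payload).encode("utf-8"))
```

With a single span the two layouts encode to the same size; the per-signal layout grows by one copy of the session attributes for every additional span.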

The challenge with using resource attributes is that sessions can change during the lifetime of the application, e.g. a session can be started or ended while the application is running.

We are wondering what it would take to update the spec to allow updating or replacing the resource. This has been discussed at length in this issue (#1298).

yurishkuro · 2022-04-19T14:13:37Z

I disagree that session info belongs to the resource. Efficiency considerations should not override proper data modeling decisions. Resource is meant to identify the service, which doesn't change while that service handles many user sessions.

Conceptually user session belongs at the trace level, which we do not have in the OTEL span model (some systems do support trace level attributes), and it's customary to place those on the root span.

tedsuo · 2022-04-20T16:25:28Z

I agree that trace level attributes would be a good addition to OTel, but to me session IDs still look like they are resources:

A user session applies to all telemetry, not just traces. Session IDs also apply to logs/events, metrics, etc. Not all of that telemetry will be encapsulated in a trace.
Since resources act as an envelope containing the attributes which are common to all telemetry emitted in a batch, it makes sense for Session ID to be a resource.

There are also some practical implications.

There is no "root trace" or "root context" to attach a trace-level session ID to.
In practice, not setting the session as a resource forces every piece of instrumentation which starts a trace or creates a log to become "session aware" which complicates all of our instrumentation.
Creating span and log processors which attach the session ID as an attribute to every object looks like a very poor re-implementation of resources – this information belongs in an envelope which applies to all telemetry currently being emitted.

Beyond just sessions, we've also discovered that clients may have other resources which can change over time - timezone and other location data, language preference, etc. Mobile and desktop clients are rarely rebooted; instead they are put to sleep and later re-awaken in an environment where some settings may have changed. We did not consider these client-specific issues when we originally defined how resources should work.

We've discovered all of the above issues while attempting to create a model for client instrumentation/RUM, which is why we're proposing updatable resources as a solution. The Client/RUM SIG can create an OTEP to further explain how an updatable ResourceProvider would solve these issues. But it would be good to understand what side effects this change could create, and how we can mitigate them.

BTW on a related note, I am also seeing a need to have a stable resource attribute for identifying a service/sdk instance. Currently there is no required attribute which would serve this purpose – service.instance.id is optional and telemetry.sdk.id does not exist. A stable instance ID which is always present would work better for identifying individual services than our current practice of saying all resources must be immutable for the life of a service.

yurishkuro · 2022-04-20T16:39:37Z

Fair enough. However, rather than proposing a change to the spec, I recommend starting with a sample implementation and trying out different approaches. There are different ways this can be achieved:

mutable resources (could mean significant implications to collection)
replaceable resources in the Tracer/etc
reinitializable Tracer/etc
an alternative path of exporting resources
...

tedsuo · 2022-04-22T00:35:50Z

Yes I totally agree we need to prototype as part of making this proposal! We have a prototype implementation in the works, taking the following approach:

Extract resources into a ResourceProvider
Tracers and other telemetry generators now call ResourceProvider.resources() when starting spans, creating logs, etc.
ResourceProvider.update() creates a new resource set. Future calls to ResourceProvider.resources() now return this new set. This does not mutate existing resource sets.
Resource sets are still immutable. Resource sets never change once they are attached to spans, logs, etc, so they can still be stored and accessed as a thread-safe pointer.
Exporters already have to create multiple batches of data, sorted by resource set, due to the fact that multiple providers may be sharing the same Exporter. So nothing else in the SDK architecture needs to change, beyond the addition of the ResourceProvider concept.
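The steps above can be sketched roughly as follows. This is not the prototype mentioned in the comment and does not use any OpenTelemetry SDK API; `ResourceProvider`, `Tracer`, and `batch_by_resource` are hypothetical stand-ins that only mirror the described behavior: `update()` swaps in a new immutable set, spans keep whichever set was current when they started, and the exporter groups by resource set.

```python
from collections import defaultdict
from types import MappingProxyType


class ResourceProvider:
    """Hypothetical holder of the current immutable resource set."""

    def __init__(self, attributes):
        self._current = MappingProxyType(dict(attributes))

    def resources(self):
        return self._current

    def update(self, changes):
        # Build a new read-only set; the previous one is left untouched,
        # so spans that already reference it are unaffected.
        self._current = MappingProxyType({**self._current, **changes})


class Tracer:
    """Hypothetical tracer that asks the provider for resources per span."""

    def __init__(self, provider):
        self._provider = provider

    def start_span(self, name):
        return {"name": name, "resource": self._provider.resources()}


def batch_by_resource(spans):
    # Group spans whose resource sets have identical contents.
    batches = defaultdict(list)
    for span in spans:
        key = tuple(sorted(span["resource"].items()))
        batches[key].append(span)
    return list(batches.values())
```

A span started before `update()` keeps the old set, a span started afterwards sees the new one, and exporting both yields two batches.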

Once we have a ResourceProvider, it's easy enough to create a SessionManager that updates the resources whenever the session changes. The same approach can be taken to managing other resources that need to be updated when a mobile client reawakens.
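A minimal sketch of such a session manager, under stated assumptions: the `ResourceProvider` here is a simplified stand-in, `SessionManager.touch` and the inactivity timeout are invented for illustration, and the clock value is passed in explicitly so the behavior is easy to follow.

```python
import uuid


class ResourceProvider:
    """Minimal stand-in for the provider described above (hypothetical)."""

    def __init__(self, attributes):
        self._current = dict(attributes)

    def resources(self):
        # Return a copy so callers cannot change the current set.
        return dict(self._current)

    def update(self, changes):
        self._current = {**self._current, **changes}


class SessionManager:
    """Hypothetical manager that rotates session.id after inactivity."""

    def __init__(self, provider, timeout_s=900.0):
        self._provider = provider
        self._timeout_s = timeout_s
        self._last_activity = None

    def touch(self, now):
        # Call on user activity; `now` is a timestamp in seconds.
        expired = (
            self._last_activity is None
            or now - self._last_activity > self._timeout_s
        )
        if expired:
            self._provider.update({"session.id": uuid.uuid4().hex})
        self._last_activity = now
```

Activity within the timeout keeps the same `session.id`; the first activity after a longer gap replaces it, and every span or log created afterwards picks up the new value through the provider.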

The ResourceProvider seems like an effective solution with a limited "blast radius" in terms of what parts of Otel are affected, so we are going to propose this as part of adding RUM support to OTel.

We actually plan on creating three OTEPs, all with prototypes and examples. Most of this info is not relevant to the current issue, but just as an FYI, the three OTEPs will be:

OTEP for describing how the RUM concept should be defined within OpenTelemetry. It will point out that two things currently missing from OTel are a well-defined place to efficiently attach session information, and a well-defined way to record events as logs (since traces are not always present when these events happen).
OTEP for adding the ResourceProvider concept described above.
OTEP describing how events should be recorded as logs. We've already talked to the Log SIG about this so it won't be an out-of-the-blue proposal.

dyladan · 2022-04-26T12:24:55Z

Maybe the newly proposed instrumentation scope attributes are a more natural place for this: open-telemetry/oteps#201

Oberon00 · 2022-04-26T12:25:13Z

Please read the discussion at #1298: This shows that already just appending to the resource poses some interesting (but IMHO solvable) questions.
Changing/replacing values on the resource is IMHO semantically wrong.

Maybe you want to send something in OTLP at the level where currently there is only resource. But the resource concept itself is incompatible with mutability.

martinkuba · 2022-05-06T16:14:18Z

Would there be any challenges with specifying some resource attributes as identifying (immutable) and some as descriptive (mutable) (as discussed in #1298)? Are there any backends that actually use all resource attributes (e.g. by hashing) to determine identity?
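To make the question concrete, here is a small sketch of the difference between hashing every resource attribute and hashing only an identifying subset. Which keys count as identifying is an assumption made for this example, as are the names `IDENTIFYING_KEYS` and `identity_hash`; no backend is claimed to work this way.

```python
import hashlib
import json

# Hypothetical split: these keys identify the entity, everything else describes it.
IDENTIFYING_KEYS = frozenset({"service.name", "service.instance.id"})


def identity_hash(resource, keys=None):
    """Hash the selected attributes; with keys=None every attribute counts."""
    selected = {k: v for k, v in resource.items() if keys is None or k in keys}
    # Sort keys so the hash does not depend on insertion order.
    canonical = json.dumps(selected, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

A backend that hashes all attributes would see a new entity each time `session.id` changes, while one that hashes only the identifying subset would not.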

jsuereth · 2022-09-16T13:47:21Z

So even with a split of resource attributes between identifying and descriptive, the issue here is what a Resource means and what a session means. I still think you're looking for a (currently non-existent) scope/context-based attribute, where "scope" here means lexical scope, or scoped in context (not the same as InstrumentationScope).

Specifically -

When a browser session is created we can attach the session id to context and output it appropriately.
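One way to picture a context-scoped attribute is sketched below using Python's standard `contextvars` module. This is not an OpenTelemetry API; `emit_log` and `run_in_session` are hypothetical helpers that only show a session ID being attached to the active context and read back when telemetry is produced.

```python
import contextvars

# Context variable holding the active session ID, or None outside a session.
_session_id = contextvars.ContextVar("session_id", default=None)


def emit_log(body):
    # Hypothetical emitter: copies the session ID from context, if any.
    record = {"body": body, "attributes": {}}
    session_id = _session_id.get()
    if session_id is not None:
        record["attributes"]["session.id"] = session_id
    return record


def run_in_session(session_id, fn, *args):
    # Attach the session ID for the duration of fn, then restore the old value.
    token = _session_id.set(session_id)
    try:
        return fn(*args)
    finally:
        _session_id.reset(token)
```

Telemetry emitted inside `run_in_session` carries the session ID; telemetry emitted before or after it does not.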

Specifically, my main concern with the notion of descriptive attributes on resource vs. "session" is that session can change for the same device, so there's an element of needing to know when an attribute was live for this to work.

@tigrannajaryan and I are working on some refinement to Resource around identifying/non-identifying attributes. I still think this won't solve the client-side instrumentation issue.

For us to make progress, would you be willing to help us identify a few things?

1. Is the identity of a device / browser important or just the identity of the session? E.g. will metrics be generated for a specific device or just a specific session?
2. Does ALL telemetry emitted need the session labels, or just events (spans/logs)?
2a. If a session is a span of time w/ attached attributes, is that better modelled as a Span to which you 'link' telemetry?

t2t2 · 2022-09-16T19:45:43Z

Note: I also ended up writing a longer comment on why this really really should be included in resources instead of scopes, but I figured it's better in Ephemeral Resource Attributes otep (oteps#208) (as this issue really morphed into that otep) than here. I recommend reading that comment first since it'd give a lot of useful domain knowledge plus I'll refer parts of it again in this comment:

open-telemetry/oteps#208 (comment)

1: Yes. No. Depends. Maybe. Sometimes. Eh dunno

Since you're here from an issue that talks about metrics, let's say, very browser-specifically, web vitals were sent over metrics - here you'd have one value for the lifetime of a page, that describes that one page load. So if you want, quoting Tigran, "minimal globally unique identifier of the emitting entity", using service.instance.id would likely be the best default value. But for querying, getting avg/median/percentile values, you're gonna have so many different attributes to base it on.

When using a RUM...

Sometimes you'd want data from one opened page, sometimes experience during a session

Sometimes you want a specific user's experience, sometimes you want a group of users (eg. employees from one department)

Sometimes you want all users of one ISP, sometimes you want all users from a specific country

Sometimes everything

So anything usable for rum use cases would have more options for per-attribute filtering

2: Yeah sorry no escaping the longer comment for this one. tl;dr a case for yes. Especially when you have a UI that combines all 3 signals to show the experience of using a site - spans for http requests, logs for, well logs and errors, metrics for info like web vitals score.

2a: Ignoring that session.id use isn't limited to spans, I mean I guess if your systems don't mind linking to spans that likely never exist.

Session can easily cross multiple otel sdk instances (so for web a session id span would be made in one instance, used in N more, sometimes in multiple in parallel, and then ended in another instance, see italics note). Additionally, otel does not work well for really long spans like this. For one, a session end is quite frequently a while after any otel SDK was last active (user opens site, closes site, 15 minutes later session expires, if a ~~tree falls in a forest~~ session expires while nobody is observing, does it really expire?). Secondly, otel spans are only sent when the span ends -- if there's nothing to end the span, the span is never sent. Lastly, systems would probably need to ignore hours-long session spans so they don't think the slowest thing in your system is a 4 hour session.

martinkuba · 2024-04-23T16:32:26Z

Decision has been made to represent sessions as an attribute on all signals. Closing for now.
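For illustration only, attaching the session to every signal could look something like the sketch below. `SessionAttributeProcessor` and `on_emit` are hypothetical names and not the API of any OpenTelemetry SDK; records are plain dicts standing in for spans and logs.

```python
class SessionAttributeProcessor:
    """Hypothetical processor that stamps session.id on each record."""

    def __init__(self, get_session_id):
        # Callable returning the current session ID, or None if no session.
        self._get_session_id = get_session_id

    def on_emit(self, record):
        session_id = self._get_session_id()
        if session_id is not None:
            # Create the attributes dict if the record does not have one yet.
            record.setdefault("attributes", {})["session.id"] = session_id
        return record
```

Because the session ID is read each time a record is emitted, a session change takes effect for every later span or log without touching the resource.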

martinkuba added the spec:resource Related to the specification/resource directory label Apr 19, 2022

github-actions bot assigned yurishkuro Apr 19, 2022

jsuereth mentioned this issue Sep 7, 2022

Refine which attributes of Resource contribute to Metric Identity. #2775

Open

t2t2 mentioned this issue Sep 16, 2022

Ephemeral Resource Attributes open-telemetry/oteps#208

Closed

jmacd mentioned this issue Dec 14, 2023

Should event fields be part of the global attribute registry? open-telemetry/semantic-conventions#505

Closed

martinkuba closed this as completed Apr 23, 2024
