Shorten field names in the RUM V3 spec #3414
be26e17
to
681a56f
Compare
docs/spec/rum_v3_context.json
Outdated
}
}
},
"framework": {
We might start sending these fields, since we already have framework-specific information. The main reason they are not sent right now is payload size. Can we also shorten these fields?
docs/spec/rum_v3_context.json
Outdated
"type": ["string", "null"],
"maxLength": 1024
},
"runtime": {
These could be prefilled from the user agent, since that is the runtime for the RUM agent.
docs/spec/spans/rum_v3_span.json
Outdated
"type": ["boolean", "null"],
"description": "Indicates whether the span was executed synchronously or asynchronously."
}
},
"required": ["duration", "name", "type", "id","trace_id", "parent_id"]
"required": ["d", "n", "t", "id","trace_id", "parent_id"]
trace_id and parent_id must be optional, right? Or will that be part of a separate PR?
Yeah, that is a job for another PR :)
Generally looks good @jalvz. This doesn't really belong here, but can we make this v3 spec backwards compatible? Newer versions of APM Server would then be able to understand both the shortened and the unshortened fields (if users are using old versions of the agent).
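One way to picture the backwards-compatibility question is a lookup that accepts either spelling of a field. The sketch below is illustrative only: `shortNames` and `lookup` are hypothetical names, not APM Server code, and the table contains just the three pairs visible in the spec diff of this PR.

```go
package main

import "fmt"

// shortNames is an illustrative long-to-short table. Only these three
// pairs appear in the spec diff of this PR; the real mapping is larger.
var shortNames = map[string]string{"duration": "d", "name": "n", "type": "t"}

// lookup (hypothetical helper) returns the value stored under the short
// key when present, and otherwise falls back to the long key.
func lookup(event map[string]interface{}, longKey string) (interface{}, bool) {
	if short, ok := shortNames[longKey]; ok {
		if v, found := event[short]; found {
			return v, true
		}
	}
	v, found := event[longKey]
	return v, found
}

func main() {
	newAgent := map[string]interface{}{"n": "GET /", "d": 12.5}
	oldAgent := map[string]interface{}{"name": "GET /", "duration": 12.5}
	for _, ev := range []map[string]interface{}{newAgent, oldAgent} {
		name, _ := lookup(ev, "name")
		fmt.Println(name) // same value for both payload shapes
	}
}
```

The cost of this shape is a second map access for every field sent by an older agent, which is one reason a per-endpoint flag may be preferable.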
Here are some fields not shortened so far:
You mentioned framework, what about the rest? I am quite sure that some can be dropped, but wanted to hear from you.
Fields that RUM would never use in the foreseeable future.
Needs changes
// Might be used by Server when sourcemaps are present
Awesome, thanks! I'll investigate the
08ead1c
to
4090f73
Compare
Codecov Report
@@ Coverage Diff @@
## master #3414 +/- ##
==========================================
+ Coverage 79.29% 79.41% +0.12%
==========================================
Files 109 109
Lines 5751 5772 +21
==========================================
+ Hits 4560 4584 +24
+ Misses 1191 1188 -3
4090f73
to
7854c89
Compare
processor/stream/test_approved_es_documents/testIntakeRUMV3Errors.approved.json
Outdated
},
"id": "ec2e280be8345240",
"marks": {
"a": {
I'll create a ticket to define these in a follow-up PR.
schema: er.RUMV3Schema,
modelDecoder: er.DecodeRUMV3Event,
},
},
metadataSchema: metadata.RUMV3ModelSchema(),
The metricset model is missing here; I'll file a ticket for that to follow up in a separate PR.
The Intake JSON schema defines all of the fields that will be processed (with the exception of custom, which is not indexed). We explicitly allow additional fields to keep future versions of the agents compatible with older server versions. If additional fields are sent, they are neither processed nor indexed. With these changes, the Intake JSON schema for RUM validates only a subset of the fields that are processed, because some fields were removed from the spec while all Intake endpoints still share the same model logic. Since the RUM endpoints are not protected, nothing prevents someone from sending fields that are processed but not validated against the JSON schema.
This might be a rather theoretical concern, so I am not asking for any related changes in this PR, but would like to get a discussion going about possible implications.
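To make the concern concrete, here is a minimal sketch of how one might list the keys of a payload that a schema does not declare. Everything in it is an assumption for illustration: `schemaFields` stands in for the declared properties, `unvalidatedKeys` is a hypothetical helper, and `db` is used as an example of a field dropped from the RUM spec in this PR.

```go
package main

import (
	"fmt"
	"sort"
)

// schemaFields is a stand-in for the properties a schema declares.
// Assumption: the real RUM v3 spec declares many more fields.
var schemaFields = map[string]bool{"id": true, "n": true, "t": true, "d": true}

// unvalidatedKeys (hypothetical helper) returns, in sorted order, the
// top-level keys of payload that the schema does not declare. When
// additional properties are allowed, such keys pass validation unchecked.
func unvalidatedKeys(payload map[string]interface{}) []string {
	keys := []string{}
	for k := range payload {
		if !schemaFields[k] {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	return keys
}

func main() {
	span := map[string]interface{}{
		"id": "abc",
		"n":  "query",
		"db": map[string]interface{}{"statement": "SELECT 1"},
	}
	fmt.Println(unvalidatedKeys(span)) // [db]
}
```

If the shared model logic still decodes a key reported here, that key is processed without having been schema validated, which is the gap described above.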
model/fields/rum_v3_mapping.go
Outdated
}
return s
}
}
How about checking shortFieldNames when creating the function rather than on every field name call? Something like:

	func Mapper(shortFieldNames bool) func(string) string {
		if shortFieldNames {
			return func(s string) string {
				if shortField, ok := rumV3Mapping[s]; ok {
					return shortField
				}
				return s
			}
		}
		return func(s string) string {
			return s
		}
	}
I think it is more complex and the performance impact is negligible. I can benchmark it if you think it is worth it, though.
I expect it to be only a couple of nanoseconds difference per call when shortFieldNames == false, but it is called for every decoded field, which is potentially a couple of hundred calls for larger events. This won't make a big difference, especially as no allocations are involved, but the suggested solution is fairly straightforward, so I would rather go with that and avoid unnecessary access of the map.
All things being equal ("equal" meaning ±200ns), I would rather choose the simplest implementation, if you don't mind (or, put another way, avoid unnecessary complexity).
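For readers following this thread, the two shapes under discussion can be placed side by side in a standalone sketch. The mapping table holds only the three pairs visible in the spec diff (the real table is larger), and `mapPerCall` is a hypothetical name for a variant that checks the flag on every lookup; it is not necessarily what the PR implements.

```go
package main

import "fmt"

// rumV3Mapping maps long field names to short ones. Only these three
// pairs are taken from the spec diff in this PR.
var rumV3Mapping = map[string]string{
	"duration": "d",
	"name":     "n",
	"type":     "t",
}

// mapPerCall (hypothetical name) checks the flag on every lookup.
func mapPerCall(shortFieldNames bool, s string) string {
	if shortFieldNames {
		if shortField, ok := rumV3Mapping[s]; ok {
			return shortField
		}
	}
	return s
}

// Mapper checks the flag once and returns a closure, as suggested in
// the review comment above.
func Mapper(shortFieldNames bool) func(string) string {
	if shortFieldNames {
		return func(s string) string {
			if shortField, ok := rumV3Mapping[s]; ok {
				return shortField
			}
			return s
		}
	}
	return func(s string) string { return s }
}

func main() {
	short, long := Mapper(true), Mapper(false)
	fmt.Println(short("duration"), long("duration"), short("trace_id")) // d duration trace_id
	fmt.Println(mapPerCall(true, "name"), mapPerCall(false, "name"))     // n name
}
```

Both variants return the same results; they differ only in whether the flag is evaluated once at construction or once per field, so the choice comes down to the readability versus per-call cost trade-off debated above.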
@@ -16,7 +16,7 @@
"type": ["string", "null"],
Have you thought about creating a dedicated docs/spec/rum_v3/ folder that contains all the specs? Not a requirement if you prefer to keep it as is, but I think it would make it easier to track whether there are still references to unshortened specs, and generally easier to navigate.
I don't have a strong preference, but mixing 2 taxonomies could be more confusing... e.g. rum_v3_transactions could still fit in either transactions or rum_v3, not sure why one would make more sense over the other...
It's the first time the endpoint and version are part of the spec name, therefore I think the suggested grouping makes sense. Just a suggestion though, up to you how to finally organize it.
model/metadata/service.go
Outdated
if input == nil || err != nil {
return nil, err
}
raw, ok := input.(map[string]interface{})
if !ok {
return nil, errors.New("invalid type for service")
}
field := fields.Mapper(hasShortFieldNames)
You could initialize the decoder with the mapping information, decoder(fields.Mapper(hasShortFieldNames)), allowing you to abstract away the mapper inside the decoding.
I tried that, but there are a lot of small functions (decodeHTTP, decodeService, decodeStacktrace, etc.) that would need to be adapted to be methods instead, meaning that they would not be reusable (e.g. duplicated metadata.DecodeService, transaction.DecodeService, span.DecodeService, etc.).
In the end I thought it was not worth it, and this produced the smallest diff.
processor/stream/test_approved_es_documents/testIntakeRUMV3Errors.approved.json
So, for my understanding, what are those implications?
model/context.go
Outdated
@@ -125,7 +125,7 @@ func DecodeContext(input interface{}, cfg Config, err error) (*Context, error) {
}

decoder := utility.ManualDecoder{}
field := fields.Mapper(cfg.HasShortFieldNames)
field := field.Mapper(cfg.HasShortFieldNames)
The var field is shadowing the package name now. How about changing the variable to mapper?
Yeah... amended
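The shadowing issue can be reproduced in isolation. Since a single file cannot define a second package, the sketch below uses the standard library's strings package as a stand-in for the PR's own package, and `shorten` is a hypothetical function written only for this illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// shorten uses the standard "strings" package as a stand-in for the
// PR's own package, which cannot be reproduced in a single file.
func shorten(s string) string {
	// Naming this variable "strings" would shadow the package for the
	// rest of the function, so the strings.ToUpper call below would no
	// longer compile. A distinct name such as "mapper" avoids that.
	mapper := strings.NewReplacer("duration", "d")
	return strings.ToUpper(mapper.Replace(s))
}

func main() {
	fmt.Println(shorten("duration")) // D
}
```

The shadowing compiles as long as the package is not used again in the same scope, which is why it tends to surface later, when someone adds a second call into the package.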
Created #3481 to bundle the discussion started above and not block this PR.
f55c541
to
5480fb2
Compare
ec85678
to
efe8ca1
Compare
Some pending decisions:
- What to do with fields not currently used by RUM (leave them / remove them / shorten them anyway). I'll come up with the list of fields, and also upload a file with the mappings.
- What to spec in marks.json, and whether to make a separate spec for RUM V3.
Fields removed in this PR (not used by RUM):
response.finished
response.headers_sent
request.body
request.socket
request.cookies
ephemeral_id
service.node
span.db
stacktrace.vars
stacktrace.library_frame
Closes #3403