[RUM] Payload optimisation for the RUM agent #2854

Closed
7 tasks done
vigneshshanmugam opened this issue Oct 23, 2019 · 8 comments

@vigneshshanmugam
Member

vigneshshanmugam commented Oct 23, 2019

In the RUM agent, we are experimenting with optimising the payload that is sent to the APM server. So far we have built some prototypes, which you can find in this repo: https://github.com/vigneshshanmugam/rum-agent-payload

We plan to make these changes in two phases:

Phase 1

Phase 2

  • Since we don't have the flexibility of doing compression on the browser side, we are leaning towards shortening field names and optimising the structure of the payload even further. One example would be renaming transaction to t at the top level and constructing spans as tries.
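
To make the field-shortening idea concrete, here is a minimal sketch of a recursive key-renaming pass. Only the transaction to t rename comes from the example above; the other entries in `SHORT_NAMES` and the `shortenKeys` helper are hypothetical and are not the agent's actual mapping or implementation.

```typescript
// Hypothetical mapping: only "transaction" -> "t" is taken from this issue.
// The remaining entries are made up for illustration.
const SHORT_NAMES: { [longName: string]: string } = {
  transaction: "t",
  name: "n",
  duration: "d",
};

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Recursively rename known keys and leave unknown keys untouched.
function shortenKeys(value: Json): Json {
  if (Array.isArray(value)) {
    return value.map((item) => shortenKeys(item));
  }
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const key of Object.keys(value)) {
      const known = Object.prototype.hasOwnProperty.call(SHORT_NAMES, key);
      out[known ? SHORT_NAMES[key] : key] = shortenKeys(value[key]);
    }
    return out;
  }
  return value;
}

const original: Json = { transaction: { name: "page-load", duration: 120.5 } };
const compact = shortenKeys(original);

// Compare serialized sizes before and after renaming.
console.log(JSON.stringify(original).length, JSON.stringify(compact).length);
```

The saving grows with the number of repeated keys, which is why span-heavy payloads would likely benefit the most from this kind of renaming.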

We are working on a proposal doc at the moment, but the repo already contains some numbers for phase 1, which would already get close to a 50% improvement over the current release.

Update 12/03 (Juan)

  • We will shorten all fields that RUM can potentially use.
  • We won't change the format from NDJSON to JSON, but embed spans inside transactions, or build logic around order of events instead.
  • We possibly won't pursue the trie approach (it is a bit complicated, unfortunately).
  • This can't be done in a backwards-compatible way on the server side.

Update 17/04 (Juan)

Recap of some relevant design decisions taken:

  • All marks and metricset fields are shortened. Any field not in the spec will be ignored (not indexed, but not rejected either).

  • Spans are embedded inside transactions. trace_id and transaction_id have been removed from spans. Instead of parentId, there is a "parentIndex" (shortened to pi) which refers to the index of the parent span in the span list. If missing, the parent is the transaction.
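
As a rough sketch of the embedding described above: the function below takes a v2-style flat span list and nests it inside the transaction, replacing each span's parent id with a `pi` index. Only `pi`, the removal of trace_id and transaction_id, and the "missing means the transaction is the parent" rule come from this issue; the interfaces, the `embedSpans` helper, and the other field names are assumptions for illustration, not the server's actual schema.

```typescript
// Hypothetical v2-style span: carries its own trace and transaction ids.
interface FlatSpan {
  id: string;
  parent_id: string;
  trace_id: string;
  transaction_id: string;
  name: string;
}

// Embedded span: ids of the enclosing transaction and trace are dropped,
// and the parent is referenced by its position in the span list.
interface EmbeddedSpan {
  id: string;
  name: string;
  pi?: number; // parentIndex; absent when the parent is the transaction
}

interface Transaction {
  id: string;
  trace_id: string;
  name: string;
}

function embedSpans(transaction: Transaction, spans: FlatSpan[]) {
  // Map each span id to its index in the list.
  const indexById: { [id: string]: number } = Object.create(null);
  spans.forEach((span, index) => {
    indexById[span.id] = index;
  });

  const embedded = spans.map((span) => {
    const out: EmbeddedSpan = { id: span.id, name: span.name };
    const parentIndex = indexById[span.parent_id];
    // If the parent is not another span, omit pi: the transaction is the parent.
    if (parentIndex !== undefined) {
      out.pi = parentIndex;
    }
    return out;
  });

  return { ...transaction, spans: embedded };
}

const pageLoad: Transaction = { id: "tx1", trace_id: "tr1", name: "page-load" };
const flatSpans: FlatSpan[] = [
  { id: "s1", parent_id: "tx1", trace_id: "tr1", transaction_id: "tx1", name: "fetch" },
  { id: "s2", parent_id: "s1", trace_id: "tr1", transaction_id: "tx1", name: "parse" },
];

console.log(JSON.stringify(embedSpans(pageLoad, flatSpans)));
```

Because the trace and transaction ids appear once on the transaction instead of once per span, the per-span overhead shrinks to a small integer at most.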

Issues

@vigneshshanmugam
Member Author

vigneshshanmugam commented Nov 13, 2019

After our discussion, we ran an experiment that keeps the NDJSON format as is and applies only field-level compression to the payload. The wins are pretty similar. Check the results here: https://github.com/vigneshshanmugam/rum-agent-payload#alternative-strategy

@jalvz
Contributor

jalvz commented Feb 27, 2020

Sub-tasks: #3404, #3403

@jalvz
Contributor

jalvz commented May 26, 2020

All done! Let me know if something else comes up.

@jalvz jalvz closed this as completed May 26, 2020
@zube zube bot added [zube]: Done and removed [zube]: Meta labels May 26, 2020
@axw axw removed the [zube]: Done label May 27, 2020
@bmorelli25
Member

@jalvz / @vigneshshanmugam

What documentation updates should be considered for this change? For example, I'm looking at our Events intake API docs and notice that we still reference v2 of the RUM events endpoint. Should I update that to v3?

What about the example request body? Should we include https://github.com/elastic/apm-server/blob/master/docs/data/intake-api/generated/rum_v3_events.ndjson?

Let me know if there's anything else I'm missing and I'll open a PR.

@vigneshshanmugam
Member Author

@bmorelli25 The RUM agent has a configuration option, apiVersion, which can be set to 3 to send events to the v3 Intake API. https://www.elastic.co/guide/en/apm/agent/rum-js/current/configuration.html#api-version

Maybe we can link to the RUM documentation and include a side-by-side comparison of how the fields are compressed? Just an idea, though; alternatively, we could leave the docs unchanged and keep the request body as is.

@bmorelli25
Member

Maybe we can link to the RUM documentation and include a side-by-side comparison of how the fields are compressed?

I like that idea. How would I be able to generate that side-by-side comparison? I only see a generated/rum_v3_events.ndjson file.

@axw
Member

axw commented Jul 21, 2020

FWIW, I don't think it's particularly important to fully define the RUM v3 API in our docs. Maybe I'm wrong, but I think the likelihood of external folks developing against that API rather than the existing one is pretty minimal.

@vigneshshanmugam
Member Author

How would I be able to generate that side-by-side comparison? I only see a generated/rum_v3_events.ndjson file.

Full mappings from V2 to V3: https://github.com/elastic/apm-server/blob/master/model/modeldecoder/field/rum_v3_mapping.go#L20-L125

I agree with @axw's comment; the likelihood of external folks developing against it would be pretty minimal. But we can also link to the V3 spec mappings to let others know that this option exists for reducing bandwidth consumption.
