Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spike: web performance #13299

Closed
wants to merge 85 commits into from
Closed

spike: web performance #13299

wants to merge 85 commits into from

Conversation

pauldambra
Copy link
Member

@pauldambra pauldambra commented Dec 13, 2022

Problem

see RFC

Users want to see network requests and performance facts alongside other information

Changes

  • adds a schema to store un-aggregated performance facts
    • this is intended to be queried by session id and/or pageview id and return time ordered facts for that id combination
    • in future we can build aggregating materialized views from this to provide dashboards for performance info
  • updates the session recording player to show these network logs alongside recordings
    • this uses spark lines to show relative amount of wall time taken by requests
  • adds an ingestion route for new the $performance_event events

works with

todo

How did you test this code?

locally only so far

@pauldambra
Copy link
Member Author

pauldambra commented Dec 13, 2022

migrations were silently failing because I was ordering by mis-spelled column names 🤷

posthog/performance/schema.py Outdated Show resolved Hide resolved
posthog/performance/schema.py Outdated Show resolved Hide resolved
PERFORMANCE_EVENTS_TABLE_SQL = (
PERFORMANCE_EVENTS_TABLE_BASE_SQL
+ """PARTITION BY toYYYYMM(origin_timestamp)
ORDER BY (team_id, toDate(origin_timestamp), session_id, pageview_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Problems:

  1. Looking up (only) via pageview_id with this ORDER BY is really really expensive.
  2. Looking up via session_id requires inspecting more granules than needed.

Re 1, could we always include session_id in the where clause when looking up by pageview_id?

Re 2, could we also add a filter on origin_timestamp to all the point queries?

posthog/performance/schema.py Outdated Show resolved Hide resolved
posthog/performance/schema.py Outdated Show resolved Hide resolved
posthog/performance/schema.py Outdated Show resolved Hide resolved
} finally {
clearTimeout(timeout2)
}
}
} else if (data['event'] === '$performance_event') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sending these events in via the main topic is not a good idea or it sharing the rest of the ingestion pipeline.

Similar to session recordings, having ingesting these events be batched/buffered/create persons/update properties/be processed by plugins touches a very complex part of an application you don't want to be touching. While adding this here speeds you up short-term you're creating problems for yourself and #team-pipeline as soon as it needs to get untangled.

Also the capture endpoint has custom partitioning logic that's not relevant for performance event.

This needs a rethink: I suggest removing it from this PR or putting it behind a flag completely, with the understanding that this will be reviewed by the proper team.

posthog/performance/sql.py Outdated Show resolved Hide resolved
@pauldambra pauldambra changed the title spike: CH schema for web performance spike: web performance Dec 14, 2022
@@ -157,6 +157,7 @@ def opt_slash_path(route: str, view: Callable, name: Optional[str] = None) -> UR
opt_slash_path("capture", capture.get_event),
opt_slash_path("batch", capture.get_event),
opt_slash_path("s", capture.get_event), # session recordings
opt_slash_path("p", capture.get_event), # performance events
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Completely flyby and could warrant its own RFC, but we're growing a pretty big list of urls that point to the same endpoint. We'll need to update the ingress rules and the middleware short circuit to account for this new url.

For operational simplicity (and, anecdotally, for routing performance), could we agree on consolidating all new intake urls into a common /i/ path (/p -> /i/p), so than we can easily send every /i/* request to the event pool's get_event and call it a day?

Longer term, I'd like us to move the existing endpoints too, but that's RFC material for later (how to coordinate SDK and backend releases, although nginx rewrite rules can help address the long tail).

@posthog-bot
Copy link
Contributor

This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in another week.

@benjackwhite
Copy link
Contributor

Closing as all the code is implemented in other PRs now 👍

@benjackwhite benjackwhite deleted the feat/web-performance-schema branch January 4, 2023 10:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants