spike: web performance #13299
migrations were silently failing because I was ordering by mis-spelled column names 🤷
…e-schema' of github.com:PostHog/posthog into feat/web-performance-schema
posthog/performance/schema.py
PERFORMANCE_EVENTS_TABLE_SQL = (
    PERFORMANCE_EVENTS_TABLE_BASE_SQL
    + """PARTITION BY toYYYYMM(origin_timestamp)
ORDER BY (team_id, toDate(origin_timestamp), session_id, pageview_id)
Problems:
1. Looking up (only) via pageview_id with this ORDER BY is really, really expensive.
2. Looking up via session_id requires inspecting more granules than needed.
Re 1, could we always include session_id in the where clause when looking up by pageview_id?
Re 2, could we also add a filter on origin_timestamp to all the point queries?
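A minimal sketch of what a point query following both suggestions could look like. The table name `performance_events`, the function name, and the one-day window are assumptions for illustration, not taken from the PR; only the column names come from the ORDER BY above. The intent is that every lookup by pageview_id also carries team_id, session_id, and a bound on origin_timestamp, so the filter lines up with the leading sorting-key columns.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, Tuple

# Assumed table name; the diff above only shows the ORDER BY key.
TABLE = "performance_events"


def build_pageview_lookup(
    team_id: int,
    session_id: str,
    pageview_id: str,
    seen_at: datetime,
    window: timedelta = timedelta(days=1),  # assumed default, tune as needed
) -> Tuple[str, Dict[str, Any]]:
    """Build a point query that filters on every column of the sorting key.

    Returns the SQL text with pyformat-style placeholders and the
    parameter dict to pass alongside it.
    """
    sql = (
        f"SELECT * FROM {TABLE} "
        "WHERE team_id = %(team_id)s "
        # Time bound: meant to allow pruning on the partition and date key.
        "AND origin_timestamp >= %(date_from)s "
        "AND origin_timestamp < %(date_to)s "
        # session_id is always present, even when looking up by pageview_id.
        "AND session_id = %(session_id)s "
        "AND pageview_id = %(pageview_id)s"
    )
    params = {
        "team_id": team_id,
        "date_from": seen_at - window,
        "date_to": seen_at + window,
        "session_id": session_id,
        "pageview_id": pageview_id,
    }
    return sql, params
```

The caller needs some approximate timestamp for the pageview (for example, the session start) to supply `seen_at`; how wide the window should be is a judgment call that depends on how long sessions can run.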
        } finally {
            clearTimeout(timeout2)
        }
    }
} else if (data['event'] === '$performance_event') {
Sending these events in via the main topic, or having them share the rest of the ingestion pipeline, is not a good idea.
Similar to session recordings, having these events be batched, buffered, create persons, update properties, and be processed by plugins touches a very complex part of the application that you don't want to be touching. While adding this here speeds you up short-term, you're creating problems for yourself and #team-pipeline as soon as it needs to get untangled.
Also, the capture endpoint has custom partitioning logic that's not relevant for performance events.
This needs a rethink: I suggest removing it from this PR or putting it behind a flag completely, with the understanding that this will be reviewed by the proper team.
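One way to picture the separation being asked for is a sketch like the following, where performance events never reach the main topic and are only produced at all when a flag is on. The topic names, the function name, and the `performance_enabled` flag are all hypothetical; only the `$performance_event` name comes from the diff above, and the real design would be up to the reviewing team.

```python
from typing import Any, Dict, Optional

MAIN_TOPIC = "events_main"                # hypothetical topic name
PERFORMANCE_TOPIC = "performance_events"  # hypothetical topic name
PERFORMANCE_EVENT = "$performance_event"  # event name from the diff above


def topic_for(event: Dict[str, Any], performance_enabled: bool) -> Optional[str]:
    """Pick the topic an event should be produced to.

    Performance events get a dedicated topic so they skip the main
    ingestion pipeline; with the flag off they are dropped (None).
    """
    if event.get("event") == PERFORMANCE_EVENT:
        return PERFORMANCE_TOPIC if performance_enabled else None
    return MAIN_TOPIC
```

With the flag off, `topic_for({"event": "$performance_event"}, False)` returns `None`, so nothing is produced and the main pipeline is untouched.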
@@ -157,6 +157,7 @@ def opt_slash_path(route: str, view: Callable, name: Optional[str] = None) -> UR
     opt_slash_path("capture", capture.get_event),
     opt_slash_path("batch", capture.get_event),
     opt_slash_path("s", capture.get_event),  # session recordings
     opt_slash_path("p", capture.get_event),  # performance events
Completely flyby and could warrant its own RFC, but we're growing a pretty big list of urls that point to the same endpoint. We'll need to update the ingress rules and the middleware short circuit to account for this new url.
For operational simplicity (and, anecdotally, for routing performance), could we agree on consolidating all new intake urls into a common /i/ path (/p -> /i/p), so that we can easily send every /i/* request to the event pool's get_event and call it a day?
Longer term, I'd like us to move the existing endpoints too, but that's RFC material for later (how to coordinate SDK and backend releases, although nginx rewrite rules can help address the long tail).
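To illustrate why a shared prefix keeps things simple, here is a small sketch of the single check that ingress rules or a middleware short circuit would need. Both helper names are hypothetical and this is not the project's routing code; the only thing taken from the comment is the /p -> /i/p mapping and the /i/* match.

```python
INTAKE_PREFIX = "/i/"  # common prefix proposed in the comment above


def to_intake_path(short_path: str) -> str:
    """Map a short intake path such as "/p" to its prefixed form "/i/p"."""
    name = short_path.strip("/")
    if not name:
        raise ValueError("intake path needs a name, e.g. '/p'")
    return INTAKE_PREFIX + name


def is_intake_path(path: str) -> bool:
    """True for any /i/* request, with or without a trailing slash."""
    return path.startswith(INTAKE_PREFIX) and path.strip("/") != "i"
```

New intake urls then need no extra routing entries: anything matching `is_intake_path` can be sent to the same handler, while paths like `/p` or `/inbox` do not match.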
# Conflicts: # frontend/src/lib/constants.tsx # frontend/src/loadPostHogJS.tsx
This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the |
Closing as all the code is implemented in other PRs now 👍
Problem
see RFC
Users want to see network requests and performance facts alongside other information
Changes
works with PerformanceObserver to capture events
todo
- update ingress rules https://github.com/PostHog/charts-clickhouse/blob/2cd44dd84d7913255b6c53e863a6941f1dda063f/charts/posthog/templates/ingress.yaml
- update middleware shortcircuiting: feat(capture): short-circuit most middlewares before event capture #13319
How did you test this code?
locally only so far