feat(cyclotron): Add metrics everywhere #25193

oliverb123 · 2024-09-24T23:49:16Z

Problem

I really want to be able to answer questions like "how many bytes are being flushed to PG per second, per worker/per queue?" and "how many updates are workers sending per flush (what's the flush batch size)?", and right now that's easy in rust land but hard in node land. A lot of the work here is laying the foundations for a stronger metrics reporting story in rust land generally (default labelling all metrics by service is a good start, and we can maybe even stop prefixing metrics at some point), and then also getting rust's metrics story and node's to play nicely together.

Part of a broader push to make our delivery engine as easy to operate as possible - next on my list is a simple web UI exposed by janitors for looking at failed jobs, pausing cleanup, that kind of thing (particularly important as our binary blobs become more opaque, e.g. if we add compression, Metabase becomes pretty useless) - but metrics first.

oliverb123 · 2024-09-24T23:51:36Z

This is ready for review but there's at least one open question about how labels are handled differently in rust and node land, that I'm simply not sure how to address - would appreciate input or suggestions if anyone's got them.

rust/cyclotron-node/examples/metrics.ts

benjackwhite · 2024-09-25T07:38:45Z

plugin-server/src/cdp/utils.ts

+
+        // The TS library seems to demand you declare labels up-front (for some reason?),
+        // but we have no good way of knowing the set of labels used up-front, so
+        // idk what to do here really - recreate the metric each time? That seems...


This would need to be tested, but it could be that the labels are just for typing, so if you just cast the label setting as any it might JustWork. Checked the lib source and it wasn't obvious either way if it would work though...

Rip... I'm gonna have to think about it a bit more than that, it seems. It's easy to handle this with counters and gauges (I can just delete and recreate), but histograms are a bit trickier.

error: "Error: Added label \"outcome\" is not included in initial labelset: []
    at validateLabel (/Users/olly/Documents/work/posthog/plugin-server/node_modules/.pnpm/[email protected]/node_modules/prom-client/lib/validation.js:20:10)
    at Counter.inc (/Users/olly/Documents/work/posthog/plugin-server/node_modules/.pnpm/[email protected]/node_modules/prom-client/lib/counter.js:23:4)
    at emitCyclotronMetrics (/Users/olly/Documents/work/posthog/plugin-server/src/cdp/utils.ts:408:21)
    at CdpProcessedEventsConsumer.processBatch (/Users/olly/Documents/work/posthog/plugin-server/src/cdp/cdp-consumers.ts:408:29)
    at CdpProcessedEventsConsumer._handleKafkaBatch (/Users/olly/Documents/work/posthog/plugin-server/src/cdp/cdp-consumers.ts:595:20)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at func (/Users/olly/Documents/work/posthog/plugin-server/src/cdp/cdp-consumers.ts:338:25)
    at runInstrumentedFunction (/Users/olly/Documents/work/posthog/plugin-server/src/main/utils.ts:48:24)
    at eachBatch (/Users/olly/Documents/work/posthog/plugin-server/src/cdp/cdp-consumers.ts:334:24)
    at startConsuming (/Users/olly/Documents/work/posthog/plugin-server/src/kafka/batch-consumer.ts:305:17)
    at Object.join (/Users/olly/Documents/work/posthog/plugin-server/src/kafka/batch-consumer.ts:385:13)"

Found the ever-classic 5 year old issue 🙃 siimon/prom-client#298
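One possible way around the "labels must be declared up-front" constraint, sketched here purely as an illustration: before registering anything on the node side, scan a snapshot of samples coming from rust and take the union of label keys per metric, then pad each sample so every declared label is present. The `MetricSample` shape, the function names, and the `cyclotron_jobs_flushed` metric name are all hypothetical (only the `outcome` label comes from the error above), and this does not help if a brand-new label shows up after the metric has already been registered. Padding with an empty string relies on Prometheus treating an empty label value as equivalent to the label being absent.

```typescript
// Hypothetical shape of one sample handed over from the rust side;
// the real cyclotron-node payload may look different.
interface MetricSample {
    name: string
    labels: Record<string, string>
    value: number
}

const hasOwn = (obj: object, key: string): boolean => Object.prototype.hasOwnProperty.call(obj, key)

// For each metric name, collect the sorted union of every label key seen in the snapshot.
// The result could be used as the up-front label declaration when creating the metric.
function collectLabelNames(samples: MetricSample[]): Record<string, string[]> {
    const byMetric: Record<string, string[]> = {}
    for (const sample of samples) {
        if (!hasOwn(byMetric, sample.name)) {
            byMetric[sample.name] = []
        }
        const names = byMetric[sample.name]
        for (const key of Object.keys(sample.labels)) {
            if (names.indexOf(key) === -1) {
                names.push(key)
            }
        }
    }
    for (const name of Object.keys(byMetric)) {
        byMetric[name].sort()
    }
    return byMetric
}

// Return a label object containing exactly the declared names,
// using '' for any label this particular sample did not set.
function padLabels(labels: Record<string, string>, labelNames: string[]): Record<string, string> {
    const padded: Record<string, string> = {}
    for (const name of labelNames) {
        padded[name] = hasOwn(labels, name) ? labels[name] : ''
    }
    return padded
}

// Example snapshot: two samples of the same (hypothetical) metric with different label sets.
const snapshot: MetricSample[] = [
    { name: 'cyclotron_jobs_flushed', labels: { outcome: 'success' }, value: 3 },
    { name: 'cyclotron_jobs_flushed', labels: { outcome: 'failed', queue: 'fetch' }, value: 1 },
]

const declared = collectLabelNames(snapshot)
const paddedFirst = padLabels(snapshot[0].labels, declared['cyclotron_jobs_flushed'])
```

With the snapshot above, `declared['cyclotron_jobs_flushed']` holds both `outcome` and `queue`, and `paddedFirst` carries an empty `queue` value, so a counter declared with that label list would no longer reject the first sample for using a different label set than the second.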

plugin-server/src/cdp/utils.ts

oliverb123 · 2024-09-26T07:59:31Z

Putting this in draft for now, will return to it in a bit

posthog-bot · 2024-10-04T07:31:08Z

This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in another week. If you want to permanently keep it open, use the waiting label.

oliverb123 added 6 commits September 25, 2024 01:50

add service/process default labels

3d9378a

forgot the janitor

2339d5d

Add metrics around job creation and update flushing

2132be2

start work on metrics interop

46f8f2a

basic metrics exposing working

f8e5bc4

change histogram approach, stuck on labels

141b56f

oliverb123 requested review from bretthoerner, benjackwhite, mariusandra and MarconLP September 24, 2024 23:51

benjackwhite requested changes Sep 25, 2024

this is broken, pausing for now

941161f

oliverb123 marked this pull request as draft September 26, 2024 07:59

posthog-bot added the stale label Oct 4, 2024

oliverb123 added the waiting label (prevents stale-bot from marking the PR as stale) and removed the stale label Oct 4, 2024
