Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(cyclotron): Add metrics everywhere #25193

Draft
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

oliverb123
Copy link
Contributor

Problem

I really want to be able to answer questions like "how many bytes are being flushed to PG per second, per worker/per queue? How many updates are workers sending per flush (whats the flush batch size)?" and right now that's easy in rust land but hard in node land. A lot of the work here is laying the foundations for a stronger metrics reporting story in rust land generally (default labelling all metrics by service is a good start, we can maybe even stop prefixing metrics at some point), and then also getting rust's metrics story and node's to play nicely together.

Part of a broader push to get our delivery engine as easy to operate as possible - next on my list is a simple web UI exposed by janitors for looking at failed jobs, pausing cleanup, that kind of thing (particularly important as our binary blobs become more opaque, e.g. if we add compression, metabase becomes pretty useless) - but metrics first.

@oliverb123
Copy link
Contributor Author

This is ready for review but there's at least one open question about how labels are handled differently in rust and node land, that I'm simply not sure how to address - would appreciate input or suggestions if anyone's got them.

rust/cyclotron-node/examples/metrics.ts Show resolved Hide resolved

// The TS library seem to demand you declare labels up-front (for some reason?),
// but we have know good way of knowing the set of labels used up-front, so
// idk what to do here really - recreate the metric each time? That seems...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would need to be tested but it could be that the labels are just for typing so if you just cast the label setting as any it might JustWork. Checked the lib source and it wasn't obvious either way if it would work though...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rip... I'm gonna have to think about it a bit more than that it seems. It's easy to handle this with counter and gauges (I can just delete and recreate), but histograms are a bit trickier.

error: "Error: Added label \"outcome\" is not included in initial labelset: []
    at validateLabel (/Users/olly/Documents/work/posthog/plugin-server/node_modules/.pnpm/[email protected]/node_modules/prom-client/lib/validation.js:20:10)
    at Counter.inc (/Users/olly/Documents/work/posthog/plugin-server/node_modules/.pnpm/[email protected]/node_modules/prom-client/lib/counter.js:23:4)
    at emitCyclotronMetrics (/Users/olly/Documents/work/posthog/plugin-server/src/cdp/utils.ts:408:21)
    at CdpProcessedEventsConsumer.processBatch (/Users/olly/Documents/work/posthog/plugin-server/src/cdp/cdp-consumers.ts:408:29)
    at CdpProcessedEventsConsumer._handleKafkaBatch (/Users/olly/Documents/work/posthog/plugin-server/src/cdp/cdp-consumers.ts:595:20)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at func (/Users/olly/Documents/work/posthog/plugin-server/src/cdp/cdp-consumers.ts:338:25)
    at runInstrumentedFunction (/Users/olly/Documents/work/posthog/plugin-server/src/main/utils.ts:48:24)
    at eachBatch (/Users/olly/Documents/work/posthog/plugin-server/src/cdp/cdp-consumers.ts:334:24)
    at startConsuming (/Users/olly/Documents/work/posthog/plugin-server/src/kafka/batch-consumer.ts:305:17)
    at Object.join (/Users/olly/Documents/work/posthog/plugin-server/src/kafka/batch-consumer.ts:385:13)"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found the ever-classic 5 year old issue 🙃 siimon/prom-client#298

plugin-server/src/cdp/utils.ts Show resolved Hide resolved
@oliverb123 oliverb123 marked this pull request as draft September 26, 2024 07:59
@oliverb123
Copy link
Contributor Author

Putting this in draft for now, will return to it in a bit

@posthog-bot
Copy link
Contributor

This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in another week. If you want to permanentely keep it open, use the waiting label.

@oliverb123 oliverb123 added waiting Prevents stale-bot from marking the PR as stale. and removed stale labels Oct 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
waiting Prevents stale-bot from marking the PR as stale.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants