Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add metrics to track queue sizes and add operations - #AES-361 #3282

Merged
merged 5 commits into from
Oct 8, 2024

Conversation

gvelez17
Copy link
Contributor

@gvelez17 gvelez17 commented Oct 7, 2024

Description

From the datadog analysis, it looks like the gitcoin out of memory errors are related to queue growth specifically with the p-queue library.

This adds metrics to all instances of PQueue to track the size and add operations.

How Has This Been Tested?

Describe the tests that you ran to verify your changes. Provide instructions for reproduction.

  • Unit tests

PR checklist

Before submitting this PR, please make sure:

  • I have tagged the relevant reviewers and interested parties
  • I have updated the READMEs of affected packages
  • I have made corresponding changes to the documentation

References:

Please list relevant documentation (e.g. tech specs, articles, related work etc.) relevant to this change, and note if the documentation has been updated.

Copy link

linear bot commented Oct 7, 2024

@gvelez17
Copy link
Contributor Author

gvelez17 commented Oct 7, 2024

side node, when i build on my laptop i get a unrelated error

src/ceramic-cli-utils.ts(530,5): error TS2322: Type 'import("/Users/gv/js-ceramic/packages/cli/node_modules/dids/dist/did").DID' is not assignable to type 'import("/Users/gv/js-ceramic/node_modules/dids/dist/did").DID'.
src/ceramic-daemon.ts(333,5): error TS2322: Type 'import("/Users/gv/js-ceramic/packages/cli/node_modules/dids/dist/did").DID' is not assignable to type 'import("/Users/gv/js-ceramic/node_modules/dids/dist/did").DID'.
  Types have separate declarations of a private property '_client'.
lerna ERR! npm run build exited 2 in '@ceramicnetwork/cli'

after running npm install, not sure why this is, should be unrelated.

Comment on lines 59 to 60
Metrics.observe(TASK_QUEUE_SIZE, this.#pq.size)
Metrics.observe(TASK_QUEUE_SIZE_PENDING, this.#pq.pending)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These might be less useful because there will be many TaskQueue instances that we won't be able to differentiate since they're unnamed.

Looks like they're instantiated through NamedTaskQueue, so the metrics there should be sufficient.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, removed!

Comment on lines +192 to +193
Metrics.count(LOAD_S3_QUEUE_ADD, 1)
Metrics.observe(LOAD_S3_QUEUE_SIZE, this.#loadingLimit.size)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a get call, should we move these lines to the put call? I don't think the S3 store is used anymore, but adding these doesn't hurt.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well - the way i read this, for whatever reason the .get of the s3 storage is adding a leveldb get call into the queue. so for monitoring the queue, i want it here.

The put doesn't add to the queue, it just does the put synchronously it seems

@gvelez17 gvelez17 merged commit 60a7a9e into develop Oct 8, 2024
7 checks passed
@gvelez17 gvelez17 deleted the feat/observe-queue-sizes branch October 8, 2024 16:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants