soroban-rpc: Add /metrics endpoint #496

Closed
paulbellamy opened this issue Mar 10, 2023 · 1 comment · Fixed by #594, #600, #579, #603 or #602

Comments

@paulbellamy
Contributor

What problem does your feature solve?

As an operator, I would like access to system health metrics for my RPC instance, so that I can understand the process's resource consumption and its available capacity.

What would you like to see?

Metrics should include:

  • General
    • Version running
    • Uptime (comes out of the box with Prometheus, I think)
    • Go runtime stats such as goroutine count
    • Question: Should/can we include disk/cpu/io/ram? These would normally come from the environment monitoring (vm/k8s)
  • Ingestion
    • Same as horizon
  • txsub
    • Same as horizon
    • Note: We might not need as much as horizon here, given sendTransaction is simpler. But tx type broken down by envelope would be nice.
  • JsonRPC
    • RPS (with route and status)
    • Request duration (with route and status)
  • Logging
    • Lines/entries logged (with severity)
  • DB (similar to horizon)
    • connections
      • open
      • in-use
      • waited-for
      • wait-duration
    • queries per-method
      • mean duration
      • count
    • disk size
      • total
      • just events
      • just ledger entries
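
As an illustration of the DB connection items above, here is a minimal sketch of how the four values could be read from the Go standard library's `sql.DBStats`, which `(*sql.DB).Stats()` returns. It uses only the standard library rather than a Prometheus client, and the metric names, `connectionGauges`, and `sampleStats` are hypothetical names chosen for this example, not anything from soroban-rpc or horizon.

```go
package main

import (
	"database/sql"
	"fmt"
	"sort"
	"time"
)

// connectionGauges maps the four connection metrics from the list
// (open, in-use, waited-for, wait-duration) onto fields of sql.DBStats.
// The metric names are hypothetical and exist only for this sketch.
func connectionGauges(s sql.DBStats) map[string]float64 {
	return map[string]float64{
		"db_connections_open":         float64(s.OpenConnections),
		"db_connections_in_use":       float64(s.InUse),
		"db_connections_waited_for":   float64(s.WaitCount),
		"db_connections_wait_seconds": s.WaitDuration.Seconds(),
	}
}

// sampleStats returns hard-coded values so the sketch runs without a
// database driver. A real service would call (*sql.DB).Stats() instead.
func sampleStats() sql.DBStats {
	return sql.DBStats{
		OpenConnections: 5,
		InUse:           2,
		WaitCount:       7,
		WaitDuration:    1500 * time.Millisecond,
	}
}

func main() {
	gauges := connectionGauges(sampleStats())

	// Sort the names so the output order is stable between runs.
	names := make([]string, 0, len(gauges))
	for name := range gauges {
		names = append(names, name)
	}
	sort.Strings(names)

	for _, name := range names {
		fmt.Printf("%s %g\n", name, gauges[name])
	}
}
```

`WaitCount` and `WaitDuration` are cumulative totals since the pool was opened, so they would probably be exported as counters rather than gauges. The per-method query metrics and the disk sizes are not available from `sql.DBStats` and would need separate instrumentation.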

We should also provide a document with example alerts and their expected response, so operators know how to configure their own alerts.

What alternatives are there?

@mollykarcher mollykarcher moved this from Backlog to Next Sprint Proposal in Platform Scrum Mar 22, 2023
@sreuland sreuland moved this from Next Sprint Proposal to Current Sprint in Platform Scrum Mar 28, 2023
@tamirms tamirms self-assigned this Apr 6, 2023
@tamirms tamirms moved this from Current Sprint to In Progress in Platform Scrum Apr 6, 2023
@tamirms
Contributor

tamirms commented Apr 20, 2023

@paulbellamy I looked into adding metrics for:

  • DB disk size
    • total
    • just events
    • just ledger entries

We can obtain this information for SQLite using the dbstat virtual table. However, our SQLite library does not compile SQLite with the dbstat extension (see mattn/go-sqlite3#886), so I don't think we can obtain these metrics currently. We can still set up a metric to monitor the size of the volume containing the SQLite database in Kubernetes.
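
A related but different approximation, offered only as a sketch and not as what was implemented: the "total" figure could be estimated inside the process by reading the size of the database file itself. This is not the volume-level metric described above and gives no per-table breakdown (events versus ledger entries), which would still require dbstat. `fileSizeBytes` and `makeSampleFile` are hypothetical helpers, and the temporary file stands in for a real database file whose path is not known here.

```go
package main

import (
	"fmt"
	"os"
)

// fileSizeBytes returns the size of the file at path as reported by the
// filesystem. For a SQLite database this covers the main database file only.
func fileSizeBytes(path string) (int64, error) {
	info, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	return info.Size(), nil
}

// makeSampleFile writes n zero bytes to a temporary file that stands in for
// a database file. It returns the path and a cleanup function.
func makeSampleFile(n int) (string, func(), error) {
	f, err := os.CreateTemp("", "example-*.sqlite")
	if err != nil {
		return "", nil, err
	}
	cleanup := func() { os.Remove(f.Name()) }
	if _, err := f.Write(make([]byte, n)); err != nil {
		f.Close()
		cleanup()
		return "", nil, err
	}
	if err := f.Close(); err != nil {
		cleanup()
		return "", nil, err
	}
	return f.Name(), cleanup, nil
}

func main() {
	path, cleanup, err := makeSampleFile(4096)
	if err != nil {
		fmt.Println("could not create sample file:", err)
		return
	}
	defer cleanup()

	size, err := fileSizeBytes(path)
	if err != nil {
		fmt.Println("could not stat file:", err)
		return
	}
	fmt.Printf("database file size: %d bytes\n", size)
}
```

If the database runs in WAL mode, recent writes live in a `-wal` sidecar file next to the main file, so the sidecar would need to be included for the number to track actual disk usage.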
