Expose network metrics in system_metric
API
#3386
Comments
Part of the issue here is that I don't believe we've settled on a good story for exposing metrics in general. The disk-related endpoints are one potential path, where we create a specific endpoint for each of the metrics we'd like to expose. On the other end of the spectrum, one could imagine a pretty general way to query all the metrics we have; a strawman would be something like accepting full SQL queries for selecting data from the timeseries. Those two have lots of tradeoffs, some of which were discussed in RFD 304. There was no general resolution, and without someone to focus on this it was hard to drive a prototype forward.
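To make the two ends of that spectrum concrete, here is a minimal sketch, not the actual API. The in-memory `STORE`, the sample values, the `target:metric` key format, and both function names are assumptions made up for illustration; only the metric names come from this issue.

```python
from typing import Callable

# (timestamp, value) pair; a stand-in for a real timeseries sample.
Sample = tuple[int, float]

# Hypothetical in-memory store keyed by "target:metric". Values are invented.
STORE: dict[str, list[Sample]] = {
    "collection_target:cpus_provisioned": [(1, 4.0), (2, 8.0)],
    "collection_target:ram_provisioned": [(1, 16.0), (2, 32.0)],
}


# Approach 1: one dedicated function (think: one endpoint) per metric.
# Every new metric needs a new function like this one.
def cpus_provisioned(start: int, end: int) -> list[Sample]:
    samples = STORE["collection_target:cpus_provisioned"]
    return [s for s in samples if start <= s[0] < end]


# Approach 2: one general entry point that takes the timeseries name and
# an arbitrary filter, standing in for the "full query" strawman.
def query(name: str, predicate: Callable[[Sample], bool]) -> list[Sample]:
    return [s for s in STORE.get(name, []) if predicate(s)]
```

The first form is easy to validate and document but grows with every metric; the second needs no changes for new metrics but pushes the question of what a caller may ask for into the query itself.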
I am inclined to keep punting on a per-metric basis until we can get someone working on this problem full-time. Seeing how we need to query these metrics for our own purposes is useful input for designing the better system.
@bnaecker - Thanks for pointing me back to RFD 304. I've read it previously but definitely failed to recall some of the concerns raised there when filing this ticket. This may be one of the cases that can benefit from having the early version of the metrics API classified as "experimental". I think there is value in continuing to expand the I just think that it's worse to keep metrics in limbo till we know what the customer needs. With rack-level objects not owned by end-users (e.g. datalinks, sleds, physical disks), we ourselves are going to be the first customer.
@david-crespo - Given the pending design decisions, a per-metric implementation sounds good. If the implementation is going to be more costly than expected, please bring it up for discussion.
I think this might be orthogonal to what you're proposing, but if the goal is for Oxide developers to consume metrics, then you can currently read any data available in ClickHouse with the Is that what you're looking for? Or is the goal definitely to expose things in the API and console?
I don't intend to have more metrics exposed in the console. It sounds like the
Implemented in #5273
The API endpoint currently exposes only metrics under the target named collection_target, i.e. virtual_disk_space_provisioned, cpus_provisioned, ram_provisioned. It'll be good to loosen the restriction so that the same API can be used for querying other system-level metrics, e.g. the recently added networking metrics for data_link, without the need to update the API every time something new is added.
The main goal is to allow easier access to the metrics when engineers work with customers to get more debugging data.
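One way to picture "loosening the restriction" is to validate requests against a registry of known targets and metrics instead of a fixed list baked into the API. The following is a minimal sketch under that assumption; `MetricRegistry`, its methods, the `target:metric` name format, and the `bytes_sent` metric are hypothetical and not taken from the actual codebase.

```python
from dataclasses import dataclass, field


@dataclass
class MetricRegistry:
    """Hypothetical registry mapping each target to its metric names."""

    targets: dict[str, set[str]] = field(default_factory=dict)

    def register(self, target: str, metric: str) -> None:
        # Adding a metric is a data change, not an API change.
        self.targets.setdefault(target, set()).add(metric)

    def timeseries_name(self, target: str, metric: str) -> str:
        # Reject anything that was never registered.
        if metric not in self.targets.get(target, set()):
            raise KeyError(f"unknown timeseries {target}:{metric}")
        return f"{target}:{metric}"

    def available(self) -> list[str]:
        # Sorted list of every registered "target:metric" name.
        return sorted(
            f"{t}:{m}" for t, metrics in self.targets.items() for m in metrics
        )


registry = MetricRegistry()
for name in ("virtual_disk_space_provisioned", "cpus_provisioned", "ram_provisioned"):
    registry.register("collection_target", name)
# "bytes_sent" is an invented metric name used only for illustration.
registry.register("data_link", "bytes_sent")
```

With this shape, a single query handler can accept any registered `target` and `metric` pair, so new networking metrics would only require a `register` call.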
We might also want to bring back the timeseries_schema endpoint (it was taken out some time back) so that users can see what metrics are available. I don't recall what the response looked like when it was there but it's probably something like this:
cc @david-crespo