Support prometheus metrics #314
Conversation
Wooow nice work on this!!! Really excited about this work!
Looks like there are a few flaky tests.
Wow this looks like great progress to me (but i lack experience to be confident about being the sole person reviewing this).
test/api_spec.js (outdated)

```diff
@@ -13,6 +13,7 @@ describe("API Tests", function () {
   var apiPort = port + 1;
   var proxy;
   var apiUrl = "http://127.0.0.1:" + apiPort + "/api/routes";
+  var metricsUrl = "http://127.0.0.1:" + apiPort + "/metrics";
```
Will the /metrics endpoint be exposed like any other endpoint on the same port? If not, what decides whether it listens on localhost only or accepts incoming requests from other IPs?
Do you have a suggestion for how we should handle this /metrics endpoint from a security standpoint? I assume we'd prefer not to expose it publicly, or at least be able to control that somehow.
From my perspective, the metrics endpoint should not be public to the internet. However, putting it on the API server may cause problems if the API server is protected with an API key and/or a client TLS certificate, which may not be available in your Prometheus server. On the other hand, if you use neither an API key nor a client TLS certificate, having another server may be too much. So, ideally, we should have two options: add the metrics endpoint to the API server, or create a dedicated server for the metrics endpoint.
What do you think?
Ah, hmmm. I think I'm positive towards exposing it via the API server on /metrics, but not having that route require an API token. Or creating a dedicated server for the metrics endpoint...
Not confident about this, hmmm... I know I dislike re-using an API server access token for access to the metrics endpoint, at least.
Perhaps the most flexible option for the future is to have a dedicated server? I don't know how complex it is to implement the various options, and reducing technical complexity is also valuable.
> Perhaps the most flexible option for the future is to have a dedicated server?
Agree. I think it's the most flexible. We can drop the metrics endpoint from the API server entirely, because having two ways to serve metrics is complicated and may confuse users.
@dtaniwaki and @minrk, what do you think?
- Should we expose the metrics on: the proxy server (typically proxying traffic), the proxy-api server (typically controlling traffic routing), and/or a dedicated metrics server (just serving metrics)?
- What, if any, access control should be implemented?
  - A fixed access token set via an environment variable?
  - A way to limit what network traffic it will accept: does it accept traffic only from localhost, some local network, or all of the internet, etc.?
- What should we aim for in this PR, and what should we aim for in future work?
Currently I think:
- Expose the metrics on: only a dedicated metrics server
- Access control:
  - A fixed access token: a bonus to have this
  - Limiting accepted network traffic: it feels almost required that we have this
- Aim for in this PR: a dedicated server that listens on a network interface that can accept traffic from a local network. Future improvements: a fixed access token set via an environment variable.
I'm to a large extent thinking about the following situations:
- This software is running on a VM where JupyterHub and Prometheus are also running
- This software is running in a k8s Pod in a k8s cluster, and JupyterHub as well as Prometheus are running in separate pods within the k8s cluster.
I care about the following:
- To not expose /metrics to everyone who can route to it
- To be able to expose /metrics to prometheus if it is running either on the same machine or in the same k8s cluster
- To protect /metrics with some password dedicated to this purpose even though we don't have TLS communication
I don't care much about the following:
- To protect the /metrics endpoint with some password over a TLS/HTTPS connection using a self-signed cert or provided cert
I think running it under its own configurable port may be the easiest for now. You don't have to worry about authentication; a JupyterHub admin shouldn't have to worry too much about the new port if they're following best practice and running a firewall; and it can easily be blocked on k8s by a network policy.
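For the k8s case, blocking the metrics port with a network policy could look roughly like the sketch below. The pod labels, policy name, and port number are all hypothetical and would depend on the actual deployment; this only illustrates the idea of allowing ingress to the metrics port solely from a Prometheus pod.

```yaml
# Illustrative NetworkPolicy: only pods labeled app=prometheus may reach
# the proxy pod's (hypothetical) metrics port 8002.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: chp-metrics-access
spec:
  podSelector:
    matchLabels:
      app: configurable-http-proxy
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: prometheus
      ports:
        - protocol: TCP
          port: 8002
```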
👍 to its own `--metrics-port` (and `--metrics-ip`) config for its own server, disabled if unspecified (the default).
I updated the code based on your feedback. Would you review it again?
Wonderful, thank you @dtaniwaki!
Usage summary by Erik

By explicitly setting `--metrics-port=<some port number>`, you will be able to acquire Prometheus-formatted metrics from `http://<hostname, localhost, or ip>/metrics`. `--metrics-ip` is also an added flag; it defaults to the permissive `0.0.0.0`, which makes it listen on all IPv4 addresses rather than just traffic from localhost.

Fixes #52
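As a usage illustration (the port numbers here are arbitrary choices, not defaults from this PR):

```shell
# Start the proxy with a dedicated metrics server on port 8002,
# restricted to localhost (flag values are illustrative).
configurable-http-proxy --metrics-port 8002 --metrics-ip 127.0.0.1

# Scrape the Prometheus-formatted metrics.
curl http://127.0.0.1:8002/metrics
```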
I implemented the Prometheus support with the prom-client package, and implemented a metrics endpoint. Please suggest more appropriate metric names if you have any.
I actually thought we should serve metrics on another port than the API port, a requirement that has come up in many other systems (e.g. etcd-io/etcd#8060), but I didn't do it in this PR so that this feature could be reviewed first.
Here's the checklist @minrk listed in #52.
Here's the response of the metrics endpoint.
http://localhost:8001/metrics