
Add prometheus exporter to the gateway #344

Closed
wants to merge 20 commits

Conversation

pcuzner
Contributor

@pcuzner pcuzner commented Nov 30, 2023

This PR introduces an embedded Prometheus exporter to the gateway, to support external monitoring and, potentially, alerting.

Signed-off-by: Paul Cuzner [email protected]

New settings control whether the prometheus endpoint
is enabled and, if so, which port it's bound to.

enable_prometheus_exporter
prometheus_port

Signed-off-by: Paul Cuzner <[email protected]>
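The two settings above might be wired up as in the following minimal sketch, assuming the gateway reads its config with Python's configparser. The `[gateway]` section name and the fallback values shown are illustrative assumptions, not taken from the PR.

```python
# Sketch: reading the new exporter settings with configparser.
# The [gateway] section name and fallback values are assumptions.
from configparser import ConfigParser

config = ConfigParser()
config.read_string("""
[gateway]
enable_prometheus_exporter = True
prometheus_port = 10008
""")

enabled = config.getboolean('gateway', 'enable_prometheus_exporter', fallback=False)
port = config.getint('gateway', 'prometheus_port', fallback=10008)
```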
Provides the NVMeOFCollector that handles the
prometheus scrape request, and makes all the required
rpc calls to gather gateway stats.

Signed-off-by: Paul Cuzner <[email protected]>
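For readers unfamiliar with the prometheus_client custom-collector API, here is a minimal sketch in the spirit of the NVMeOFCollector described above. The metric names mirror the sample scrape output later in this thread, but the class name and the stats source are stand-ins, not the PR's actual code (which gathers stats via SPDK RPC calls).

```python
# Hedged sketch of a prometheus_client custom collector. The stats
# callable is a stand-in for the real SPDK RPC client plumbing.
from prometheus_client.core import CounterMetricFamily, GaugeMetricFamily


class NVMeOFCollectorSketch:
    def __init__(self, get_bdev_stats):
        # callable returning a list of per-bdev stat dicts
        self.get_bdev_stats = get_bdev_stats

    def collect(self):
        capacity = GaugeMetricFamily(
            "ceph_nvmeof_bdev_capacity_bytes", "BDEV Capacity",
            labels=["bdev_name"])
        reads = CounterMetricFamily(
            "ceph_nvmeof_bdev_reads_completed",
            "Total number of read operations completed",
            labels=["bdev_name"])
        for bdev in self.get_bdev_stats():
            capacity.add_metric([bdev["name"]], bdev["size_bytes"])
            reads.add_metric([bdev["name"]], bdev["num_read_ops"])
        yield capacity
        yield reads
```

A collector like this is registered once (e.g. `REGISTRY.register(...)`) and its `collect` method is invoked on every scrape.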
The main "serve" method can now start the prometheus
exporter based on the config option being set.

Signed-off-by: Paul Cuzner <[email protected]>
@pcuzner pcuzner added the enhancement New feature or request label Nov 30, 2023
@pcuzner
Contributor Author

pcuzner commented Nov 30, 2023

Here's an example of what gets returned to Prometheus:

# HELP python_gc_objects_collected_total Objects collected during gc
# TYPE python_gc_objects_collected_total counter
python_gc_objects_collected_total{generation="0"} 172326.0
python_gc_objects_collected_total{generation="1"} 23546.0
python_gc_objects_collected_total{generation="2"} 868.0
# HELP python_gc_objects_uncollectable_total Uncollectable objects found during GC
# TYPE python_gc_objects_uncollectable_total counter
python_gc_objects_uncollectable_total{generation="0"} 0.0
python_gc_objects_uncollectable_total{generation="1"} 0.0
python_gc_objects_uncollectable_total{generation="2"} 0.0
# HELP python_gc_collections_total Number of times this generation was collected
# TYPE python_gc_collections_total counter
python_gc_collections_total{generation="0"} 298.0
python_gc_collections_total{generation="1"} 27.0
python_gc_collections_total{generation="2"} 1.0
# HELP python_info Python platform information
# TYPE python_info gauge
python_info{implementation="CPython",major="3",minor="9",patchlevel="18",version="3.9.18"} 1.0
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.694437376e+09
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 5.6610816e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.70122162981e+09
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 2.83
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 23.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 20480.0
# HELP ceph_nvmeof_spdk_metadata SPDK Version information
# TYPE ceph_nvmeof_spdk_metadata gauge
ceph_nvmeof_spdk_metadata{version="SPDK v23.01.1"} 1.0
# HELP ceph_nvmeof_bdev_capacity_bytes BDEV Capacity
# TYPE ceph_nvmeof_bdev_capacity_bytes gauge
ceph_nvmeof_bdev_capacity_bytes{bdev_name="disk1"} 1.073741824e+010
ceph_nvmeof_bdev_capacity_bytes{bdev_name="disk2"} 2.147483648e+010
ceph_nvmeof_bdev_capacity_bytes{bdev_name="disk3"} 4.294967296e+010
# HELP ceph_nvmeof_bdev_metadata BDEV Metadata
# TYPE ceph_nvmeof_bdev_metadata gauge
ceph_nvmeof_bdev_metadata{bdev_name="disk1",pool_name="rbd",rbd_name="disk1"} 1.0
ceph_nvmeof_bdev_metadata{bdev_name="disk2",pool_name="rbd",rbd_name="disk2"} 1.0
ceph_nvmeof_bdev_metadata{bdev_name="disk3",pool_name="rbd",rbd_name="disk3"} 1.0
# HELP ceph_nvmeof_bdev_reads_completed_total Total number of read operations completed
# TYPE ceph_nvmeof_bdev_reads_completed_total counter
ceph_nvmeof_bdev_reads_completed_total{bdev_name="disk1"} 2.0
ceph_nvmeof_bdev_reads_completed_total{bdev_name="disk2"} 2.0
ceph_nvmeof_bdev_reads_completed_total{bdev_name="disk3"} 2.0
# HELP ceph_nvmeof_bdev_writes_completed_total Total number of write operations completed
# TYPE ceph_nvmeof_bdev_writes_completed_total counter
ceph_nvmeof_bdev_writes_completed_total{bdev_name="disk1"} 0.0
ceph_nvmeof_bdev_writes_completed_total{bdev_name="disk2"} 0.0
ceph_nvmeof_bdev_writes_completed_total{bdev_name="disk3"} 0.0
# HELP ceph_nvmeof_bdev_read_bytes_total Total number of bytes read successfully
# TYPE ceph_nvmeof_bdev_read_bytes_total counter
ceph_nvmeof_bdev_read_bytes_total{bdev_name="disk1"} 36864.0
ceph_nvmeof_bdev_read_bytes_total{bdev_name="disk2"} 36864.0
ceph_nvmeof_bdev_read_bytes_total{bdev_name="disk3"} 36864.0
# HELP ceph_nvmeof_bdev_written_bytes_total Total number of bytes written successfully
# TYPE ceph_nvmeof_bdev_written_bytes_total counter
ceph_nvmeof_bdev_written_bytes_total{bdev_name="disk1"} 0.0
ceph_nvmeof_bdev_written_bytes_total{bdev_name="disk2"} 0.0
ceph_nvmeof_bdev_written_bytes_total{bdev_name="disk3"} 0.0
# HELP ceph_nvmeof_bdev_read_seconds_total Total time spent servicing READ I/O
# TYPE ceph_nvmeof_bdev_read_seconds_total counter
ceph_nvmeof_bdev_read_seconds_total{bdev_name="disk1"} 0.0011596725
ceph_nvmeof_bdev_read_seconds_total{bdev_name="disk2"} 0.0003732117857142857
ceph_nvmeof_bdev_read_seconds_total{bdev_name="disk3"} 0.0004869682142857143
# HELP ceph_nvmeof_bdev_write_seconds_total Total time spent servicing WRITE I/O
# TYPE ceph_nvmeof_bdev_write_seconds_total counter
ceph_nvmeof_bdev_write_seconds_total{bdev_name="disk1"} 0.0
ceph_nvmeof_bdev_write_seconds_total{bdev_name="disk2"} 0.0
ceph_nvmeof_bdev_write_seconds_total{bdev_name="disk3"} 0.0
# HELP ceph_nvmeof_reactor_seconds_total time reactor thread active with I/O
# TYPE ceph_nvmeof_reactor_seconds_total counter
ceph_nvmeof_reactor_seconds_total{mode="busy",name="nvmf_tgt_poll_group_0"} 0.013692368928571428
ceph_nvmeof_reactor_seconds_total{mode="idle",name="nvmf_tgt_poll_group_0"} 617.5284076210714
# HELP ceph_nvmeof_subsystem_metadata Metadata describing the subsystem configuration
# TYPE ceph_nvmeof_subsystem_metadata gauge
ceph_nvmeof_subsystem_metadata{allow_any_host="no",model_number="SPDK bdev Controller",nqn="nqn.2016-06.io.spdk:cnode1",serial_number="SPDK00000000000001"} 1.0
# HELP ceph_nvmeof_subsystem_listener_count Number of listeners addresses used by the subsystem
# TYPE ceph_nvmeof_subsystem_listener_count gauge
ceph_nvmeof_subsystem_listener_count{nqn="nqn.2016-06.io.spdk:cnode1"} 1.0
# HELP ceph_nvmeof_subsystem_host_count Number of hosts defined to the subsystem
# TYPE ceph_nvmeof_subsystem_host_count gauge
ceph_nvmeof_subsystem_host_count{nqn="nqn.2016-06.io.spdk:cnode1"} 1.0
# HELP ceph_nvmeof_subsystem_namespace_limit Maximum namespaces supported
# TYPE ceph_nvmeof_subsystem_namespace_limit gauge
ceph_nvmeof_subsystem_namespace_limit{nqn="nqn.2016-06.io.spdk:cnode1"} 32.0
# HELP ceph_nvmeof_subsystem_namespace_metadata Namespace information for the subsystem
# TYPE ceph_nvmeof_subsystem_namespace_metadata gauge
ceph_nvmeof_subsystem_namespace_metadata{bdev_name="disk1",name="disk1",nqn="nqn.2016-06.io.spdk:cnode1",nsid="1"} 1.0
ceph_nvmeof_subsystem_namespace_metadata{bdev_name="disk2",name="disk2",nqn="nqn.2016-06.io.spdk:cnode1",nsid="2"} 1.0
ceph_nvmeof_subsystem_namespace_metadata{bdev_name="disk3",name="disk3",nqn="nqn.2016-06.io.spdk:cnode1",nsid="3"} 1.0

@pcuzner pcuzner requested review from gbregman and epuertat November 30, 2023 20:02
Contributor

@gbregman gbregman left a comment


Looks OK to me. But, as I have no idea about Prometheus I'm not sure I'm the right person to review.

2 questions:

  • Shouldn't we have a stop function in the code?
  • Should we add a test to github for this?

Member

@epuertat epuertat left a comment


Nice to see the Prometheus client working! Great work! I left a few comments over there.

Requires at least 0.19 to satisfy the requirement of an
https endpoint.

Signed-off-by: Paul Cuzner <[email protected]>
The prometheus_bdev_pools setting is a comma-separated
string of the rbd pools that should emit bdev metrics.
If this setting is not defined, metrics for all bdevs
will be emitted.

Signed-off-by: Paul Cuzner <[email protected]>
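The filtering described above could be sketched as follows; the function name and parsing details are illustrative, not lifted from the PR:

```python
# Sketch of prometheus_bdev_pools handling: a comma-separated pool list;
# an empty/unset value means every bdev emits metrics.
def pools_filter(setting):
    """Return a predicate deciding whether a bdev's pool should emit metrics."""
    pools = {p.strip() for p in setting.split(',') if p.strip()} if setting else set()

    def should_emit(pool_name):
        # no filter configured -> emit everything
        return not pools or pool_name in pools

    return should_emit
```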
The following changes have been made:
- enable endpoint to be https (requiring 0.19 or above)
- use a prometheus_bdev_pools parameter to govern
  the metrics returned
- use InfoMetricFamily for simple static metrics
- additional log messages added for debug purposes

Signed-off-by: Paul Cuzner <[email protected]>
@pcuzner pcuzner requested a review from epuertat December 4, 2023 01:50
@pcuzner
Contributor Author

pcuzner commented Dec 4, 2023

Looks OK to me. But, as I have no idea about Prometheus I'm not sure I'm the right person to review.

2 questions:

* Shouldn't we have a stop function in the code?

* Should we add a test to github for this?

@gbregman the prometheus endpoint runs as a daemon thread and doesn't maintain state, so shutdown is generally not a consideration.

As far as a github CI test is concerned, I think that makes sense. Once we get this in, I'm happy to help with that.
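The daemon-thread point can be illustrated with a small sketch: a daemon thread is torn down automatically when the main process exits, so a stateless exporter needs no explicit stop/cleanup path. The target function here is a placeholder, not the real scrape handler.

```python
# Sketch: the exporter as a daemon thread. Nothing is flushed or closed on
# shutdown, so process exit is sufficient - no stop function required.
import threading
import time


def serve_metrics():
    while True:          # stateless loop; placeholder for the scrape handler
        time.sleep(1)


t = threading.Thread(target=serve_metrics, daemon=True, name="prometheus-exporter")
t.start()
```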

@caroav
Collaborator

caroav commented Dec 7, 2023

@pcuzner, some comments and questions:

  • Why do we need to export the static configuration, i.e. namespace metadata, namespace limit, host count, listener count, subsystem metadata, spdk version info, bdev_metadata, bdev_capacity? I think this should be available only from the CLI (if anything is missing in the CLI we will add it there).
  • We should not talk the bdev language; we should use namespace. So can we change bdev to namespace?
  • I'm also not sure why we need to export IO statistics per namespace. I think it would be better to have:
  1. Read/write IOPS totals (avg per few seconds) per GW and per subsystem (I understand that the calculation of the avg will probably be done by Grafana?)
  2. Read/write IOPS latency (avg per few seconds) per GW and per subsystem (I understand that the calculation of the avg will probably be done by Grafana?)

@oritwas WDYS?

@pcuzner
Contributor Author

pcuzner commented Dec 7, 2023

@caroav to answer your questions:

  • to get any iostats you have to process bdev_get_iostat - so you already have per-bdev information
  • unless you track the bdev, you can't relate a namespace back to a specific rbd pool. If you're interested in troubleshooting, I'd recommend this.
  • nsids and names are unique to the controller, not unique to the environment - bdevs are unique - and unique keys are good!
  • only tracking gateway/subsystem performance offers no troubleshooting capability when chasing a datastore performance issue, or trending capacity or performance over time
  • without metadata like capacity, you can't provide simple information like how much data is exposed through the gateway - how much today, next week, next month...
  • with metadata you can enable a) richer alerts, b) queries that merge metrics with metadata for more meaningful visualisations, and c) subsystem-level stats derived in prometheus from the namespace metadata merged with the bdev io stats - either in the promql query, or by adding recording rules that tell the prometheus server to generate new stats from your raw data at ingest time
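As an illustration of the metadata-merge idea, subsystem-level throughput could be derived in PromQL by joining the bdev counter with the namespace metadata metric on bdev_name. The metric names come from the sample scrape output above, but the query itself is illustrative, not part of the PR:

```promql
# per-subsystem read throughput, via a metadata join; the metadata gauge's
# value is 1.0, so the multiply just attaches the nqn label to each rate
sum by (nqn) (
  rate(ceph_nvmeof_bdev_read_bytes_total[5m])
  * on (bdev_name) group_left (nqn)
  ceph_nvmeof_subsystem_namespace_metadata
)
```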

The exporter now runs a background thread to collect
the spdk stats, to reduce the effect of DoS attacks. Also,
the spdk rpc calls are now timed and the timings returned
within the prometheus payload.

Signed-off-by: Paul Cuzner <[email protected]>
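A hedged sketch of the two ideas in this commit: timing the rpc calls with a decorator, and caching the stats behind a background refresh thread so that a flood of scrape requests cannot translate into a flood of SPDK RPC calls. All names here are illustrative, not the PR's actual code.

```python
# Sketch: a timing decorator plus a cached-stats refresher. Scrapes read
# the snapshot; only the background loop talks to the (stand-in) backend.
import functools
import threading
import time


def timer(method):
    @functools.wraps(method)
    def call(self, *args, **kwargs):
        start = time.monotonic()
        result = method(self, *args, **kwargs)
        # duration can be exposed in the prometheus payload
        self.last_duration = time.monotonic() - start
        return result
    return call


class StatsCache:
    def __init__(self, fetch, interval=10):
        self.fetch = fetch          # stand-in for the SPDK RPC calls
        self.interval = interval
        self.snapshot = None
        self.last_duration = 0.0

    @timer
    def refresh(self):
        self.snapshot = self.fetch()

    def start(self):
        def loop():
            while True:
                self.refresh()
                time.sleep(self.interval)
        threading.Thread(target=loop, daemon=True).start()
```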
@pcuzner pcuzner requested review from epuertat and idryomov December 14, 2023 23:16
As per comments from Aviv.

Signed-off-by: Paul Cuzner <[email protected]>
Member

@epuertat epuertat left a comment


Looks mostly good! Thanks for addressing my comments.

I left a couple of cosmetic comments, but the only thing I'm more concerned about is the HTTP fallback when certificates are not found.

@pcuzner pcuzner requested a review from epuertat December 20, 2023 02:45
@pcuzner
Contributor Author

pcuzner commented Dec 20, 2023

Also note that I checked https mode by adding a crt and key to the container, and it switched to https mode (confirmed with curl).

@pcuzner
Contributor Author

pcuzner commented Dec 20, 2023

I ran tests for the updated way to handle SSL on/off; the scenarios and log messages are below:

SSL requested, but keys not found

nvmeof_1                | INFO:control.server:Prometheus endpoint is enabled
nvmeof_1                | ERROR:control.prometheus:Unable to start prometheus exporter - missing cert/key file(s)

SSL requested and keys found

nvmeof_1                | INFO:control.server:Prometheus endpoint is enabled
nvmeof_1                | INFO:control.state:Connected to Ceph with version "18.2.0 (5dd24139a1eada541a3bc16b6941c5dde975e26d) reef (stable)"
nvmeof_1                | INFO:control.discovery:discovery addr: 0.0.0.0 port: 8009
nvmeof_1                | INFO:control.prometheus:Prometheus exporter running in https mode, listening on port 10008
nvmeof_1                | INFO:control.prometheus:Stats for all bdevs will be provided
nvmeof_1                | INFO:control.prometheus:Starting SPDK collector thread, refreshing every 10 secs
nvmeof_1                | DEBUG:control.prometheus:Processing prometheus scrape request
nvmeof_1                | DEBUG:control.prometheus:Refreshing stats from SPDK
nvmeof_1                | DEBUG:control.prometheus:Stats refresh completed in 0.032 secs.

SSL disabled by config

nvmeof_1                | INFO:control.server:Prometheus endpoint is enabled
nvmeof_1                | INFO:control.prometheus:Prometheus exporter running in http mode, listening on port 10008
nvmeof_1                | INFO:control.prometheus:Stats for all bdevs will be provided
nvmeof_1                | INFO:control.prometheus:Starting SPDK collector thread, refreshing every 10 secs
nvmeof_1                | DEBUG:control.prometheus:Processing prometheus scrape request

exporter disabled

nvmeof_1                | INFO:control.server:Prometheus endpoint is disabled. To enable, set the config option 'enable_prometheus_exporter = True'
nvmeof_1                | INFO:control.state:Connected to Ceph with version "18.2.0 (5dd24139a1eada541a3bc16b6941c5dde975e26d) reef (stable)"

@pcuzner
Contributor Author

pcuzner commented Dec 20, 2023

@epuertat please review SSL changes

@pcuzner
Contributor Author

pcuzner commented Dec 20, 2023

Do not merge - once the outstanding concerns are addressed, I'll squash the commits first.

Member

@epuertat epuertat left a comment


LGTM! Thanks a lot @pcuzner for the work done!

Comment on lines +60 to +75
    if ssl:
        cert_filepath = config.get('mtls', 'server_cert')
        key_filepath = config.get('mtls', 'server_key')

        if os.path.exists(cert_filepath) and os.path.exists(key_filepath):
            httpd_ok = start_httpd(port=port, certfile=cert_filepath, keyfile=key_filepath)
        else:
            httpd_ok = False
            logger.error("Unable to start prometheus exporter - missing cert/key file(s)")
    else:
        # SSL mode explicitly disabled by config option
        httpd_ok = start_httpd(port=port)

    if httpd_ok:
        logger.info(f"Prometheus exporter running in {mode} mode, listening on port {port}")
        REGISTRY.register(NVMeOFCollector(spdk_rpc_client, config))
Member


Nit: this kind of flow is where exceptions work best:

Suggested change

    if ssl:
        cert_filepath = config.get('mtls', 'server_cert')
        key_filepath = config.get('mtls', 'server_key')
        if os.path.exists(cert_filepath) and os.path.exists(key_filepath):
            httpd_ok = start_httpd(port=port, certfile=cert_filepath, keyfile=key_filepath)
        else:
            httpd_ok = False
            logger.error("Unable to start prometheus exporter - missing cert/key file(s)")
    else:
        # SSL mode explicitly disabled by config option
        httpd_ok = start_httpd(port=port)
    if httpd_ok:
        logger.info(f"Prometheus exporter running in {mode} mode, listening on port {port}")
        REGISTRY.register(NVMeOFCollector(spdk_rpc_client, config))

    try:
        if ssl:
            # No need to check for certificates, this method already raises an Exception if not found
            start_http_server(
                port=port,
                certfile=config.get('mtls', 'server_cert'),
                keyfile=config.get('mtls', 'server_key')
            )
        else:
            # SSL mode explicitly disabled by config option
            httpd_ok = start_httpd(port=port)
    except Exception:
        # logger.exception() automatically attaches the Exception raised
        logger.exception("Failed to start the prometheus http server")
    else:
        logger.info(f"Prometheus exporter running in {mode} mode, listening on port {port}")
        REGISTRY.register(NVMeOFCollector(spdk_rpc_client, config))

Contributor Author


I check for files to give more meaningful error messages.


def start_httpd(**kwargs):
    """Start the prometheus http endpoint, catching any exception"""
    logger = logging.getLogger(__name__)
Member


Another nit here: it's common practice to get the logger at the top of the file rather than inside every method; __name__ is constant throughout a file, so all these calls will get the same singleton:

import logging
logger = logging.getLogger(__name__)



def timer(method):
    def call(self, *args, **kwargs):
Member


Not a big deal, but with decorators one should do this, so that the returned method has all the metadata/types/signature/docstring/... of the decorated method:

Suggested change

    def call(self, *args, **kwargs):

    @functools.wraps(method)
    def call(self, *args, **kwargs):

@pcuzner
Contributor Author

pcuzner commented Dec 21, 2023

It seems at every change, I'm introducing further issues - with all these rewrites, it would have been more expedient for others to provide the feature 😄

6 participants