Update Ceph module to support new API #7723

ruflin · 2018-07-25T06:55:51Z

ceph-rest-api is replaced by ceph-mgr in newer releases (http://docs.ceph.com/docs/luminous/mgr/restful/#). See #7661 (comment) for additional details.

mtojek · 2020-02-04T14:00:35Z

@sorantis

If we want to switch to ceph-mgr, it's worth considering the Prometheus plugin. See: https://docs.ceph.com/docs/master/mgr/prometheus/
It provides pool and OSD metadata series, and disk statistics. It has been supported since the Luminous release.

If we agree to switch to the Prometheus endpoint, I need some guidance on deprecating the existing implementation.

mtojek · 2020-02-05T10:54:07Z

~~At the moment I will proceed with a new metricset cephmgr that will use the Prometheus metrics endpoint.~~
(see below)

sorantis · 2020-02-05T10:54:47Z

The existing implementation should still be valid for older versions of Ceph. Newer versions that have ceph-mgr could be handled by a separate metricset.
Using Prometheus here is an attractive option, but I'd stick to native APIs wherever possible for several reasons:

- The Prometheus module will be going through several breaking changes that will impact all light modules based on it, so I'd refrain from adding to the list.
- The Prometheus endpoints look like an extra capability that requires the user to enable them manually. In some cases that might mean changing deployment templates, adding ports to firewall rules, etc.

My recommendation would be to use native APIs wherever possible.

mtojek · 2020-02-05T11:09:10Z

It seems that we responded at the same time...

According to what we discussed offline, let's try to stick to native APIs, as the Prometheus module is not enabled by default.

sorantis · 2020-02-05T12:06:29Z

After talking more with @mtojek about this, it seems that the right way would be to use the mgr's restful module instead of prometheus, both for the reasons listed above and for security. The Prometheus endpoints currently don't support secure communication, so an implementation built on the prometheus module would require Metricbeat to be deployed locally and configured with TLS to keep communication secure. With restful there's no such limitation: Metricbeat can be deployed on another node and restful can be configured to use TLS.
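
For illustration, here is a minimal Python sketch of what a remote, TLS-verified query against the restful module could look like. It assumes the module is reachable over HTTPS and accepts HTTP basic auth with a user name and API key; the host name, port, user name, key and CA file path are all hypothetical placeholders, and this is not the Metricbeat implementation.

```python
import base64
import json
import ssl
import urllib.request


def build_request(base_url, resource, user, api_key):
    """Build a GET request for one restful resource using HTTP basic auth."""
    token = base64.b64encode(f"{user}:{api_key}".encode()).decode()
    url = base_url.rstrip("/") + "/" + resource.lstrip("/")
    return urllib.request.Request(url, headers={"Authorization": "Basic " + token})


def fetch(base_url, resource, user, api_key, ca_file=None):
    """Fetch and decode one resource, verifying the server certificate."""
    # ca_file lets the client trust a private or self-signed CA (assumption:
    # the operator exports the certificate used by the restful module).
    context = ssl.create_default_context(cafile=ca_file)
    request = build_request(base_url, resource, user, api_key)
    with urllib.request.urlopen(request, context=context, timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical values; running this requires a reachable cluster.
    osds = fetch("https://ceph-mgr.example:8003", "/osd", "metricbeat", "secret",
                 ca_file="/etc/ceph/restful-ca.pem")
    print(len(osds), "OSD entries")
```

Because the certificate is verified on the client side, the collector does not have to run on the same node as ceph-mgr.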

mtojek · 2020-02-05T19:48:33Z

@sorantis
I booted up a demo Ceph cluster to review the restful resources. To be honest, most of the data exposed via the endpoints is configuration rather than actual metrics.

Here are some of them:

/mon:

[
    {
        "addr": "172.30.0.2:3300/0",
        "in_quorum": true,
        "leader": true,
        "name": "edcb751e8aa1",
        "public_addr": "172.30.0.2:3300/0",
        "public_addrs": {
            "addrvec": [
                {
                    "addr": "172.30.0.2:3300",
                    "nonce": 0,
                    "type": "v2"
                }
            ]
        },
        "rank": 0,
        "server": "edcb751e8aa1"
    }
]

/osd:

[
    {
        "cluster_addr": "172.30.0.2:6803/186",
        "cluster_addrs": {
            "addrvec": [
                {
                    "addr": "172.30.0.2:6802",
                    "nonce": 186,
                    "type": "v2"
                },
                {
                    "addr": "172.30.0.2:6803",
                    "nonce": 186,
                    "type": "v1"
                }
            ]
        },
        "down_at": 20,
        "heartbeat_back_addr": "172.30.0.2:6807/186",
        "heartbeat_back_addrs": {
            "addrvec": [
                {
                    "addr": "172.30.0.2:6806",
                    "nonce": 186,
                    "type": "v2"
                },
                {
                    "addr": "172.30.0.2:6807",
                    "nonce": 186,
                    "type": "v1"
                }
            ]
        },
        "heartbeat_front_addr": "172.30.0.2:6805/186",
        "heartbeat_front_addrs": {
            "addrvec": [
                {
                    "addr": "172.30.0.2:6804",
                    "nonce": 186,
                    "type": "v2"
                },
                {
                    "addr": "172.30.0.2:6805",
                    "nonce": 186,
                    "type": "v1"
                }
            ]
        },
        "in": 1,
        "last_clean_begin": 4,
        "last_clean_end": 18,
        "lost_at": 0,
        "osd": 0,
        "pools": [
            1,
            2,
            3,
            4,
            5,
            6,
            7,
            8
        ],
        "primary_affinity": 1.0,
        "public_addr": "172.30.0.2:6801/186",
        "public_addrs": {
            "addrvec": [
                {
                    "addr": "172.30.0.2:6800",
                    "nonce": 186,
                    "type": "v2"
                },
                {
                    "addr": "172.30.0.2:6801",
                    "nonce": 186,
                    "type": "v1"
                }
            ]
        },
        "reweight": 1.0,
        "server": "edcb751e8aa1",
        "state": [
            "exists",
            "up"
        ],
        "up": 1,
        "up_from": 21,
        "up_thru": 21,
        "uuid": "eb1c8d6d-70c2-4511-a1b8-e9e7e5f624aa",
        "valid_commands": [
            "scrub",
            "deep-scrub",
            "repair"
        ],
        "weight": 1.0
    }
]

/pool:

[
    {
        "application_metadata": {},
        "auid": 0,
        "cache_min_evict_age": 0,
        "cache_min_flush_age": 0,
        "cache_mode": "none",
        "cache_target_dirty_high_ratio_micro": 600000,
        "cache_target_dirty_ratio_micro": 400000,
        "cache_target_full_ratio_micro": 800000,
        "create_time": "2020-02-05 17:34:09.277269",
        "crush_rule": 0,
        "erasure_code_profile": "",
        "expected_num_objects": 0,
        "fast_read": false,
        "flags": 1,
        "flags_names": "hashpspool",
        "grade_table": [],
        "hit_set_count": 0,
        "hit_set_grade_decay_rate": 0,
        "hit_set_params": {
            "type": "none"
        },
        "hit_set_period": 0,
        "hit_set_search_last_n": 0,
        "last_change": "6",
        "last_force_op_resend": "0",
        "last_force_op_resend_preluminous": "0",
        "last_force_op_resend_prenautilus": "0",
        "last_pg_merge_meta": {
            "last_epoch_clean": 0,
            "last_epoch_started": 0,
            "ready_epoch": 0,
            "source_pgid": "0.0",
            "source_version": "0'0",
            "target_version": "0'0"
        },
        "min_read_recency_for_promote": 0,
        "min_size": 1,
        "min_write_recency_for_promote": 0,
        "object_hash": 2,
        "options": {},
        "pg_autoscale_mode": "warn",
        "pg_num": 8,
        "pg_num_pending": 8,
        "pg_num_target": 8,
        "pg_placement_num_target": 8,
        "pgp_num": 8,
        "pool": 1,
        "pool_name": "rbd",
        "pool_snaps": [],
        "quota_max_bytes": 0,
        "quota_max_objects": 0,
        "read_tier": -1,
        "removed_snaps": "[]",
        "size": 1,
        "snap_epoch": 0,
        "snap_mode": "selfmanaged",
        "snap_seq": 0,
        "stripe_width": 0,
        "target_max_bytes": 0,
        "target_max_objects": 0,
        "tier_of": -1,
        "tiers": [],
        "type": 1,
        "use_gmt_hitset": true,
        "write_tier": -1
    },
    {
        "application_metadata": {
            "cephfs": {
                "data": "cephfs"
            }
        },
        "auid": 0,
        "cache_min_evict_age": 0,
        "cache_min_flush_age": 0,
        "cache_mode": "none",
        "cache_target_dirty_high_ratio_micro": 600000,
        "cache_target_dirty_ratio_micro": 400000,
        "cache_target_full_ratio_micro": 800000,
        "create_time": "2020-02-05 17:34:10.354727",
        "crush_rule": 0,
        "erasure_code_profile": "",
        "expected_num_objects": 0,
        "fast_read": false,
        "flags": 1,
        "flags_names": "hashpspool",
        "grade_table": [],
        "hit_set_count": 0,
        "hit_set_grade_decay_rate": 0,
        "hit_set_params": {
            "type": "none"
        },
        "hit_set_period": 0,
        "hit_set_search_last_n": 0,
        "last_change": "7",
        "last_force_op_resend": "0",
        "last_force_op_resend_preluminous": "0",
        "last_force_op_resend_prenautilus": "0",
        "last_pg_merge_meta": {
            "last_epoch_clean": 0,
            "last_epoch_started": 0,
            "ready_epoch": 0,
            "source_pgid": "0.0",
            "source_version": "0'0",
            "target_version": "0'0"
        },
        "min_read_recency_for_promote": 0,
        "min_size": 1,
        "min_write_recency_for_promote": 0,
        "object_hash": 2,
        "options": {},
        "pg_autoscale_mode": "warn",
        "pg_num": 8,
        "pg_num_pending": 8,
        "pg_num_target": 8,
        "pg_placement_num_target": 8,
        "pgp_num": 8,
        "pool": 2,
        "pool_name": "cephfs_data",
        "pool_snaps": [],
        "quota_max_bytes": 0,
        "quota_max_objects": 0,
        "read_tier": -1,
        "removed_snaps": "[]",
        "size": 1,
        "snap_epoch": 0,
        "snap_mode": "selfmanaged",
        "snap_seq": 0,
        "stripe_width": 0,
        "target_max_bytes": 0,
        "target_max_objects": 0,
        "tier_of": -1,
        "tiers": [],
        "type": 1,
        "use_gmt_hitset": true,
        "write_tier": -1
    },
    {
        "application_metadata": {
            "cephfs": {
                "metadata": "cephfs"
            }
        },
        "auid": 0,
        "cache_min_evict_age": 0,
        "cache_min_flush_age": 0,
        "cache_mode": "none",
        "cache_target_dirty_high_ratio_micro": 600000,
        "cache_target_dirty_ratio_micro": 400000,
        "cache_target_full_ratio_micro": 800000,
        "create_time": "2020-02-05 17:34:11.310873",
        "crush_rule": 0,
        "erasure_code_profile": "",
        "expected_num_objects": 0,
        "fast_read": false,
        "flags": 1,
        "flags_names": "hashpspool",
        "grade_table": [],
        "hit_set_count": 0,
        "hit_set_grade_decay_rate": 0,
        "hit_set_params": {
            "type": "none"
        },
        "hit_set_period": 0,
        "hit_set_search_last_n": 0,
        "last_change": "8",
        "last_force_op_resend": "0",
        "last_force_op_resend_preluminous": "0",
        "last_force_op_resend_prenautilus": "0",
        "last_pg_merge_meta": {
            "last_epoch_clean": 0,
            "last_epoch_started": 0,
            "ready_epoch": 0,
            "source_pgid": "0.0",
            "source_version": "0'0",
            "target_version": "0'0"
        },
        "min_read_recency_for_promote": 0,
        "min_size": 1,
        "min_write_recency_for_promote": 0,
        "object_hash": 2,
        "options": {
            "pg_autoscale_bias": 4.0,
            "pg_num_min": 16,
            "recovery_priority": 5
        },
        "pg_autoscale_mode": "warn",
        "pg_num": 8,
        "pg_num_pending": 8,
        "pg_num_target": 8,
        "pg_placement_num_target": 8,
        "pgp_num": 8,
        "pool": 3,
        "pool_name": "cephfs_metadata",
        "pool_snaps": [],
        "quota_max_bytes": 0,
        "quota_max_objects": 0,
        "read_tier": -1,
        "removed_snaps": "[]",
        "size": 1,
        "snap_epoch": 0,
        "snap_mode": "selfmanaged",
        "snap_seq": 0,
        "stripe_width": 0,
        "target_max_bytes": 0,
        "target_max_objects": 0,
        "tier_of": -1,
        "tiers": [],
        "type": 1,
        "use_gmt_hitset": true,
        "write_tier": -1
    },
    {
        "application_metadata": {
            "rgw": {}
        },
        "auid": 0,
        "cache_min_evict_age": 0,
        "cache_min_flush_age": 0,
        "cache_mode": "none",
        "cache_target_dirty_high_ratio_micro": 600000,
        "cache_target_dirty_ratio_micro": 400000,
        "cache_target_full_ratio_micro": 800000,
        "create_time": "2020-02-05 17:34:13.193509",
        "crush_rule": 0,
        "erasure_code_profile": "",
        "expected_num_objects": 0,
        "fast_read": false,
        "flags": 1,
        "flags_names": "hashpspool",
        "grade_table": [],
        "hit_set_count": 0,
        "hit_set_grade_decay_rate": 0,
        "hit_set_params": {
            "type": "none"
        },
        "hit_set_period": 0,
        "hit_set_search_last_n": 0,
        "last_change": "10",
        "last_force_op_resend": "0",
        "last_force_op_resend_preluminous": "0",
        "last_force_op_resend_prenautilus": "0",
        "last_pg_merge_meta": {
            "last_epoch_clean": 0,
            "last_epoch_started": 0,
            "ready_epoch": 0,
            "source_pgid": "0.0",
            "source_version": "0'0",
            "target_version": "0'0"
        },
        "min_read_recency_for_promote": 0,
        "min_size": 1,
        "min_write_recency_for_promote": 0,
        "object_hash": 2,
        "options": {},
        "pg_autoscale_mode": "warn",
        "pg_num": 8,
        "pg_num_pending": 8,
        "pg_num_target": 8,
        "pg_placement_num_target": 8,
        "pgp_num": 8,
        "pool": 4,
        "pool_name": ".rgw.root",
        "pool_snaps": [],
        "quota_max_bytes": 0,
        "quota_max_objects": 0,
        "read_tier": -1,
        "removed_snaps": "[]",
        "size": 1,
        "snap_epoch": 0,
        "snap_mode": "selfmanaged",
        "snap_seq": 0,
        "stripe_width": 0,
        "target_max_bytes": 0,
        "target_max_objects": 0,
        "tier_of": -1,
        "tiers": [],
        "type": 1,
        "use_gmt_hitset": true,
        "write_tier": -1
    },
    {
        "application_metadata": {
            "rgw": {}
        },
        "auid": 0,
        "cache_min_evict_age": 0,
        "cache_min_flush_age": 0,
        "cache_mode": "none",
        "cache_target_dirty_high_ratio_micro": 600000,
        "cache_target_dirty_ratio_micro": 400000,
        "cache_target_full_ratio_micro": 800000,
        "create_time": "2020-02-05 17:34:14.554436",
        "crush_rule": 0,
        "erasure_code_profile": "",
        "expected_num_objects": 0,
        "fast_read": false,
        "flags": 1,
        "flags_names": "hashpspool",
        "grade_table": [],
        "hit_set_count": 0,
        "hit_set_grade_decay_rate": 0,
        "hit_set_params": {
            "type": "none"
        },
        "hit_set_period": 0,
        "hit_set_search_last_n": 0,
        "last_change": "12",
        "last_force_op_resend": "0",
        "last_force_op_resend_preluminous": "0",
        "last_force_op_resend_prenautilus": "0",
        "last_pg_merge_meta": {
            "last_epoch_clean": 0,
            "last_epoch_started": 0,
            "ready_epoch": 0,
            "source_pgid": "0.0",
            "source_version": "0'0",
            "target_version": "0'0"
        },
        "min_read_recency_for_promote": 0,
        "min_size": 1,
        "min_write_recency_for_promote": 0,
        "object_hash": 2,
        "options": {},
        "pg_autoscale_mode": "warn",
        "pg_num": 8,
        "pg_num_pending": 8,
        "pg_num_target": 8,
        "pg_placement_num_target": 8,
        "pgp_num": 8,
        "pool": 5,
        "pool_name": "default.rgw.control",
        "pool_snaps": [],
        "quota_max_bytes": 0,
        "quota_max_objects": 0,
        "read_tier": -1,
        "removed_snaps": "[]",
        "size": 1,
        "snap_epoch": 0,
        "snap_mode": "selfmanaged",
        "snap_seq": 0,
        "stripe_width": 0,
        "target_max_bytes": 0,
        "target_max_objects": 0,
        "tier_of": -1,
        "tiers": [],
        "type": 1,
        "use_gmt_hitset": true,
        "write_tier": -1
    },
    {
        "application_metadata": {
            "rgw": {}
        },
        "auid": 0,
        "cache_min_evict_age": 0,
        "cache_min_flush_age": 0,
        "cache_mode": "none",
        "cache_target_dirty_high_ratio_micro": 600000,
        "cache_target_dirty_ratio_micro": 400000,
        "cache_target_full_ratio_micro": 800000,
        "create_time": "2020-02-05 17:34:16.544549",
        "crush_rule": 0,
        "erasure_code_profile": "",
        "expected_num_objects": 0,
        "fast_read": false,
        "flags": 1,
        "flags_names": "hashpspool",
        "grade_table": [],
        "hit_set_count": 0,
        "hit_set_grade_decay_rate": 0,
        "hit_set_params": {
            "type": "none"
        },
        "hit_set_period": 0,
        "hit_set_search_last_n": 0,
        "last_change": "14",
        "last_force_op_resend": "0",
        "last_force_op_resend_preluminous": "0",
        "last_force_op_resend_prenautilus": "0",
        "last_pg_merge_meta": {
            "last_epoch_clean": 0,
            "last_epoch_started": 0,
            "ready_epoch": 0,
            "source_pgid": "0.0",
            "source_version": "0'0",
            "target_version": "0'0"
        },
        "min_read_recency_for_promote": 0,
        "min_size": 1,
        "min_write_recency_for_promote": 0,
        "object_hash": 2,
        "options": {},
        "pg_autoscale_mode": "warn",
        "pg_num": 8,
        "pg_num_pending": 8,
        "pg_num_target": 8,
        "pg_placement_num_target": 8,
        "pgp_num": 8,
        "pool": 6,
        "pool_name": "default.rgw.meta",
        "pool_snaps": [],
        "quota_max_bytes": 0,
        "quota_max_objects": 0,
        "read_tier": -1,
        "removed_snaps": "[]",
        "size": 1,
        "snap_epoch": 0,
        "snap_mode": "selfmanaged",
        "snap_seq": 0,
        "stripe_width": 0,
        "target_max_bytes": 0,
        "target_max_objects": 0,
        "tier_of": -1,
        "tiers": [],
        "type": 1,
        "use_gmt_hitset": true,
        "write_tier": -1
    },
    {
        "application_metadata": {
            "rgw": {}
        },
        "auid": 0,
        "cache_min_evict_age": 0,
        "cache_min_flush_age": 0,
        "cache_mode": "none",
        "cache_target_dirty_high_ratio_micro": 600000,
        "cache_target_dirty_ratio_micro": 400000,
        "cache_target_full_ratio_micro": 800000,
        "create_time": "2020-02-05 17:34:18.505341",
        "crush_rule": 0,
        "erasure_code_profile": "",
        "expected_num_objects": 0,
        "fast_read": false,
        "flags": 1,
        "flags_names": "hashpspool",
        "grade_table": [],
        "hit_set_count": 0,
        "hit_set_grade_decay_rate": 0,
        "hit_set_params": {
            "type": "none"
        },
        "hit_set_period": 0,
        "hit_set_search_last_n": 0,
        "last_change": "16",
        "last_force_op_resend": "0",
        "last_force_op_resend_preluminous": "0",
        "last_force_op_resend_prenautilus": "0",
        "last_pg_merge_meta": {
            "last_epoch_clean": 0,
            "last_epoch_started": 0,
            "ready_epoch": 0,
            "source_pgid": "0.0",
            "source_version": "0'0",
            "target_version": "0'0"
        },
        "min_read_recency_for_promote": 0,
        "min_size": 1,
        "min_write_recency_for_promote": 0,
        "object_hash": 2,
        "options": {},
        "pg_autoscale_mode": "warn",
        "pg_num": 8,
        "pg_num_pending": 8,
        "pg_num_target": 8,
        "pg_placement_num_target": 8,
        "pgp_num": 8,
        "pool": 7,
        "pool_name": "default.rgw.log",
        "pool_snaps": [],
        "quota_max_bytes": 0,
        "quota_max_objects": 0,
        "read_tier": -1,
        "removed_snaps": "[]",
        "size": 1,
        "snap_epoch": 0,
        "snap_mode": "selfmanaged",
        "snap_seq": 0,
        "stripe_width": 0,
        "target_max_bytes": 0,
        "target_max_objects": 0,
        "tier_of": -1,
        "tiers": [],
        "type": 1,
        "use_gmt_hitset": true,
        "write_tier": -1
    },
    {
        "application_metadata": {
            "rgw": {}
        },
        "auid": 0,
        "cache_min_evict_age": 0,
        "cache_min_flush_age": 0,
        "cache_mode": "none",
        "cache_target_dirty_high_ratio_micro": 600000,
        "cache_target_dirty_ratio_micro": 400000,
        "cache_target_full_ratio_micro": 800000,
        "create_time": "2020-02-05 17:34:20.965857",
        "crush_rule": 0,
        "erasure_code_profile": "",
        "expected_num_objects": 0,
        "fast_read": false,
        "flags": 1,
        "flags_names": "hashpspool",
        "grade_table": [],
        "hit_set_count": 0,
        "hit_set_grade_decay_rate": 0,
        "hit_set_params": {
            "type": "none"
        },
        "hit_set_period": 0,
        "hit_set_search_last_n": 0,
        "last_change": "18",
        "last_force_op_resend": "0",
        "last_force_op_resend_preluminous": "0",
        "last_force_op_resend_prenautilus": "0",
        "last_pg_merge_meta": {
            "last_epoch_clean": 0,
            "last_epoch_started": 0,
            "ready_epoch": 0,
            "source_pgid": "0.0",
            "source_version": "0'0",
            "target_version": "0'0"
        },
        "min_read_recency_for_promote": 0,
        "min_size": 1,
        "min_write_recency_for_promote": 0,
        "object_hash": 2,
        "options": {},
        "pg_autoscale_mode": "warn",
        "pg_num": 8,
        "pg_num_pending": 8,
        "pg_num_target": 8,
        "pg_placement_num_target": 8,
        "pgp_num": 8,
        "pool": 8,
        "pool_name": "default.rgw.buckets.index",
        "pool_snaps": [],
        "quota_max_bytes": 0,
        "quota_max_objects": 0,
        "read_tier": -1,
        "removed_snaps": "[]",
        "size": 1,
        "snap_epoch": 0,
        "snap_mode": "selfmanaged",
        "snap_seq": 0,
        "stripe_width": 0,
        "target_max_bytes": 0,
        "target_max_objects": 0,
        "tier_of": -1,
        "tiers": [],
        "type": 1,
        "use_gmt_hitset": true,
        "write_tier": -1
    }
]

/server:

[
    {
        "ceph_version": "ceph version 14.2.7 (3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8) nautilus (stable)",
        "hostname": "",
        "services": [
            {
                "id": "14116",
                "type": "rbd-mirror"
            }
        ]
    },
    {
        "ceph_version": "ceph version 14.2.7 (3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8) nautilus (stable)",
        "hostname": "edcb751e8aa1",
        "services": [
            {
                "id": "demo",
                "type": "mds"
            },
            {
                "id": "edcb751e8aa1",
                "type": "mgr"
            },
            {
                "id": "edcb751e8aa1",
                "type": "mon"
            },
            {
                "id": "0",
                "type": "osd"
            },
            {
                "id": "edcb751e8aa1",
                "type": "rgw"
            },
            {
                "id": "edcb751e8aa1",
                "type": "rgw-nfs"
            }
        ]
    }
]

I'm afraid it might be hard for an end user to determine the cluster health state and available storage from these resources.
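
To make this concrete, a rough Python sketch of the few status-like values that can be derived from the /mon and /osd output above. The `summarize` helper and its output keys are hypothetical, and the input is trimmed to the fields it reads; note that nothing here yields used or available capacity.

```python
def summarize(mons, osds):
    """Reduce /mon and /osd responses to simple counts."""
    return {
        "mon_total": len(mons),
        "mon_in_quorum": sum(1 for m in mons if m.get("in_quorum")),
        "osd_total": len(osds),
        "osd_up": sum(1 for o in osds if o.get("up") == 1),
        "osd_in": sum(1 for o in osds if o.get("in") == 1),
    }


# Trimmed copies of the sample responses shown above.
sample_mons = [{"name": "edcb751e8aa1", "in_quorum": True, "leader": True, "rank": 0}]
sample_osds = [{"osd": 0, "up": 1, "in": 1, "state": ["exists", "up"]}]

if __name__ == "__main__":
    print(summarize(sample_mons, sample_osds))
```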

Apart from that, there is one resource that gives you valid (but also overly detailed) information: /perf.

Here is a sample:

{
    "mds.demo": {
        "mds.caps": {
            "description": "Capabilities",
            "nick": "caps",
            "priority": 8,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds.dir_commit": {
            "description": "Directory commit",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds.dir_fetch": {
            "description": "Directory fetch",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 12
        },
        "mds.dir_merge": {
            "description": "Directory merge",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds.dir_split": {
            "description": "Directory split",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds.exported_inodes": {
            "description": "Exported inodes",
            "nick": "exi",
            "priority": 8,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds.forward": {
            "description": "Forwarding request",
            "nick": "fwd",
            "priority": 8,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds.imported_inodes": {
            "description": "Imported inodes",
            "nick": "imi",
            "priority": 8,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds.inode_max": {
            "description": "Max inodes, cache size",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 2147483647
        },
        "mds.inodes": {
            "description": "Inodes",
            "nick": "inos",
            "priority": 10,
            "type": 2,
            "units": 1,
            "value": 10
        },
        "mds.inodes_expired": {
            "description": "Inodes expired",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds.inodes_pinned": {
            "description": "Inodes pinned",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 10
        },
        "mds.inodes_with_caps": {
            "description": "Inodes with capabilities",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds.load_cent": {
            "description": "Load per cent",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds.openino_dir_fetch": {
            "description": "OpenIno incomplete directory fetchings",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds.reply_latency": {
            "count": 0,
            "description": "Reply latency",
            "nick": "rlat",
            "priority": 10,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds.request": {
            "description": "Requests",
            "nick": "req",
            "priority": 10,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds.root_rbytes": {
            "description": "root inode rbytes",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds.root_rfiles": {
            "description": "root inode rfiles",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds.root_rsnaps": {
            "description": "root inode rsnaps",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds.subtrees": {
            "description": "Subtrees",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 2
        },
        "mds_cache.ireq_enqueue_scrub": {
            "description": "Internal Request type enqueue scrub",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_cache.ireq_exportdir": {
            "description": "Internal Request type export dir",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_cache.ireq_flush": {
            "description": "Internal Request type flush",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_cache.ireq_fragmentdir": {
            "description": "Internal Request type fragmentdir",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_cache.ireq_fragstats": {
            "description": "Internal Request type frag stats",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_cache.ireq_inodestats": {
            "description": "Internal Request type inode stats",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_cache.num_recovering_enqueued": {
            "description": "Files waiting for recovery",
            "nick": "recy",
            "priority": 8,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_cache.num_recovering_prioritized": {
            "description": "Files waiting for recovery with elevated priority",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_cache.num_recovering_processing": {
            "description": "Files currently being recovered",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_cache.num_strays": {
            "description": "Stray dentries",
            "nick": "stry",
            "priority": 8,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_cache.num_strays_delayed": {
            "description": "Stray dentries delayed",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_cache.num_strays_enqueuing": {
            "description": "Stray dentries enqueuing for purge",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_cache.recovery_completed": {
            "description": "File recoveries completed",
            "nick": "recd",
            "priority": 8,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_cache.recovery_started": {
            "description": "File recoveries started",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_cache.strays_created": {
            "description": "Stray dentries created",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_cache.strays_enqueued": {
            "description": "Stray dentries enqueued for purge",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_cache.strays_migrated": {
            "description": "Stray dentries migrated",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_cache.strays_reintegrated": {
            "description": "Stray dentries reintegrated",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_log.ev": {
            "description": "Events",
            "nick": "evts",
            "priority": 8,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_log.evadd": {
            "description": "Events submitted",
            "nick": "subm",
            "priority": 8,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_log.evex": {
            "description": "Total expired events",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_log.evexd": {
            "description": "Current expired events",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_log.evexg": {
            "description": "Expiring events",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_log.evtrm": {
            "description": "Trimmed events",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_log.jlat": {
            "count": 0,
            "description": "Journaler flush latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_log.replayed": {
            "description": "Events replayed",
            "nick": "repl",
            "priority": 8,
            "type": 10,
            "units": 1,
            "value": 1
        },
        "mds_log.seg": {
            "description": "Segments",
            "nick": "segs",
            "priority": 8,
            "type": 2,
            "units": 1,
            "value": 1
        },
        "mds_log.segadd": {
            "description": "Segments added",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_log.segex": {
            "description": "Total expired segments",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_log.segexd": {
            "description": "Current expired segments",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_log.segexg": {
            "description": "Expiring segments",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_log.segtrm": {
            "description": "Trimmed segments",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_mem.cap": {
            "description": "Capabilities",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_mem.cap+": {
            "description": "Capabilities added",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_mem.cap-": {
            "description": "Capabilities removed",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_mem.dir": {
            "description": "Directories",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 12
        },
        "mds_mem.dir+": {
            "description": "Directories opened",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 12
        },
        "mds_mem.dir-": {
            "description": "Directories closed",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_mem.dn": {
            "description": "Dentries",
            "nick": "dn",
            "priority": 8,
            "type": 2,
            "units": 1,
            "value": 10
        },
        "mds_mem.dn+": {
            "description": "Dentries opened",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 10
        },
        "mds_mem.dn-": {
            "description": "Dentries closed",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_mem.heap": {
            "description": "Heap size",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 332028
        },
        "mds_mem.ino": {
            "description": "Inodes",
            "nick": "ino",
            "priority": 8,
            "type": 2,
            "units": 1,
            "value": 13
        },
        "mds_mem.ino+": {
            "description": "Inodes opened",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 13
        },
        "mds_mem.ino-": {
            "description": "Inodes closed",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_server.cap_revoke_eviction": {
            "description": "Cap Revoke Client Eviction",
            "nick": "cre",
            "priority": 8,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_server.handle_client_request": {
            "description": "Client requests",
            "nick": "hcr",
            "priority": 8,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_server.handle_client_session": {
            "description": "Client session messages",
            "nick": "hcs",
            "priority": 8,
            "type": 10,
            "units": 1,
            "value": 40
        },
        "mds_server.handle_slave_request": {
            "description": "Slave requests",
            "nick": "hsr",
            "priority": 8,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_server.req_create_latency": {
            "count": 0,
            "description": "Request type create latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_getattr_latency": {
            "count": 0,
            "description": "Request type get attribute latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_getfilelock_latency": {
            "count": 0,
            "description": "Request type get file lock latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_link_latency": {
            "count": 0,
            "description": "Request type link latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_lookup_latency": {
            "count": 0,
            "description": "Request type lookup latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_lookuphash_latency": {
            "count": 0,
            "description": "Request type lookup hash of inode latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_lookupino_latency": {
            "count": 0,
            "description": "Request type lookup inode latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_lookupname_latency": {
            "count": 0,
            "description": "Request type lookup name latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_lookupparent_latency": {
            "count": 0,
            "description": "Request type lookup parent latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_lookupsnap_latency": {
            "count": 0,
            "description": "Request type lookup snapshot latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_lssnap_latency": {
            "count": 0,
            "description": "Request type list snapshot latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_mkdir_latency": {
            "count": 0,
            "description": "Request type make directory latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_mknod_latency": {
            "count": 0,
            "description": "Request type make node latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_mksnap_latency": {
            "count": 0,
            "description": "Request type make snapshot latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
...
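
A note on reading the dump above: the "type" field looks like a bitmask. The sketch below decodes it under the assumption that the flag values match Ceph's perf counter definitions (time = 0x1, u64 = 0x2, long-running average = 0x4, counter = 0x8). These values are not stated anywhere in this thread, so verify them against the Ceph release in use. The helper name describe_type is hypothetical.

```python
# Flag values are an ASSUMPTION based on Ceph's perf counter definitions;
# they are not confirmed in this thread.
PERFCOUNTER_TIME = 0x1
PERFCOUNTER_U64 = 0x2
PERFCOUNTER_LONGRUNAVG = 0x4
PERFCOUNTER_COUNTER = 0x8

FLAGS = [
    (PERFCOUNTER_TIME, "time"),
    (PERFCOUNTER_U64, "u64"),
    (PERFCOUNTER_LONGRUNAVG, "longrunavg"),
    (PERFCOUNTER_COUNTER, "counter"),
]


def describe_type(type_value):
    """Return the names of the flags set in a perf counter "type" value."""
    return [name for bit, name in FLAGS if type_value & bit]


# Three entries copied from the dump above (fields trimmed).
sample = {
    "mds_log.ev": {"type": 2, "value": 0},
    "mds_log.evadd": {"type": 10, "value": 0},
    "mds_log.jlat": {"count": 0, "type": 5, "value": 0},
}

for name, counter in sample.items():
    print(name, describe_type(counter["type"]))
```

Under that assumption, type 2 would be a plain gauge, type 10 a monotonically increasing counter, and type 5 a latency average (which would explain the extra "count" field on those entries).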

mtojek · 2020-02-06T11:56:31Z

Just updating the thread: after a discussion with @sorantis, we will go with the /request API resource, which internally runs the corresponding ceph command (e.g. ceph status, ceph df) and returns the same output.

Sample call/output:

>>> import requests
>>> command = 'df'
>>> requests.post('https://host:port/request?wait=1', json={'prefix': command, 'format': 'json'}, auth=("demo", "password")).json()
{u'waiting': [], u'has_failed': False, u'state': u'success', u'is_waiting': False, u'running': [], u'failed': [], u'finished': [{u'outb': u'{"stats":{"total_bytes":10737418240,"total_avail_bytes":9621471232,"total_used_bytes":42205184,"total_used_raw_bytes":1115947008,"total_used_raw_ratio":0.10393066704273224,"num_osds":1,"num_per_pool_osds":1},"stats_by_class":{},"pools":[{"name":"rbd","id":1,"stats":{"stored":0,"objects":0,"kb_used":0,"bytes_used":0,"percent_used":0,"max_avail":9084600320}},{"name":"cephfs_data","id":2,"stats":{"stored":0,"objects":0,"kb_used":0,"bytes_used":0,"percent_used":0,"max_avail":9084600320}},{"name":"cephfs_metadata","id":3,"stats":{"stored":2286,"objects":22,"kb_used":512,"bytes_used":524288,"percent_used":5.7708399253897369e-05,"max_avail":9084600320}},{"name":".rgw.root","id":4,"stats":{"stored":2398,"objects":6,"kb_used":384,"bytes_used":393216,"percent_used":4.3281925172777846e-05,"max_avail":9084600320}},{"name":"default.rgw.control","id":5,"stats":{"stored":0,"objects":8,"kb_used":0,"bytes_used":0,"percent_used":0,"max_avail":9084600320}},{"name":"default.rgw.meta","id":6,"stats":{"stored":1173,"objects":7,"kb_used":384,"bytes_used":393216,"percent_used":4.3281925172777846e-05,"max_avail":9084600320}},{"name":"default.rgw.log","id":7,"stats":{"stored":0,"objects":176,"kb_used":0,"bytes_used":0,"percent_used":0,"max_avail":9084600320}},{"name":"default.rgw.buckets.index","id":8,"stats":{"stored":0,"objects":2,"kb_used":0,"bytes_used":0,"percent_used":0,"max_avail":9084600320}},{"name":"default.rgw.buckets.data","id":9,"stats":{"stored":37122728,"objects":21,"kb_used":36480,"bytes_used":37355520,"percent_used":0.0040951217524707317,"max_avail":9084600320}},{"name":"default.rgw.buckets.non-ec","id":10,"stats":{"stored":0,"objects":0,"kb_used":0,"bytes_used":0,"percent_used":0,"max_avail":9084600320}}]}\n', u'outs': u'', u'command': u'df format=json'}], u'is_finished': True, u'id': u'140124650075600'}
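
In the sample response above, the command output arrives as a JSON string inside the outb field of each finished entry, so it has to be decoded a second time. A minimal sketch of that step follows, using a trimmed copy of the sample response. The helper name decode_finished and its error handling are illustrative assumptions, not the module's implementation.

```python
import json


def decode_finished(response):
    """Decode the JSON carried in "outb" for every finished command."""
    if response.get("has_failed"):
        raise RuntimeError("command failed: %r" % (response.get("failed"),))
    return [json.loads(item["outb"]) for item in response.get("finished", [])]


# Trimmed copy of the sample response shown above.
response = {
    "has_failed": False,
    "state": "success",
    "failed": [],
    "finished": [
        {
            "outb": '{"stats":{"total_bytes":10737418240,'
                    '"total_avail_bytes":9621471232},'
                    '"pools":[{"name":"rbd","id":1}]}\n',
            "outs": "",
            "command": "df format=json",
        }
    ],
    "is_finished": True,
}

df = decode_finished(response)[0]
print(df["stats"]["total_bytes"])
print([pool["name"] for pool in df["pools"]])
```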

mtojek · 2020-02-07T12:04:24Z

I'm working on the following metricsets (metricset ~ ceph command):

mgr_cluster_health ~ ceph status
mgr_cluster_disk ~ ceph df
mgr_osd_disk ~ ceph osd df
mgr_osd_pool_stats ~ ceph osd pool stats
mgr_osd_perf ~ ceph osd perf
mgr_osd_tree ~ ceph osd tree

The mgr prefix suggests that these metricsets are compatible with Ceph Manager Daemon (https://docs.ceph.com/docs/master/mgr/).
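
As a rough illustration of the mapping above, each metricset could be paired with the body it would POST to /request. This sketch assumes the prefix is simply the ceph command with the leading "ceph" dropped, which matches the 'df' sample earlier in the thread but is not confirmed here for the other commands. COMMANDS and build_payload are hypothetical names.

```python
# Metricset names and commands are taken from the list above.
# ASSUMPTION: the "prefix" is the CLI command without the leading "ceph".
COMMANDS = {
    "mgr_cluster_health": "status",
    "mgr_cluster_disk": "df",
    "mgr_osd_disk": "osd df",
    "mgr_osd_pool_stats": "osd pool stats",
    "mgr_osd_perf": "osd perf",
    "mgr_osd_tree": "osd tree",
}


def build_payload(metricset):
    """Build the JSON body for a /request call for the given metricset."""
    return {"prefix": COMMANDS[metricset], "format": "json"}


for metricset in sorted(COMMANDS):
    print(metricset, build_payload(metricset))
```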

mtojek · 2020-02-19T14:07:58Z

Module updated to use new API. PRs merged. Resolving.

toha70 · 2020-02-26T15:06:00Z

Hi @mtojek : I'm looking at the cherry-pick for #16254 and I can't find the changes for the mgr_osd_disk.
/go/src/github.com/elastic/beats/metricbeat/module/ceph# ls -lrt | grep mgr_
drwxr-xr-x 3 root root 137 Feb 26 14:57 mgr_cluster_disk
drwxr-xr-x 3 root root 125 Feb 26 14:57 mgr_osd_perf
drwxr-xr-x 3 root root 143 Feb 26 14:57 mgr_cluster_health
drwxr-xr-x 3 root root 143 Feb 26 14:57 mgr_osd_pool_stats
drwxr-xr-x 3 root root 128 Feb 26 14:57 mgr_pool_disk
drwxr-xr-x 3 root root 125 Feb 26 14:57 mgr_osd_tree

All the other metricsets are present except for mgr_osd_disk. Should we fall back to osd_df?

mtojek · 2020-02-26T15:10:06Z

Hi! It's renamed to mgr_pool_disk (#16254 (comment)).

toha70 · 2020-02-26T15:28:42Z

Thank you @mtojek. I must have missed this comment :).

epuertat · 2021-06-08T12:11:35Z

Hi folks, just so you know: at the Ceph project we're planning to deprecate the restful API you're relying on here soon.

The alternatives would either be the fine-grained Ceph Dashboard REST API (more of a management API, so probably not the best for you) or the Prometheus exporter (which gives you all the metrics in a single shot).

sorantis · 2021-06-08T12:20:07Z

@epuertat thanks for letting us know. We did consider Prometheus exporter earlier, but decided to stick to the native API capabilities. We'll need to revisit this. Which release are you planning to remove the restful API from?

epuertat · 2021-06-08T12:51:37Z

@sorantis: v17 (codenamed Quincy), to be released by the first half of 2022. Please let us know if you need any guidance on this.

sorantis · 2021-06-08T12:56:28Z

@epuertat good to know. Any plans to support the Prometheus endpoint natively? AFAIK, today the user has to enable the exporter manually via ceph mgr module enable prometheus.

cc @akshay-saraswat

epuertat · 2021-06-08T15:43:56Z

@sorantis, no plans to change that. The Prometheus exporter is embedded inside a Ceph service. It's probably the reference 'metrics agent' for the Ceph project (others are less maintained, like influx, telegraf, zabbix, ...).

The main downside I see there is that it only supports plain-text HTTP, but if you really need HTTPS, it wouldn't be that hard to get that change in [ceph-dashboard sample HTTPS Cherrypy config].

ruflin added enhancement module Metricbeat Metricbeat labels Jul 25, 2018

ruflin mentioned this issue Jul 25, 2018

Release the Ceph Metricbeat module as GA #7661

Closed

ruflin added the Team:Integrations Label for the Integrations team label Nov 21, 2018

alvarolobato mentioned this issue Dec 4, 2018

Metricbeat Ceph stats collection breaks for Ceph Mimic release #7429

Closed

exekias added the candidate Candidate to be added to the current iteration label Oct 30, 2019

sorantis mentioned this issue Nov 28, 2019

[Metricbeat] Ceph Module Compatibility #14845

Closed

andresrc added [zube]: Backlog Team:Services (Deprecated) Label for the former Integrations-Services team v7.7.0 [zube]: Inbox [zube]: Ready and removed [zube]: Backlog [zube]: Inbox labels Jan 27, 2020

mtojek self-assigned this Feb 4, 2020

mtojek added [zube]: In Progress and removed [zube]: Ready labels Feb 4, 2020

mtojek mentioned this issue Feb 5, 2020

[Metricbeat] Add Redis node and proxy #15269

Closed

10 tasks

This was referenced Feb 8, 2020

Fix: don't miss address scheme #16205

Merged

Cherry-pick #16205 to 7.x: Fix: don't miss address scheme #16241

Merged

[Metricbeat] Update Ceph module to support new API #16253

Closed

mtojek mentioned this issue Feb 11, 2020

[Metricbeat] Update Ceph module to support new API #16254

Merged

mtojek added [zube]: In Review and removed [zube]: In Progress labels Feb 14, 2020

mtojek mentioned this issue Feb 19, 2020

Cherry-pick #16254 to 7.x: [Metricbeat] Update Ceph module to support new API #16404

Merged

mtojek closed this as completed Feb 19, 2020

zube bot added [zube]: Done and removed [zube]: In Review labels Feb 19, 2020

andresrc removed the [zube]: Done label Feb 24, 2020

tchaikov mentioned this issue Jun 8, 2021

vstart.sh: disable restful by default ceph/ceph#41689

Merged

3 tasks
