
Load balancing group of namespace gets assigned to a Gateway that is not part of the group in which the namespace is added #1034

Open
rahullepakshi opened this issue Jan 15, 2025 · 1 comment
Labels: bug (Something isn't working)

@rahullepakshi (Contributor)

I'm seeing a strange but major issue where the nvme-gw show command lists a gateway that is not part of the group specified in the command.

Below, node7 is part of group2, but the show command lists it under group1, and namespaces added on a gateway in group1 take the load balancing group of node7, which actually belongs to group2.

# ceph nvme-gw show nvmeof_pool group1
{
    "epoch": 362,
    "pool": "nvmeof_pool",
    "group": "group1",
    "features": "LB",
    "rebalance_ana_group": 5,
    "num gws": 5,
    "Anagrp list": "[ 1 2 3 4 5 ]",
    "num-namespaces": 6,
    "Created Gateways:": [
        {
            "gw-id": "client.nvmeof.nvmeof_pool.group1.ceph-rlepaksh-8-1-ZKVGK0-node3.eubjtw",
            "anagrp-id": 1,
            "num-namespaces": 1,
            "performed-full-startup": 1,
            "Availability": "AVAILABLE",
            "num-listeners": 2,
            "ana states": " 1: ACTIVE ,  2: ACTIVE ,  3: STANDBY ,  4: STANDBY ,  5: ACTIVE "
        },
        {
            "gw-id": "client.nvmeof.nvmeof_pool.group1.ceph-rlepaksh-8-1-ZKVGK0-node4.krmbra",
            "anagrp-id": 2,
            "num-namespaces": 1,
            "performed-full-startup": 0,
            "Availability": "UNAVAILABLE",
            "ana states": " 1: STANDBY ,  2: STANDBY ,  3: STANDBY ,  4: STANDBY ,  5: STANDBY "
        },
        {
            "gw-id": "client.nvmeof.nvmeof_pool.group1.ceph-rlepaksh-8-1-ZKVGK0-node5.javqon",
            "anagrp-id": 3,
            "num-namespaces": 1,
            "performed-full-startup": 1,
            "Availability": "AVAILABLE",
            "num-listeners": 2,
            "ana states": " 1: STANDBY ,  2: STANDBY ,  3: ACTIVE ,  4: STANDBY ,  5: STANDBY "
        },
        {
            "gw-id": "client.nvmeof.nvmeof_pool.group1.ceph-rlepaksh-8-1-ZKVGK0-node6.gspnwl",
            "anagrp-id": 4,
            "num-namespaces": 1,
            "performed-full-startup": 1,
            "Availability": "AVAILABLE",
            "num-listeners": 2,
            "ana states": " 1: STANDBY ,  2: STANDBY ,  3: STANDBY ,  4: ACTIVE ,  5: STANDBY "
        },
        {
            "gw-id": "client.nvmeof.nvmeof_pool.group2.ceph-rlepaksh-8-1-ZKVGK0-node7.uuwgjy",
            "anagrp-id": 5,
            "num-namespaces": 2,
            "performed-full-startup": 1,
            "Availability": "CREATED",
            "ana states": " 1: STANDBY ,  2: STANDBY ,  3: STANDBY ,  4: STANDBY ,  5: STANDBY "
        }
    ]
}

# ceph orch ps | grep nvmeof
nvmeof.nvmeof_pool.group1.ceph-rlepaksh-8-1-ZKVGK0-node3.eubjtw  ceph-rlepaksh-8-1-ZKVGK0-node3            *:5500,4420,8009  running (8d)      7m ago   8d     182M        -  1.4.2                  a09f7e144ac2  ddf9ee88cbd6
nvmeof.nvmeof_pool.group1.ceph-rlepaksh-8-1-ZKVGK0-node4.krmbra  ceph-rlepaksh-8-1-ZKVGK0-node4            *:5500,4420,8009  running (4d)     72s ago   8d     170M        -  1.4.2                  a09f7e144ac2  cbcd128e26a1
nvmeof.nvmeof_pool.group1.ceph-rlepaksh-8-1-ZKVGK0-node5.javqon  ceph-rlepaksh-8-1-ZKVGK0-node5            *:5500,4420,8009  running (9h)      6m ago   8d     162M        -  1.4.2                  a09f7e144ac2  1719b34901b2
nvmeof.nvmeof_pool.group1.ceph-rlepaksh-8-1-ZKVGK0-node6.gspnwl  ceph-rlepaksh-8-1-ZKVGK0-node6            *:5500,4420,8009  running (23h)    71s ago   8d     169M        -  1.4.2                  a09f7e144ac2  540178485ce0
nvmeof.nvmeof_pool.group2.ceph-rlepaksh-8-1-ZKVGK0-node7.uuwgjy  ceph-rlepaksh-8-1-ZKVGK0-node7            *:5500,4420,8009  running (16h)    72s ago  16h     219M        -  1.4.2                  a09f7e144ac2  c7ba1dbf96c4
[ceph: root@ceph-rlepaksh-8-1-zkvgk0-node1-installer /]# ceph orch ls | grep nvmeof
nvmeof.nvmeof_pool.group1  ?:4420,5500,8009      4/4  7m ago     8d   ceph-rlepaksh-8-1-ZKVGK0-node3;ceph-rlepaksh-8-1-ZKVGK0-node4;ceph-rlepaksh-8-1-ZKVGK0-node5;ceph-rlepaksh-8-1-ZKVGK0-node6
nvmeof.nvmeof_pool.group2  ?:4420,5500,8009      1/1  77s ago    16h  ceph-rlepaksh-8-1-ZKVGK0-node7
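
For comparison, the same show command can be run against group2 to check whether node7 is also (or instead) listed there. I have not captured that output here; this is just the cross-check command:

# ceph nvme-gw show nvmeof_pool group2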

# podman run -it quay.io/ceph/nvmeof-cli:latest --server-address 10.0.64.80 namespace list -n nqn.2016-06.io.spdk:cnode1.group1
Namespaces in subsystem nqn.2016-06.io.spdk:cnode1.group1:
╒════════╤════════════════════════╤════════════╤═════════╤═══════════╤═════════════════════╤═════════════╤══════════════╤═══════════╤═══════════╤════════════╤═════════════╕
│   NSID │ Bdev                   │ RBD        │ Image   │ Block     │ UUID                │        Load │ Visibility   │ R/W IOs   │ R/W MBs   │ Read MBs   │ Write MBs   │
│        │ Name                   │ Image      │ Size    │ Size      │                     │   Balancing │              │ per       │ per       │ per        │ per         │
│        │                        │            │         │           │                     │       Group │              │ second    │ second    │ second     │ second      │
╞════════╪════════════════════════╪════════════╪═════════╪═══════════╪═════════════════════╪═════════════╪══════════════╪═══════════╪═══════════╪════════════╪═════════════╡
│      1 │ bdev_b58e4188-9070-    │ rbd/image0 │ 1 TiB   │ 512 Bytes │ b58e4188-9070-40d4- │           1 │ All Hosts    │ unset     │ unset     │ unset      │ unset       │
│        │ 40d4-8c4a-d37056f1e6d9 │            │         │           │ 8c4a-d37056f1e6d9   │             │              │           │           │            │             │
├────────┼────────────────────────┼────────────┼─────────┼───────────┼─────────────────────┼─────────────┼──────────────┼───────────┼───────────┼────────────┼─────────────┤
│      2 │ bdev_c376f74c-6924-    │ rbd/image1 │ 1 TiB   │ 512 Bytes │ c376f74c-6924-4b78- │           2 │ All Hosts    │ unset     │ unset     │ unset      │ unset       │
│        │ 4b78-bb32-13b16ee0cb10 │            │         │           │ bb32-13b16ee0cb10   │             │              │           │           │            │             │
├────────┼────────────────────────┼────────────┼─────────┼───────────┼─────────────────────┼─────────────┼──────────────┼───────────┼───────────┼────────────┼─────────────┤
│      3 │ bdev_b9a64e8a-2dae-    │ rbd/image2 │ 1 TiB   │ 512 Bytes │ b9a64e8a-2dae-487e- │           3 │ All Hosts    │ unset     │ unset     │ unset      │ unset       │
│        │ 487e-96c7-112534761544 │            │         │           │ 96c7-112534761544   │             │              │           │           │            │             │
├────────┼────────────────────────┼────────────┼─────────┼───────────┼─────────────────────┼─────────────┼──────────────┼───────────┼───────────┼────────────┼─────────────┤
│      4 │ bdev_395abf33-097f-    │ rbd/image3 │ 1 TiB   │ 512 Bytes │ 395abf33-097f-4138- │           4 │ All Hosts    │ unset     │ unset     │ unset      │ unset       │
│        │ 4138-a0a8-ec0eda9af506 │            │         │           │ a0a8-ec0eda9af506   │             │              │           │           │            │             │
├────────┼────────────────────────┼────────────┼─────────┼───────────┼─────────────────────┼─────────────┼──────────────┼───────────┼───────────┼────────────┼─────────────┤
│      5 │ bdev_cb68c737-b180-    │ rbd/image4 │ 1 TiB   │ 512 Bytes │ cb68c737-b180-4642- │           5 │ All Hosts    │ unset     │ unset     │ unset      │ unset       │
│        │ 4642-868c-337e65b80e1b │            │         │           │ 868c-337e65b80e1b   │             │              │           │           │            │             │
├────────┼────────────────────────┼────────────┼─────────┼───────────┼─────────────────────┼─────────────┼──────────────┼───────────┼───────────┼────────────┼─────────────┤
│      6 │ bdev_24939ab7-138f-    │ rbd/image5 │ 1 TiB   │ 512 Bytes │ 24939ab7-138f-4d41- │           5 │ All Hosts    │ unset     │ unset     │ unset      │ unset       │
│        │ 4d41-877c-ca8ad3f002a6 │            │         │           │ 877c-ca8ad3f002a6   │             │              │           │           │            │             │
╘════════╧════════════════════════╧════════════╧═════════╧═══════════╧═════════════════════╧═════════════╧══════════════╧═══════════╧═══════════╧════════════╧═════════════╛
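
As a possible manual workaround (not verified on this setup, and assuming the nvmeof-cli "namespace change_load_balancing_group" subcommand and these flag names are correct), the affected namespaces could in principle be moved back to an ANA group owned by a group1 gateway, for example:

# podman run -it quay.io/ceph/nvmeof-cli:latest --server-address 10.0.64.80 namespace change_load_balancing_group --subsystem nqn.2016-06.io.spdk:cnode1.group1 --nsid 5 --load-balancing-group 1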


@rahullepakshi (Contributor, Author)

Spec file

# ceph orch ls nvmeof --export
service_type: nvmeof
service_id: nvmeof_pool.group1
service_name: nvmeof.nvmeof_pool.group1
placement:
  hosts:
  - ceph-rlepaksh-8-1-ZKVGK0-node3
  - ceph-rlepaksh-8-1-ZKVGK0-node4
  - ceph-rlepaksh-8-1-ZKVGK0-node5
  - ceph-rlepaksh-8-1-ZKVGK0-node6
spec:
  allowed_consecutive_spdk_ping_failures: 1
  bdevs_per_cluster: 32
  conn_retries: 10
  discovery_port: 8009
  enable_key_encryption: true
  enable_monitor_client: true
  enable_prometheus_exporter: true
  group: group1
  log_directory: /var/log/ceph/
  log_files_enabled: true
  log_files_rotation_enabled: true
  log_level: INFO
  max_gws_in_grp: 16
  max_hosts_per_namespace: 8
  max_hosts_per_subsystem: 32
  max_log_directory_backups: 10
  max_log_file_size_in_mb: 10
  max_log_files_count: 20
  max_namespaces: 1024
  max_namespaces_per_subsystem: 256
  max_namespaces_with_netmask: 1000
  max_ns_to_change_lb_grp: 8
  max_subsystems: 128
  monitor_timeout: 1.0
  omap_file_lock_duration: 20
  omap_file_lock_retries: 30
  omap_file_lock_retry_sleep_interval: 1.0
  omap_file_update_reloads: 10
  pool: nvmeof_pool
  port: 5500
  prometheus_port: 10008
  prometheus_stats_interval: 10
  rebalance_period_sec: 7
  rpc_socket_dir: /var/tmp/
  rpc_socket_name: spdk.sock
  spdk_path: /usr/local/bin/nvmf_tgt
  spdk_ping_interval_in_seconds: 2.0
  spdk_protocol_log_level: WARNING
  spdk_timeout: 60.0
  state_update_interval_sec: 5
  state_update_notify: true
  tgt_path: /usr/local/bin/nvmf_tgt
  transport_tcp_options:
    in_capsule_data_size: 8192
    max_io_qpairs_per_ctrlr: 7
  transports: tcp
  verbose_log_messages: true
  verify_keys: true
  verify_nqns: true
---
service_type: nvmeof
service_id: nvmeof_pool.group2
service_name: nvmeof.nvmeof_pool.group2
placement:
  hosts:
  - ceph-rlepaksh-8-1-ZKVGK0-node7
spec:
  allowed_consecutive_spdk_ping_failures: 1
  bdevs_per_cluster: 32
  conn_retries: 10
  discovery_port: 8009
  enable_key_encryption: true
  enable_monitor_client: true
  enable_prometheus_exporter: true
  group: group1
  log_directory: /var/log/ceph/
  log_files_enabled: true
  log_files_rotation_enabled: true
  log_level: INFO
  max_gws_in_grp: 16
  max_hosts_per_namespace: 8
  max_hosts_per_subsystem: 32
  max_log_directory_backups: 10
  max_log_file_size_in_mb: 10
  max_log_files_count: 20
  max_namespaces: 1024
  max_namespaces_per_subsystem: 256
  max_namespaces_with_netmask: 1000
  max_ns_to_change_lb_grp: 8
  max_subsystems: 128
  monitor_timeout: 1.0
  omap_file_lock_duration: 20
  omap_file_lock_retries: 30
  omap_file_lock_retry_sleep_interval: 1.0
  omap_file_update_reloads: 10
  pool: nvmeof_pool
  port: 5500
  prometheus_port: 10008
  prometheus_stats_interval: 10
  rebalance_period_sec: 7
  rpc_socket_dir: /var/tmp/
  rpc_socket_name: spdk.sock
  spdk_mem_size: 4096
  spdk_path: /usr/local/bin/nvmf_tgt
  spdk_ping_interval_in_seconds: 2.0
  spdk_protocol_log_level: WARNING
  spdk_timeout: 60.0
  state_update_interval_sec: 5
  state_update_notify: true
  tgt_path: /usr/local/bin/nvmf_tgt
  transport_tcp_options:
    in_capsule_data_size: 8192
    max_io_qpairs_per_ctrlr: 7
  transports: tcp
  verbose_log_messages: true
  verify_keys: true
  verify_nqns: true
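
One detail in the export above that may be relevant: the nvmeof.nvmeof_pool.group2 service spec still carries "group: group1". I am not sure whether that is the trigger for the misplacement or itself a symptom of it, but if the intent is for that service to serve group2, the fragment would presumably need to look like the following (hypothetical correction, shown only for comparison), re-applied with ceph orch apply -i <edited-spec.yaml>:

service_type: nvmeof
service_id: nvmeof_pool.group2
service_name: nvmeof.nvmeof_pool.group2
spec:
  group: group2
  pool: nvmeof_pool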

