
[1.0.0] No /dev/nvme0n1 and null nvme list on EL9.3 client with --enable-ha and --ana-reporting #459

Open
FredNass opened this issue Feb 23, 2024 · 3 comments

Comments

FredNass commented Feb 23, 2024

Hello,

To evaluate how ceph-nvmeof behaves in an HA setup, I've set up 2 GWs as follows:

ceph-nvmeof.conf of GW #1 test-lis04h02:

# This file is generated by cephadm.
[gateway]
name = client.nvmeof.nvmeof_pool01.test-lis04h02.baakhx
group = None
addr = 100.74.191.134
port = 5500
enable_auth = False
state_update_notify = True
state_update_interval_sec = 5
enable_spdk_discovery_controller = true
enable_prometheus_exporter = false

[discovery]
addr = 100.74.191.134
port = 8009

[ceph]
pool = nvmeof_pool01
config_file = /etc/ceph/ceph.conf
id = nvmeof.nvmeof_pool01.test-lis04h02.baakhx

#[mtls]
#server_key = ./server.key
#client_key = ./client.key
#server_cert = ./server.crt
#client_cert = ./client.crt

[spdk]
tgt_path = /usr/local/bin/nvmf_tgt
rpc_socket = /var/tmp/spdk.sock
timeout = 60
log_level = WARN
conn_retries = 10
transports = tcp
transport_tcp_options = {"in_capsule_data_size": 8192, "max_io_qpairs_per_ctrlr": 7}
tgt_cmd_extra_args = --cpumask=0xF

ceph-nvmeof.conf of GW #2 test-mom02h02:

# This file is generated by cephadm.
[gateway]
name = client.nvmeof.nvmeof_pool01.test-mom02h02.lcdzzc
group = None
addr = 100.74.191.130
port = 5500
enable_auth = False
state_update_notify = True
state_update_interval_sec = 5
enable_spdk_discovery_controller = true
enable_prometheus_exporter = false

[discovery]
addr = 100.74.191.130
port = 8009

[ceph]
pool = nvmeof_pool01
config_file = /etc/ceph/ceph.conf
id = nvmeof.nvmeof_pool01.test-mom02h02.lcdzzc

#[mtls]
#server_key = ./server.key
#client_key = ./client.key
#server_cert = ./server.crt
#client_cert = ./client.crt

[spdk]
tgt_path = /usr/local/bin/nvmf_tgt
rpc_socket = /var/tmp/spdk.sock
timeout = 60
log_level = WARN
conn_retries = 10
transports = tcp
transport_tcp_options = {"in_capsule_data_size": 8192, "max_io_qpairs_per_ctrlr": 7}
tgt_cmd_extra_args = --cpumask=0xF

Deleted the RADOS config object:

rados -p nvmeof_pool01 rm nvmeof.None.state
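
As a sanity check (assuming the same nvmeof.None.state object name as above), the removal can be confirmed by listing the pool's objects before restarting the gateways:

rados -p nvmeof_pool01 ls | grep nvmeof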

Started two 1.0.0 containers on nodes test-lis04h02 and test-mom02h02 with the options below:

/usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --authfile=/etc/ceph/podman-auth.json --net=host --init --name ceph-aa558815-042c-4fce-ac37-80c0255bf3c0-nvmeof-nvmeof_pool01-test-lis04h02-baakhx --pids-limit=-1 --ulimit memlock=-1:-1 --ulimit nofile=10240 --cap-add=SYS_ADMIN --cap-add=CAP_SYS_NICE --log-driver journald --conmon-pidfile /run/[email protected]_pool01.test-lis04h02.baakhx.service-pid --cidfile /run/[email protected]_pool01.test-lis04h02.baakhx.service-cid --cgroups=split -e CONTAINER_IMAGE=quay.io/ceph/nvmeof:1.0.0 -e NODE_NAME=test-lis04h02.peta.libe.dc.univ-lorraine.fr -e CEPH_USE_RANDOM_NONCE=1 -v /var/lib/ceph/aa558815-042c-4fce-ac37-80c0255bf3c0/nvmeof.nvmeof_pool01.test-lis04h02.baakhx/config:/etc/ceph/ceph.conf:z -v /var/lib/ceph/aa558815-042c-4fce-ac37-80c0255bf3c0/nvmeof.nvmeof_pool01.test-lis04h02.baakhx/keyring:/etc/ceph/keyring:z -v /var/lib/ceph/aa558815-042c-4fce-ac37-80c0255bf3c0/nvmeof.nvmeof_pool01.test-lis04h02.baakhx/ceph-nvmeof.conf:/src/ceph-nvmeof.conf:z -v /var/lib/ceph/aa558815-042c-4fce-ac37-80c0255bf3c0/nvmeof.nvmeof_pool01.test-lis04h02.baakhx/configfs:/sys/kernel/config -v /dev/hugepages:/dev/hugepages -v /dev/vfio/vfio:/dev/vfio/vfio -v /etc/hosts:/etc/hosts:ro --mount type=bind,source=/lib/modules,destination=/lib/modules,ro=true quay.io/ceph/nvmeof:1.0.0
/usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --authfile=/etc/ceph/podman-auth.json --net=host --init --name ceph-aa558815-042c-4fce-ac37-80c0255bf3c0-nvmeof-nvmeof_pool01-test-mom02h02-lcdzzc --pids-limit=-1 --ulimit memlock=-1:-1 --ulimit nofile=10240 --cap-add=SYS_ADMIN --cap-add=CAP_SYS_NICE --log-driver journald --conmon-pidfile /run/[email protected]_pool01.test-mom02h02.lcdzzc.service-pid --cidfile /run/[email protected]_pool01.test-mom02h02.lcdzzc.service-cid --cgroups=split -e CONTAINER_IMAGE=quay.io/ceph/nvmeof:1.0.0 -e NODE_NAME=test-mom02h02.peta.montet.dc.univ-lorraine.fr -e CEPH_USE_RANDOM_NONCE=1 -v /var/lib/ceph/aa558815-042c-4fce-ac37-80c0255bf3c0/nvmeof.nvmeof_pool01.test-mom02h02.lcdzzc/config:/etc/ceph/ceph.conf:z -v /var/lib/ceph/aa558815-042c-4fce-ac37-80c0255bf3c0/nvmeof.nvmeof_pool01.test-mom02h02.lcdzzc/keyring:/etc/ceph/keyring:z -v /var/lib/ceph/aa558815-042c-4fce-ac37-80c0255bf3c0/nvmeof.nvmeof_pool01.test-mom02h02.lcdzzc/ceph-nvmeof.conf:/src/ceph-nvmeof.conf:z -v /var/lib/ceph/aa558815-042c-4fce-ac37-80c0255bf3c0/nvmeof.nvmeof_pool01.test-mom02h02.lcdzzc/configfs:/sys/kernel/config -v /dev/hugepages:/dev/hugepages -v /dev/vfio/vfio:/dev/vfio/vfio -v /etc/hosts:/etc/hosts:ro --mount type=bind,source=/lib/modules,destination=/lib/modules,ro=true quay.io/ceph/nvmeof:1.0.0

And configured the GWs:

podman run -it quay.io/ceph/nvmeof-cli:1.0.0 --server-address 100.74.191.134 --server-port 5500 subsystem add --subsystem "nqn.2024-02.io.spdk:peta" --enable-ha --ana-reporting
Adding subsystem nqn.2024-02.io.spdk:peta: Successful

podman run -it quay.io/ceph/nvmeof-cli:1.0.0 --server-address 100.74.191.134 --server-port 5500 namespace add --subsystem "nqn.2024-02.io.spdk:peta" --rbd-pool nvmeof_pool01 --rbd-image image-1 --size 3G --rbd-create-image --nsid 1 --load-balancing-group 1
Adding namespace 1 to nqn.2024-02.io.spdk:peta, load balancing group 1: Successful

podman run -it quay.io/ceph/nvmeof-cli:1.0.0 --server-address 100.74.191.134 --server-port 5500 listener add --subsystem "nqn.2024-02.io.spdk:peta" --gateway-name client.nvmeof.nvmeof_pool01.test-lis04h02.baakhx --traddr 100.74.191.134 --trsvcid 4420
Adding nqn.2024-02.io.spdk:peta listener at 100.74.191.134:4420: Successful

podman run -it quay.io/ceph/nvmeof-cli:1.0.0 --server-address 100.74.191.130 --server-port 5500 listener add --subsystem "nqn.2024-02.io.spdk:peta" --gateway-name client.nvmeof.nvmeof_pool01.test-mom02h02.lcdzzc --traddr 100.74.191.130 --trsvcid 4420
Adding nqn.2024-02.io.spdk:peta listener at 100.74.191.130:4420: Successful

podman run -it quay.io/ceph/nvmeof-cli:1.0.0 --server-address 100.74.191.130 --server-port 5500 host add --subsystem "nqn.2024-02.io.spdk:peta" --host "*"
Allowing open host access to nqn.2024-02.io.spdk:peta: Successful
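
(Side note: the listeners could presumably also be cross-checked from each gateway with the CLI's listener list subcommand, along the lines below; I haven't verified the exact flag names of that subcommand in the 1.0.0 CLI, so treat this as a sketch.)

podman run -it quay.io/ceph/nvmeof-cli:1.0.0 --server-address 100.74.191.134 --server-port 5500 listener list --subsystem "nqn.2024-02.io.spdk:peta"
podman run -it quay.io/ceph/nvmeof-cli:1.0.0 --server-address 100.74.191.130 --server-port 5500 listener list --subsystem "nqn.2024-02.io.spdk:peta"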

Subsystem and namespace configuration check:

$ podman run -it quay.io/ceph/nvmeof-cli:1.0.0 --server-address 100.74.191.130 --server-port 5500 subsystem list
Subsystems:
╒═══════════╤══════════════════════════════════════╤════════════╤════════════════════╤══════════════════════╤══════════════════╤═════════════╕
│ Subtype   │ NQN                                  │ HA State   │ Serial Number      │ Model Number         │ Controller IDs   │   Namespace │
│           │                                      │            │                    │                      │                  │       Count │
╞═══════════╪══════════════════════════════════════╪════════════╪════════════════════╪══════════════════════╪══════════════════╪═════════════╡
│ Discovery │ nqn.2014-08.org.nvmexpress.discovery │ disabled   │                    │                      │ 0-0              │           0 │
├───────────┼──────────────────────────────────────┼────────────┼────────────────────┼──────────────────────┼──────────────────┼─────────────┤
│ NVMe      │ nqn.2024-02.io.spdk:peta             │ enabled    │ SPDK77352934285842 │ SPDK bdev Controller │ 1-65519          │           1 │
╘═══════════╧══════════════════════════════════════╧════════════╧════════════════════╧══════════════════════╧══════════════════╧═════════════╛

$ podman run -it quay.io/ceph/nvmeof-cli:1.0.0 --server-address 100.74.191.134 --server-port 5500 subsystem list
Subsystems:
╒═══════════╤══════════════════════════════════════╤════════════╤════════════════════╤══════════════════════╤══════════════════╤═════════════╕
│ Subtype   │ NQN                                  │ HA State   │ Serial Number      │ Model Number         │ Controller IDs   │   Namespace │
│           │                                      │            │                    │                      │                  │       Count │
╞═══════════╪══════════════════════════════════════╪════════════╪════════════════════╪══════════════════════╪══════════════════╪═════════════╡
│ Discovery │ nqn.2014-08.org.nvmexpress.discovery │ disabled   │                    │                      │ 0-0              │           0 │
├───────────┼──────────────────────────────────────┼────────────┼────────────────────┼──────────────────────┼──────────────────┼─────────────┤
│ NVMe      │ nqn.2024-02.io.spdk:peta             │ enabled    │ SPDK77352934285842 │ SPDK bdev Controller │ 1-65519          │           1 │
╘═══════════╧══════════════════════════════════════╧════════════╧════════════════════╧══════════════════════╧══════════════════╧═════════════╛

$ podman run -it quay.io/ceph/nvmeof-cli:1.0.0 --server-address 100.74.191.130 --server-port 5500 namespace list -n "nqn.2024-02.io.spdk:peta" 
Namespaces in subsystem nqn.2024-02.io.spdk:peta:
╒════════╤════════════════════════╤═══════════════╤═════════╤═════════╤═════════╤═════════════════════╤═════════════╤═══════════╤═══════════╤════════════╤═════════════╕
│   NSID │ Bdev                   │ RBD           │ RBD     │ Image   │ Block   │ UUID                │        Load │ R/W IOs   │ R/W MBs   │ Read MBs   │ Write MBs   │
│        │ Name                   │ Pool          │ Image   │ Size    │ Size    │                     │   Balancing │ per       │ per       │ per        │ per         │
│        │                        │               │         │         │         │                     │       Group │ second    │ second    │ second     │ second      │
╞════════╪════════════════════════╪═══════════════╪═════════╪═════════╪═════════╪═════════════════════╪═════════════╪═══════════╪═══════════╪════════════╪═════════════╡
│      1 │ bdev_96ae08eb-fa83-    │ nvmeof_pool01 │ image-1 │ 3 GiB   │ 512 B   │ 96ae08eb-fa83-4fd8- │           1 │ unlimited │ unlimited │ unlimited  │ unlimited   │
│        │ 4fd8-8f82-2e9fee187e91 │               │         │         │         │ 8f82-2e9fee187e91   │             │           │           │            │             │
╘════════╧════════════════════════╧═══════════════╧═════════╧═════════╧═════════╧═════════════════════╧═════════════╧═══════════╧═══════════╧════════════╧═════════════╛

What I observe: /dev/nvme0n1 is not created on the EL 9.3 client (test-mom02h01).

$ nvme discover -t tcp -a 100.74.191.134 -s 4420

Discovery Log Number of Records 1, Generation counter 2
=====Discovery Log Entry 0======
trtype:  tcp
adrfam:  ipv4
subtype: nvme subsystem
treq:    not required
portid:  0
trsvcid: 4420
subnqn:  nqn.2024-02.io.spdk:peta
traddr:  100.74.191.134
eflags:  none
sectype: none

$ nvme connect -t tcp -a 100.74.191.134 -n nqn.2024-02.io.spdk:peta

$ nvme list
Node                  Generic               SN                   Model                                    Namespace  Usage                      Format           FW Rev  
--------------------- --------------------- -------------------- ---------------------------------------- ---------- -------------------------- ---------------- --------

$ nvme list -v 
Subsystem        Subsystem-NQN                                                                                    Controllers
---------------- ------------------------------------------------------------------------------------------------ ----------------
nvme-subsys0     nqn.2024-02.io.spdk:peta                                                                         nvme0

Device   SN                   MN                                       FR       TxPort Address        Subsystem    Namespaces      
-------- -------------------- ---------------------------------------- -------- ------ -------------- ------------ ----------------
nvme0    SPDK77352934285842   SPDK bdev Controller                     23.01.1  tcp    traddr=100.74.191.134,trsvcid=4420,src_addr=100.74.191.129 nvme-subsys0 

Device       Generic      NSID       Usage                      Format           Controllers     
------------ ------------ ---------- -------------------------- ---------------- ----------------

$ ls -alh /dev/nvme*
crw------- 1 root root 241,   0 Feb 23 11:03 /dev/nvme0
crw------- 1 root root  10, 123 Feb 23 10:21 /dev/nvme-fabrics

$ fdisk -l /dev/nvme0
fdisk: cannot open /dev/nvme0: Seek not permitted

$ dmesg
[...]
[61797.892756] nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 100.74.191.134:4420
[61797.895072] nvme nvme0: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"
[61802.258173] nvme nvme0: creating 4 I/O queues.
[61802.297181] nvme nvme0: mapped 4/0/0 default/read/poll queues.
[61802.299326] nvme nvme0: new ctrl: NQN "nqn.2024-02.io.spdk:peta", addr 100.74.191.134:4420

/dev/nvme0n1 is not created on the client.
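
A possible next diagnostic step on the client is to check whether the controller reports the namespace at all and whether ANA/multipath support is advertised (the sysfs path in the last command is an assumption based on the kernel's native NVMe multipath per-path naming and may not exist on every setup):

$ nvme list-ns /dev/nvme0                               # active namespace IDs reported by the controller
$ nvme id-ctrl /dev/nvme0 | grep -iE 'cmic|anacap'      # multipath / ANA capability bits
$ grep . /sys/class/nvme/nvme0/nvme0c*n*/ana_state      # per-path ANA state, if exposed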

What I expect: /dev/nvme0n1 to be created and usable on the client.

Note: reproducing these exact commands without the --enable-ha and --ana-reporting options when creating the subsystem results in /dev/nvme0n1 being created on the client as expected.
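
In other words, the same subsystem creation command with those two flags dropped works, i.e. something like:

podman run -it quay.io/ceph/nvmeof-cli:1.0.0 --server-address 100.74.191.134 --server-port 5500 subsystem add --subsystem "nqn.2024-02.io.spdk:peta"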

caroav (Collaborator) commented Feb 25, 2024

@FredNass you are correct. The auto HA mode is still not functional; it is waiting for the nvmeof monitor code to be merged into Ceph. You can see it here.
The temporary way to use auto HA now is to use a special Ceph build, with a SHA that we update when needed here (CEPH_SHA=).

FredNass (Author) commented

Thank you @caroav. Are there any publicly available containers associated with this special build (ceph-19.0.0-1365.g4b2ae236) that one could use for testing purposes? By changing the container_image setting of a test cluster, for example?

ceph config set global container_image quay.io/ceph/ceph@sha256:<CEPH_SHA>

caroav (Collaborator) commented Feb 28, 2024

@FredNass I think it makes more sense to wait until we merge it upstream. It should be very close, a few days I believe.
