Containerized SSH-less provisioner #134

Open
jp39 opened this issue Sep 16, 2024 · 7 comments

Comments

@jp39

jp39 commented Sep 16, 2024

Hi,

The documentation states:

Making a container image and creating ZFS datasets from a container is not exactly easy, as ZFS runs in kernel. While it's possible to pass /dev/zfs to a container so it can create and destroy datasets within the container, sharing the volume with NFS does not work.

Setting the sharenfs property to anything other than off invokes exportfs(8), which also requires the NFS server to be running so it can reload its exports. That is not the case in a container (see zfs(8)).

But most importantly: mounting /dev/zfs inside the provisioner container would mean that datasets can only be created on the same host the container currently runs on.

So, in order to "break out" of the container, the zfs calls are wrapped and redirected to another host over SSH. This requires SSH private keys to be mounted in the container for an SSH user with sufficient permissions to run zfs commands on the target host.
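
For illustration, here is a minimal Go sketch of what that SSH redirection could look like; the host, user, and helper names below are hypothetical and not the provisioner's actual code:

package main

import (
    "fmt"
    "os/exec"
)

// zfsHost is a hypothetical SSH target with permission to run zfs on the ZFS node.
const zfsHost = "zfs-admin@zfsnode"

// runZfs executes a zfs subcommand on the remote host over SSH instead of
// calling the local zfs binary, e.g. runZfs("destroy", "tank/kubernetes/pvc-x").
func runZfs(args ...string) error {
    cmd := exec.Command("ssh", append([]string{zfsHost, "zfs"}, args...)...)
    out, err := cmd.CombinedOutput()
    if err != nil {
        return fmt.Errorf("remote zfs %v failed: %v: %s", args, err, out)
    }
    return nil
}

func main() {
    // Create a dataset on the remote host and share it over NFS.
    if err := runZfs("create", "-o", "sharenfs=rw", "tank/kubernetes/example"); err != nil {
        fmt.Println(err)
    }
}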

I spent some time working on a small proof of concept that shows it is possible to create ZFS datasets from within a container and have the volumes shared over NFS by the container. Also, the volume mounts are visible to both the host and the container, making them shareable using HostPath.

I'm using this Dockerfile:

FROM docker.io/library/alpine:3.20 AS runtime

ENTRYPOINT ["/entrypoint.sh"]

RUN apk add bash zfs nfs-utils

COPY kubernetes-zfs-provisioner /usr/bin/
COPY entrypoint.sh /

With this entrypoint.sh:

#!/bin/sh

rpcbind
rpc.statd --no-notify --port 32765 --outgoing-port 32766
rpc.mountd --port 32767
rpc.idmapd
rpc.nfsd --tcp --udp --port 2049 8

exec /usr/bin/kubernetes-zfs-provisioner

The secret sauce is to use mountPropagation: Bidirectional for the dataset volume mount, so each dataset mounted by the container is also visible on the host and vice versa:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: zfs-provisioner
  namespace: zfs-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: zfs-provisioner
  template:
    metadata:
      labels:
        app.kubernetes.io/name: zfs-provisioner
      namespace: zfs-system
    spec:
      serviceAccountName: zfs-provisioner
      containers:
      - name: provisioner
        image: jp39/zfs:latest
        volumeMounts:
        - name: dev-zfs
          mountPath: /dev/zfs
        - name: dataset
          mountPath: /tank/kubernetes
          mountPropagation: Bidirectional
        securityContext:
          privileged: true
          procMount: Unmasked
        ports:
        - containerPort: 2049
          protocol: TCP
        - containerPort: 111
          protocol: UDP
        - containerPort: 32765
          protocol: UDP
        - containerPort: 32767
          protocol: UDP
        env:
        - name: ZFS_NFS_HOSTNAME
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
      volumes:
      - name: dev-zfs
        hostPath:
          path: /dev/zfs
      - name: dataset
        hostPath:
          path: /tank/kubernetes
      nodeSelector:
        kubernetes.io/hostname: zfsnode

Note that I had to make a small patch within kubernetes-zfs-provisioner so that the pod IP address (contained in the ZFS_NFS_HOSTNAME environment variable) gets used as the NFSVolumeSource's server address instead of the storage class's hostname parameter.
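
For reference, a minimal sketch of what such a patch could look like; the nfsServer helper and the fallback hostname are illustrative, and the actual change in kubernetes-zfs-provisioner may differ:

package main

import (
    "fmt"
    "os"

    core "k8s.io/api/core/v1"
)

// nfsServer returns the address to put into the NFSVolumeSource: the pod IP
// from ZFS_NFS_HOSTNAME if set, otherwise the storage class's hostname parameter.
func nfsServer(storageClassHostname string) string {
    if podIP := os.Getenv("ZFS_NFS_HOSTNAME"); podIP != "" {
        return podIP
    }
    return storageClassHostname
}

func main() {
    source := core.NFSVolumeSource{
        Server: nfsServer("zfsnode.example.com"), // hypothetical hostname parameter
        Path:   "/tank/kubernetes/pvc-example",
    }
    fmt.Printf("NFS volume source: %+v\n", source)
}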

Is this something that would be worth having as a default configuration? It requires the ZFS host to be part of the cluster, but has the advantage of not requiring extra setup such as SSH keys.

@ccremer
Owner

ccremer commented Sep 16, 2024

Hi
This is indeed interesting! Thanks for experimenting with this!
Is the privileged security context really necessary? It would be cumbersome to make this the default, as it requires elevated permissions to install the provisioner in certain contexts, e.g. OpenShift or ArgoCD (or at least makes it more difficult).

@jp39
Author

jp39 commented Sep 16, 2024

I will try without it. I assumed it was necessary because of the /dev/zfs hostPath mount, but maybe I'm wrong.

@jp39
Author

jp39 commented Sep 16, 2024

It does not seem to work without the privileged security context:

# * spec.template.spec.containers[0].volumeMounts.mountPropagation: Forbidden: Bidirectional mount propagation is available only to privileged containers

@jp39
Author

jp39 commented Sep 16, 2024

It does seem legitimate, though, that a container with access to the host mount namespace, as well as the ability to create or destroy datasets on the host, requires elevated permissions.

@ccremer
Owner

ccremer commented Sep 17, 2024

Hm, yeah, that makes sense.
I would refrain from making it the default configuration if it requires privileged containers and running on ZFS-enabled hosts. I would rather provide a preset for it, e.g. an additional values YAML file, or an entirely different deployment template if a parameter is given.

@jp39
Author

jp39 commented Sep 19, 2024

It seems that some architectural changes would be needed in the project if we wanted to integrate this feature.

For example, the container running the provisioner would have to be pinned to the ZFS node at the deployment level, rendering the storage class's node and hostname parameters useless.

Similarly, the mount path of the parent dataset has to be provided at the deployment level too, making it impossible to select a different parent dataset using storage class parameters.

Overall, it feels like it would make things less "flexible", although I'm pretty sure most users only use a single parent dataset on a single ZFS host (with a single storage class).

I think it will be very difficult to make the SSH-less and the SSH use case coexist, so if you're not prepared to give up the SSH use case, it's probably not worth carrying on with it.

Since it fits my own use-case better, I'm actually considering creating a fork of this project to implement this fully, if you don't have any objection.

@ccremer
Owner

ccremer commented Sep 20, 2024

I agree with you.

I have no data on what most users use for their setup, so any statements about usage are probably equally true :D

I think it will be very difficult to make the SSH-less and the SSH use case coexist

I agree.

Since it fits my own use-case better, I'm actually considering creating a fork of this project to implement this fully, if you don't have any objection.

No objection at all. I may even link to your repo for users that may be interested in an SSH-less version ;)

jp39 pushed a commit to jp39/zfs-provisioner that referenced this issue Sep 20, 2024