SecureComms: Fix needed following changes in other components #2073

Merged

Conversation

davidhadas
Member

Change the trustee operator namespace
Add necessary initData to get SecureComms to work

@davidhadas davidhadas requested a review from a team as a code owner September 30, 2024 10:11
@davidhadas davidhadas force-pushed the secComms_fix_for_0.10 branch 2 times, most recently from f298965 to cd3c315 Compare September 30, 2024 10:18
Member

@stevenhorsman stevenhorsman left a comment

The code changes look good to me. Is anyone able to manually test this to validate that it works in the SecureComms configuration?

@davidhadas davidhadas force-pushed the secComms_fix_for_0.10 branch 3 times, most recently from b4c8398 to e82a59a Compare October 7, 2024 12:30
@davidhadas
Member Author

cc: @bpradipt

@stevenhorsman
Member

I've tried to test this code change and doc (creating CAA image quay.io/stevenhorsman/cloud-api-adaptor:dev-3fa7867a7499f2a636179f83c68174330812113d-dirty and podvm quay.io/stevenhorsman/podvm-generic-ubuntu-amd64:80da436ca39f40d439da987a3c83271f9dffb0ae3ac5dc9283a7ce8d6a112c97) and it doesn't work for me.

The agent proxy never connected and timed out after 5 minutes:

2024/10/09 13:33:19 [adaptor/proxy] Retrying agent proxy connection to 192.168.122.33:15150...
2024/10/09 13:33:24 [adaptor/proxy] Retrying agent proxy connection to 192.168.122.33:15150...
2024/10/09 13:33:24 [adaptor/cloud] Error: start instance interrupted (context canceled). Cleaning up...

Has anyone else successfully tested this, and are there any tips on what I did wrong here?

@davidhadas
Member Author

davidhadas commented Oct 9, 2024 via email

@stevenhorsman
Member

stevenhorsman commented Oct 9, 2024

Did you follow the instructions to activate SecureComms in SecureComms.md? The CAA in your environment is not configured to activate SecureComms. You can see this because CAA is connecting to 192.168.122.33:15150. When the peer pods ConfigMap is configured correctly, CAA should connect to 192.168.122.33:2222 instead. The agent proxy will later connect to some local port instead of a peer pod port.
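The port check described above can be sketched as a small log filter. This is only an illustration: the two port numbers come from this thread, while the function names and the regular expression are hypothetical and are not part of CAA.

```python
import re

# Ports as quoted in this thread: 15150 is dialed directly when SecureComms
# is not active, 2222 is the SSH port dialed when SecureComms is active.
PLAIN_PROXY_PORT = 15150
SECURE_COMMS_PORT = 2222

# Matches "... connection to <ip>:<port>" and "... Dial <ip>:<port>".
_TARGET = re.compile(r"(?:connection to|Dial) (\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def dialed_ports(log_lines):
    """Return the set of ports that appear as dial targets in the log lines."""
    ports = set()
    for line in log_lines:
        match = _TARGET.search(line)
        if match:
            ports.add(int(match.group(2)))
    return ports


def secure_comms_looks_active(log_lines):
    """Heuristic: the SSH port was dialed and the plain proxy port was not."""
    ports = dialed_ports(log_lines)
    return SECURE_COMMS_PORT in ports and PLAIN_PROXY_PORT not in ports
```

Feeding it the output of `kubectl logs` for the CAA pod gives a quick hint about whether the ConfigMap change has taken effect.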

This is my peer-pods-cm:

kubectl -n confidential-containers-system  get cm peer-pods-cm  -o yaml
apiVersion: v1
data:
  CLOUD_CONFIG_VERIFY: "false"
  CLOUD_PROVIDER: libvirt
  DISABLECVM: "true"
  ENABLE_CLOUD_PROVIDER_EXTERNAL_PLUGIN: "false"
  INITDATA: YWxnb3JpdGhtID0gInNoYTM4NCIKdmVyc2lvbiA9ICIwLjEuMCIKW2RhdGFdCiJhYS50b21sIiA9ICcnJwpbdG9rZW5fY29uZmlnc10KW3Rva2VuX2NvbmZpZ3MuY29jb19hc10KdXJsID0gJ2h0dHA6Ly8xMjcuMC4wLjE6ODA4MCcKClt0b2tlbl9jb25maWdzLmtic10KdXJsID0gJ2h0dHA6Ly8xMjcuMC4wLjE6ODA4MCcKJycnCiJjZGgudG9tbCIgID0gJycnCnNvY2tldCA9ICd1bml4Oi8vL3J1bi9jb25maWRlbnRpYWwtY29udGFpbmVycy9jZGguc29jaycKY3JlZGVudGlhbHMgPSBbXQpba2JjXQpuYW1lID0gJ2NjX2tiYycKdXJsID0gJ2h0dHA6Ly8xMjcuMC4wLjE6ODA4MCcKJycnCg==
  LIBVIRT_NET: default
  LIBVIRT_POOL: default
  LIBVIRT_URI: qemu+ssh://[email protected]/system?no_verify=1
  SECURE_COMMS: "true"
kind: ConfigMap
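The INITDATA value in the ConfigMap above is base64-encoded TOML, so it can be inspected by decoding it. A minimal sketch, with the string copied from the ConfigMap and split across lines only for readability (the helper name is hypothetical):

```python
import base64

# INITDATA value copied from the peer-pods-cm shown above.
INITDATA_B64 = (
    "YWxnb3JpdGhtID0gInNoYTM4NCIKdmVyc2lvbiA9ICIwLjEuMCIKW2RhdGFdCiJhYS50b21sIiA9"
    "ICcnJwpbdG9rZW5fY29uZmlnc10KW3Rva2VuX2NvbmZpZ3MuY29jb19hc10KdXJsID0gJ2h0dHA6"
    "Ly8xMjcuMC4wLjE6ODA4MCcKClt0b2tlbl9jb25maWdzLmtic10KdXJsID0gJ2h0dHA6Ly8xMjcu"
    "MC4wLjE6ODA4MCcKJycnCiJjZGgudG9tbCIgID0gJycnCnNvY2tldCA9ICd1bml4Oi8vL3J1bi9j"
    "b25maWRlbnRpYWwtY29udGFpbmVycy9jZGguc29jaycKY3JlZGVudGlhbHMgPSBbXQpba2JjXQpu"
    "YW1lID0gJ2NjX2tiYycKdXJsID0gJ2h0dHA6Ly8xMjcuMC4wLjE6ODA4MCcKJycnCg=="
)


def decode_initdata(value: str) -> str:
    """Decode a base64 INITDATA value into its TOML text."""
    return base64.b64decode(value).decode("utf-8")


if __name__ == "__main__":
    print(decode_initdata(INITDATA_B64))
```

The decoded text contains an `aa.toml` and a `cdh.toml` section, and every KBS-related URL in them is set to `http://127.0.0.1:8080`.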

@stevenhorsman
Member

OK, I've got it working. The doc doesn't mention that after editing peer-pods-cm, the changes don't take effect until the CAA daemonset restarts, so once I deleted the pod and the daemonset re-created it, it started working. Here is my CAA log for verification:

2024/10/09 15:24:44 [adaptor/cloud/libvirt] Creating VM 'podvm-nginx-75d4ffc6d9-6dhgb-8b317899'
2024/10/09 15:24:44 [adaptor/cloud/libvirt] Starting VM 'podvm-nginx-75d4ffc6d9-6dhgb-8b317899'
2024/10/09 15:24:45 [adaptor/cloud/libvirt] VM id 32
2024/10/09 15:25:06 [adaptor/cloud/libvirt] Instance created successfully
2024/10/09 15:25:06 [adaptor/cloud/libvirt] created an instance podvm-nginx-75d4ffc6d9-6dhgb-8b317899 for sandbox 8b31789990572e47f30b4c5178af0b3f7ed2f3b90f26537e3569d1377987e7a9
2024/10/09 15:25:06 [util/k8sops] nginx-75d4ffc6d9-6dhgb is now owning a PeerPod object
2024/10/09 15:25:06 [adaptor/cloud] created an instance podvm-nginx-75d4ffc6d9-6dhgb-8b317899 for sandbox 8b31789990572e47f30b4c5178af0b3f7ed2f3b90f26537e3569d1377987e7a9
2024/10/09 15:25:06 [secure-comms] InitPP read/create PP secret named: pp-8b31789990572e47f30b4c5178af0b3f7ed2f3b90f26537e3569d1377987e7a9
2024/10/09 15:25:07 [secure-comms] CreateSecret 'pp-8b31789990572e47f30b4c5178af0b3f7ed2f3b90f26537e3569d1377987e7a9'
2024/10/09 15:25:07 [secure-comms] Updating KBS with secret for: default/pp-8b31789990572e47f30b4c5178af0b3f7ed2f3b90f26537e3569d1377987e7a9/privateKey
2024/10/09 15:25:07 [secure-comms] Inbound listening to port 45973
2024/10/09 15:25:07 [secure-comms] Attestation phase: starting
2024/10/09 15:25:07 [secure-comms] Attestation phase: unable to Dial 192.168.122.245:2222: dial tcp 192.168.122.245:2222: connect: connection refused
...
2024/10/09 15:26:03 [secure-comms] Attestation phase: unable to Dial 192.168.122.245:2222: dial tcp 192.168.122.245:2222: connect: connection refused
2024/10/09 15:26:08 [secure-comms] Attestation phase: ssh connected - 192.168.122.245:2222
2024/10/09 15:26:09 [secure-comms] Attestation phase: ssh skipped validating server's host key (type ssh-rsa) during attestation
2024/10/09 15:26:09 [secure-comms] Attestation phase: peer reported phase Attestation
2024/10/09 15:26:10 [secure-comms] Attestation phase: NewSshPeer - peer requested a tunnel channel for KBS
2024/10/09 15:26:10 [secure-comms] Outbound KBS acceptProxy setting up for sid 8b31789990572e47f30b4c5178af0b3f7ed2f3b90f26537e3569d1377987e7a9
2024/10/09 15:26:10 [secure-comms] Outbound KBS acceptProxy modified URL to /kbs/v0/auth of host kbs-service.trustee-operator-system:8080
2024/10/09 15:26:10 [secure-comms] Outbound KBS acceptProxy to /kbs/v0/auth status code 200
2024/10/09 15:26:10 [secure-comms] Outbound KBS acceptProxy modified URL to /kbs/v0/attest of host kbs-service.trustee-operator-system:8080
2024/10/09 15:26:10 [secure-comms] Outbound KBS acceptProxy to /kbs/v0/attest status code 200
2024/10/09 15:26:10 [secure-comms] Outbound KBS acceptProxy recovered: runtime error: invalid memory address or nil pointer dereference
2024/10/09 15:26:10 [secure-comms] Attestation phase: NewSshPeer - peer requested a tunnel channel for KBS
2024/10/09 15:26:10 [secure-comms] Outbound KBS acceptProxy setting up for sid 8b31789990572e47f30b4c5178af0b3f7ed2f3b90f26537e3569d1377987e7a9
2024/10/09 15:26:10 [secure-comms] Outbound KBS acceptProxy modified URL to /kbs/v0/resource/default/sshclient/publicKey of host kbs-service.trustee-operator-system:8080
2024/10/09 15:26:10 [secure-comms] Outbound KBS acceptProxy to /kbs/v0/resource/default/sshclient/publicKey status code 200
2024/10/09 15:26:10 [secure-comms] Outbound KBS acceptProxy modified URL to /kbs/v0/resource/default/pp-8b31789990572e47f30b4c5178af0b3f7ed2f3b90f26537e3569d1377987e7a9/privateKey of host kbs-service.trustee-operator-system:8080
2024/10/09 15:26:10 [secure-comms] Outbound KBS acceptProxy to /kbs/v0/resource/default/pp-8b31789990572e47f30b4c5178af0b3f7ed2f3b90f26537e3569d1377987e7a9/privateKey status code 200
2024/10/09 15:26:10 [secure-comms] Attestation phase: peer reported it is upgrading to Kubernetes phase
2024/10/09 15:26:10 [secure-comms] Attestation phase: peer done by >>> chans closed <<<
2024/10/09 15:26:10 [secure-comms] Outbound KBS acceptProxy recovered: runtime error: invalid memory address or nil pointer dereference
2024/10/09 15:26:10 [secure-comms] Attestation phase: done
2024/10/09 15:26:10 [secure-comms] Kubernetes phase: starting (number of restarts 0)
2024/10/09 15:26:10 [secure-comms] Kubernetes phase: ssh connected - 192.168.122.245:2222
2024/10/09 15:26:10 [secure-comms] Kubernetes phase: ssh host key match - ssh-rsa
2024/10/09 15:26:10 [tunneler/vxlan] vxlan ppvxlan1 (remote 192.168.122.245:4789, id: 555001) created at /proc/1/task/12/ns/net
2024/10/09 15:26:10 [tunneler/vxlan] vxlan ppvxlan1 created at /proc/1/task/12/ns/net
2024/10/09 15:26:10 [secure-comms] Kubernetes phase: peer reported phase Kubernetes
2024/10/09 15:26:10 [secure-comms] Kubernetes phase: AddInbound: KATAAGENT
2024/10/09 15:26:10 [tunneler/vxlan] vxlan ppvxlan1 is moved to /var/run/netns/cni-c88622dc-b921-585c-2201-119b355563ff
2024/10/09 15:26:10 [tunneler/vxlan] Add tc redirect filters between eth0 and vxlan1 on pod network namespace /var/run/netns/cni-c88622dc-b921-585c-2201-119b355563ff
2024/10/09 15:26:10 [adaptor/proxy] Listening on /run/peerpod/pods/8b31789990572e47f30b4c5178af0b3f7ed2f3b90f26537e3569d1377987e7a9/agent.ttrpc
2024/10/09 15:26:10 [adaptor/proxy] Trying to establish agent proxy connection to 127.0.0.1:45973
2024/10/09 15:26:10 [adaptor/proxy] established agent proxy connection to 127.0.0.1:45973
2024/10/09 15:26:10 [secure-comms] Kubernetes phase: Inbound accept: KATAAGENT
2024/10/09 15:26:10 [adaptor/cloud] agent proxy is ready
2024/10/09 15:26:10 [secure-comms] Kubernetes phase: NewInboundInstance OpenChannel opening tunnel for: KATAAGENT
2024/10/09 15:26:10 [adaptor/proxy] CreateSandbox: hostname:nginx-75d4ffc6d9-6dhgb sandboxId:8b31789990572e47f30b4c5178af0b3f7ed2f3b90f26537e3569d1377987e7a9
2024/10/09 15:26:10 [adaptor/proxy]     storages:
2024/10/09 15:26:10 [adaptor/proxy]         mountpoint:/run/kata-containers/sandbox/shm source:shm fstype:tmpfs driver:ephemeral
2024/10/09 15:26:10 [adaptor/proxy] CreateContainer: containerID:8b31789990572e47f30b4c5178af0b3f7ed2f3b90f26537e3569d1377987e7a9
2024/10/09 15:26:10 [adaptor/proxy]     mounts:
2024/10/09 15:26:10 [adaptor/proxy]         destination:/proc source:proc type:proc
2024/10/09 15:26:10 [adaptor/proxy]         destination:/dev source:tmpfs type:tmpfs
2024/10/09 15:26:10 [adaptor/proxy]         destination:/dev/pts source:devpts type:devpts
2024/10/09 15:26:10 [adaptor/proxy]         destination:/dev/mqueue source:mqueue type:mqueue
2024/10/09 15:26:10 [adaptor/proxy]         destination:/sys source:sysfs type:sysfs
2024/10/09 15:26:10 [adaptor/proxy]         destination:/dev/shm source:/run/kata-containers/sandbox/shm type:bind
2024/10/09 15:26:10 [adaptor/proxy]         destination:/etc/resolv.conf source:/run/kata-containers/shared/containers/8b31789990572e47f30b4c5178af0b3f7ed2f3b90f26537e3569d1377987e7a9-0f87e1de521f68ee-resolv.conf type:bind
2024/10/09 15:26:10 [adaptor/proxy]     annotations:
2024/10/09 15:26:10 [adaptor/proxy]         io.katacontainers.pkg.oci.bundle_path: /run/containerd/io.containerd.runtime.v2.task/k8s.io/8b31789990572e47f30b4c5178af0b3f7ed2f3b90f26537e3569d1377987e7a9
2024/10/09 15:26:10 [adaptor/proxy]         io.kubernetes.cri.sandbox-memory: 0
2024/10/09 15:26:10 [adaptor/proxy]         io.kubernetes.cri.sandbox-cpu-quota: 0
2024/10/09 15:26:10 [adaptor/proxy]         io.kubernetes.cri.sandbox-cpu-shares: 2
2024/10/09 15:26:10 [adaptor/proxy]         io.kubernetes.cri.sandbox-id: 8b31789990572e47f30b4c5178af0b3f7ed2f3b90f26537e3569d1377987e7a9
2024/10/09 15:26:10 [adaptor/proxy]         io.kubernetes.cri.sandbox-namespace: default
2024/10/09 15:26:10 [adaptor/proxy]         io.kubernetes.cri.sandbox-cpu-period: 100000
2024/10/09 15:26:10 [adaptor/proxy]         io.kubernetes.cri.sandbox-log-directory: /var/log/pods/default_nginx-75d4ffc6d9-6dhgb_254e2600-8337-4eaf-84aa-1ebdf868db74
2024/10/09 15:26:10 [adaptor/proxy]         io.katacontainers.pkg.oci.container_type: pod_sandbox
2024/10/09 15:26:10 [adaptor/proxy]         nerdctl/network-namespace: /var/run/netns/cni-c88622dc-b921-585c-2201-119b355563ff
2024/10/09 15:26:10 [adaptor/proxy]         io.kubernetes.cri.sandbox-uid: 254e2600-8337-4eaf-84aa-1ebdf868db74
2024/10/09 15:26:10 [adaptor/proxy]         io.kubernetes.cri.sandbox-name: nginx-75d4ffc6d9-6dhgb
2024/10/09 15:26:10 [adaptor/proxy]         io.kubernetes.cri.container-type: sandbox
2024/10/09 15:26:10 [adaptor/proxy]     storages:
2024/10/09 15:26:10 [adaptor/proxy]         mount_point:/run/kata-containers/8b31789990572e47f30b4c5178af0b3f7ed2f3b90f26537e3569d1377987e7a9/rootfs source:pause fstype:overlay driver:image_guest_pull
2024/10/09 15:26:10 [adaptor/proxy] StartContainer: containerID:8b31789990572e47f30b4c5178af0b3f7ed2f3b90f26537e3569d1377987e7a9
2024/10/09 15:26:10 [adaptor/proxy] CreateContainer: containerID:1bbe653f428c2f13748474ebe5feee79295b3cd8d003c2f71f71873a22973066
2024/10/09 15:26:10 [adaptor/proxy]     mounts:
2024/10/09 15:26:10 [adaptor/proxy]         destination:/proc source:proc type:proc
2024/10/09 15:26:10 [adaptor/proxy]         destination:/dev source:tmpfs type:tmpfs
2024/10/09 15:26:10 [adaptor/proxy]         destination:/dev/pts source:devpts type:devpts
2024/10/09 15:26:10 [adaptor/proxy]         destination:/dev/mqueue source:mqueue type:mqueue
2024/10/09 15:26:10 [adaptor/proxy]         destination:/sys source:sysfs type:sysfs
2024/10/09 15:26:10 [adaptor/proxy]         destination:/sys/fs/cgroup source:cgroup type:cgroup
2024/10/09 15:26:10 [adaptor/proxy]         destination:/etc/hosts source:/run/kata-containers/shared/containers/1bbe653f428c2f13748474ebe5feee79295b3cd8d003c2f71f71873a22973066-12f5e5b2f1bcf722-hosts type:bind
2024/10/09 15:26:10 [adaptor/proxy]         destination:/dev/termination-log source:/run/kata-containers/shared/containers/1bbe653f428c2f13748474ebe5feee79295b3cd8d003c2f71f71873a22973066-9711940122fd9e83-termination-log type:bind
2024/10/09 15:26:10 [adaptor/proxy]         destination:/etc/hostname source:/run/kata-containers/shared/containers/1bbe653f428c2f13748474ebe5feee79295b3cd8d003c2f71f71873a22973066-394a674c02686de5-hostname type:bind
2024/10/09 15:26:10 [adaptor/proxy]         destination:/etc/resolv.conf source:/run/kata-containers/shared/containers/1bbe653f428c2f13748474ebe5feee79295b3cd8d003c2f71f71873a22973066-b0829e6a90cf2728-resolv.conf type:bind
2024/10/09 15:26:10 [adaptor/proxy]         destination:/dev/shm source:/run/kata-containers/sandbox/shm type:bind
2024/10/09 15:26:10 [adaptor/proxy]         destination:/var/run/secrets/kubernetes.io/serviceaccount source:/run/kata-containers/shared/containers/1bbe653f428c2f13748474ebe5feee79295b3cd8d003c2f71f71873a22973066-36c5e45e47369f1d-serviceaccount type:bind
2024/10/09 15:26:10 [adaptor/proxy]     annotations:
2024/10/09 15:26:10 [adaptor/proxy]         io.kubernetes.cri.sandbox-id: 8b31789990572e47f30b4c5178af0b3f7ed2f3b90f26537e3569d1377987e7a9
2024/10/09 15:26:10 [adaptor/proxy]         io.katacontainers.pkg.oci.container_type: pod_container
2024/10/09 15:26:10 [adaptor/proxy]         io.kubernetes.cri.sandbox-name: nginx-75d4ffc6d9-6dhgb
2024/10/09 15:26:10 [adaptor/proxy]         io.kubernetes.cri.sandbox-namespace: default
2024/10/09 15:26:10 [adaptor/proxy]         io.kubernetes.cri.image-name: docker.io/library/nginx@sha256:9700d098d545f9d2ee0660dfb155fe64f4447720a0a763a93f2cf08997227279
2024/10/09 15:26:10 [adaptor/proxy]         io.kubernetes.cri.container-type: container
2024/10/09 15:26:10 [adaptor/proxy]         io.kubernetes.cri.container-name: nginx
2024/10/09 15:26:10 [adaptor/proxy]         io.kubernetes.cri.sandbox-uid: 254e2600-8337-4eaf-84aa-1ebdf868db74
2024/10/09 15:26:10 [adaptor/proxy]         io.katacontainers.pkg.oci.bundle_path: /run/containerd/io.containerd.runtime.v2.task/k8s.io/1bbe653f428c2f13748474ebe5feee79295b3cd8d003c2f71f71873a22973066
2024/10/09 15:26:10 [adaptor/proxy]     storages:
2024/10/09 15:26:10 [adaptor/proxy]         mount_point:/run/kata-containers/1bbe653f428c2f13748474ebe5feee79295b3cd8d003c2f71f71873a22973066/rootfs source:docker.io/library/nginx@sha256:9700d098d545f9d2ee0660dfb155fe64f4447720a0a763a93f2cf08997227279 fstype:overlay driver:image_guest_pull
2024/10/09 15:26:20 [adaptor/proxy] StartContainer: containerID:1bbe653f428c2f13748474ebe5feee79295b3cd8d003c2f71f71873a22973066
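To see how long the attestation phase took in a log like the one above, the timestamps of the phase markers can be compared. This is a rough sketch that assumes the `YYYY/MM/DD HH:MM:SS` prefix and the marker strings seen in this log; the helper names are hypothetical.

```python
from datetime import datetime


def parse_ts(line):
    """Parse the leading 'YYYY/MM/DD HH:MM:SS' timestamp of a log line."""
    return datetime.strptime(line[:19], "%Y/%m/%d %H:%M:%S")


def phase_duration(log_lines, start_marker, end_marker):
    """Seconds between the first start marker and the next end marker, or None."""
    start = None
    for line in log_lines:
        if start is None:
            if start_marker in line:
                start = parse_ts(line)
        elif end_marker in line:
            return (parse_ts(line) - start).total_seconds()
    return None


# Three lines taken from the log above.
sample = [
    "2024/10/09 15:25:07 [secure-comms] Attestation phase: starting",
    "2024/10/09 15:26:08 [secure-comms] Attestation phase: ssh connected - 192.168.122.245:2222",
    "2024/10/09 15:26:10 [secure-comms] Attestation phase: done",
]

if __name__ == "__main__":
    print(phase_duration(sample, "Attestation phase: starting", "Attestation phase: done"))
```

For this run, most of that interval is spent retrying the SSH dial on port 2222 until the connection is accepted.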

@davidhadas davidhadas force-pushed the secComms_fix_for_0.10 branch from 5679447 to d3c1eac Compare October 9, 2024 18:54
1. Trustee Operator had changed the namespace

2. CAA had removed the SecureComms default kbs address
   Use InitData to set the kbs address instead

Signed-off-by: David Hadas <[email protected]>
@davidhadas davidhadas force-pushed the secComms_fix_for_0.10 branch from d3c1eac to c3d306c Compare October 9, 2024 19:00
@davidhadas
Member Author

davidhadas commented Oct 9, 2024

@stevenhorsman,

SecureComms.md is now modified to include:

  1. A comment that the daemonset should be reloaded after the ConfigMap changes.
  2. The extra detail showing the trustee-operator secrets after the trustee-operator install
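The daemonset reload from item 1 could be scripted roughly as follows. The namespace is the one used in the `kubectl` command earlier in this thread; the daemonset name `cloud-api-adaptor-daemonset` is an assumption and should be checked with `kubectl get ds` before use.

```python
import subprocess

# Namespace taken from the kubectl command earlier in this thread.
NAMESPACE = "confidential-containers-system"
# Assumed daemonset name; verify it in your cluster before running.
DAEMONSET = "cloud-api-adaptor-daemonset"


def restart_command(namespace=NAMESPACE, daemonset=DAEMONSET):
    """Build the kubectl command that restarts the CAA daemonset pods."""
    return ["kubectl", "-n", namespace, "rollout", "restart", f"daemonset/{daemonset}"]


def restart_caa(dry_run=True):
    """Print the command, and run it only when dry_run is False."""
    cmd = restart_command()
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd
```

Deleting the CAA pod, as done during testing above, has the same effect, because the daemonset re-creates it with the updated ConfigMap values.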

Member

@stevenhorsman stevenhorsman left a comment

LGTM and the doc fixes should resolve the issues I hit when testing this. Thanks @davidhadas!

Member

@bpradipt bpradipt left a comment

/lgtm

@stevenhorsman stevenhorsman merged commit 7d19e7c into confidential-containers:main Oct 11, 2024
19 checks passed
3 participants