[EPIC]: Blob-CSI implementation & Boathouse replacement #1001

Closed
57 of 63 tasks
blairdrummond opened this issue Apr 13, 2022 · 1 comment
Labels: kind/epic (An epic), kind/feature (New feature or request)

Comments

@blairdrummond
Contributor

blairdrummond commented Apr 13, 2022

Based on the PoC work here:

https://github.com/blairdrummond/kind-blob-csi-mock-aaw

Architecture

The architecture splits cleanly into two independent responsibilities:

  • Create or find storage containers, create PVs (read-write or read-only), and create PVCs in the user's namespace. (BlobCSI controller)

  • Automatically discover PVCs and bind them to Notebooks, Workflows, or S3Proxy pods. (Goofys Webhook)

Refreshes/Updates

  • Users will need to restart their notebooks to trigger a refresh of mounts (for example, when an FDI bucket is added).

  • We recommend that S3Proxy ship with a CronJob that performs a rollout restart on an interval, similar to the AAD Pod refresher.
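A rough sketch of what that CronJob could send, assuming S3Proxy runs as a Deployment (not confirmed in this issue). `kubectl rollout restart` works by stamping the `kubectl.kubernetes.io/restartedAt` annotation on the pod template, so the job only needs to apply an equivalent patch. The function name `restart_patch` is made up for illustration.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def restart_patch(now: Optional[datetime] = None) -> Dict[str, Any]:
    """Build a patch body that changes the pod template and so triggers a rollout.

    Assumption: the target is a Deployment (or StatefulSet) for S3Proxy.
    """
    now = now or datetime.now(timezone.utc)
    return {
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {
                        # Same annotation that `kubectl rollout restart` sets.
                        "kubectl.kubernetes.io/restartedAt": now.isoformat()
                    }
                }
            }
        }
    }
```

Any change to the pod template annotations causes the pods to be replaced, and the new pods then pass through the webhook again and pick up newly created PVCs.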

TODO: What happens if the storage account credentials get rotated?

BlobCSI controller ✔️

The BlobCSI Profile Controller loops through profiles and AAW Storage accounts:

  • Creates a container (bucket) in each storage account if it doesn't exist.
  • Creates a PersistentVolume bound to it using the Azure Blob CSI Driver. The driver authenticates using a secret within the azure-blob-csi-system namespace.
  • Creates a PersistentVolumeClaim bound to the PersistentVolume in the profile's namespace. (A Gatekeeper policy is in place to ensure that no other PVCs are allowed to bind to this PV.)
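For illustration, the PV/PVC pair from the steps above might look roughly like this, written as plain Python dicts. This is a sketch, not the controller's actual output: the secret name, the 10Gi capacity placeholder, the PV naming scheme, and the choice to name the container after the profile are all assumptions. The driver name `blob.csi.azure.com` and the `containerName` / `nodeStageSecretRef` fields come from the Azure Blob CSI driver.

```python
from typing import Any, Dict

CSI_DRIVER = "blob.csi.azure.com"           # Azure Blob CSI driver name
SECRET_NAMESPACE = "azure-blob-csi-system"  # namespace mentioned in this issue
SECRET_NAME = "example-storage-secret"      # hypothetical secret name


def build_pv(profile: str, account: str, read_only: bool = False) -> Dict[str, Any]:
    """PersistentVolume for one profile and storage account, pre-bound via claimRef."""
    name = f"{profile}-{account}"  # hypothetical naming scheme
    mode = "ReadOnlyMany" if read_only else "ReadWriteMany"
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name, "labels": {"profile": profile}},
        "spec": {
            "capacity": {"storage": "10Gi"},  # required field; value is a placeholder
            "accessModes": [mode],
            "persistentVolumeReclaimPolicy": "Retain",
            "storageClassName": "",
            # Only the PVC with this name in the profile's namespace may bind.
            "claimRef": {"namespace": profile, "name": account},
            "csi": {
                "driver": CSI_DRIVER,
                "readOnly": read_only,
                "volumeHandle": name,
                "volumeAttributes": {"containerName": profile},  # assumed container name
                "nodeStageSecretRef": {
                    "name": SECRET_NAME,
                    "namespace": SECRET_NAMESPACE,
                },
            },
        },
    }


def build_pvc(pv: Dict[str, Any]) -> Dict[str, Any]:
    """PersistentVolumeClaim in the profile's namespace that binds to the given PV."""
    claim = pv["spec"]["claimRef"]
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": claim["name"], "namespace": claim["namespace"]},
        "spec": {
            "accessModes": pv["spec"]["accessModes"],
            "storageClassName": "",  # empty string: bind statically, no provisioner
            "volumeName": pv["metadata"]["name"],
            "resources": {"requests": {"storage": pv["spec"]["capacity"]["storage"]}},
        },
    }
```

For example, `build_pvc(build_pv("alice", "standard"))` yields a claim named `standard` in the `alice` namespace that points at the PV `alice-standard`.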

FDI Submodule ❌

Use the OPA SDK to check whether a user has access to a given bucket and to determine which permissions they have. (We can only implement RW/RO.)
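Since this part isn't implemented yet, here is only a sketch of the last step: collapsing whatever the policy decision returns into the two modes we can actually express on a PV. The shape of the `decision` dict (`read` / `write` booleans) is entirely hypothetical and does not reflect a real OPA response format.

```python
from typing import Mapping, Optional


def access_mode(decision: Mapping[str, bool]) -> Optional[str]:
    """Map a (hypothetical) policy decision to a PV access mode, or None for no access."""
    if decision.get("write"):
        return "ReadWriteMany"  # RW
    if decision.get("read"):
        return "ReadOnlyMany"   # RO
    return None                 # no PV/PVC should be created for this user
```

Anything finer-grained than read versus write (for example list-only or delete-only) would have to be rounded to one of these two, or denied.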

Gatekeeper Policies ✔️

The PersistentVolumes are created with a profile label matching the user's profile. A Gatekeeper policy ensures that pvc.metadata.namespace == pv.metadata.labels.profile, so that users cannot bind other users' volumes. This is handled with a claimRef on the PV.

Also, the classification of the PV and PVCs must match. (Still need to check this.)
Alternatively, prevent users from creating these PVCs themselves.
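The check itself is small. Below is a Python sketch of the logic only; the real policy would be written in Rego for Gatekeeper, and the classification label key used here is a hypothetical placeholder. It treats a missing classification label as "unclassified", which is also an assumption.

```python
from typing import Any, Dict

CLASSIFICATION_LABEL = "data.statcan.gc.ca/classification"  # hypothetical label key


def _labels(obj: Dict[str, Any]) -> Dict[str, str]:
    return obj.get("metadata", {}).get("labels") or {}


def pvc_may_bind(pvc: Dict[str, Any], pv: Dict[str, Any]) -> bool:
    """True if the PVC's namespace matches the PV's profile label and classifications agree."""
    profile = _labels(pv).get("profile")
    namespace = pvc.get("metadata", {}).get("namespace")
    if profile is None or namespace != profile:
        return False  # unlabelled PVs and foreign namespaces are rejected
    pv_class = _labels(pv).get(CLASSIFICATION_LABEL, "unclassified")
    pvc_class = _labels(pvc).get(CLASSIFICATION_LABEL, "unclassified")
    return pv_class == pvc_class
```

Rejecting PVs with no profile label avoids the case where a missing label and a missing namespace compare as equal.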

Goofys Blob CSI Injector ✔️

The Goofys injector is repurposed:

  • Instead of using a fixed list of mounts, it uses the blob.aaw.statcan.gc.ca/automount label to select volumes to mount.
  • It further differentiates between unclassified and protected-b PVCs, only mounting to the correct pods.
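As a sketch of that selection logic (not the injector's actual code): filter the namespace's PVCs by the automount label, keep only those whose classification matches the pod, and turn the result into volumes and mounts. The label value `"true"`, the classification label key, the default of "unclassified" when the label is absent, and the mount root are all assumptions.

```python
from typing import Any, Dict, List

AUTOMOUNT_LABEL = "blob.aaw.statcan.gc.ca/automount"        # from this issue
CLASSIFICATION_LABEL = "data.statcan.gc.ca/classification"  # hypothetical label key


def labels_of(obj: Dict[str, Any]) -> Dict[str, str]:
    return obj.get("metadata", {}).get("labels") or {}


def select_pvcs(pvcs: List[Dict[str, Any]], pod: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Keep PVCs marked for automount whose classification matches the pod's."""
    wanted = labels_of(pod).get(CLASSIFICATION_LABEL, "unclassified")
    return [
        pvc
        for pvc in pvcs
        if labels_of(pvc).get(AUTOMOUNT_LABEL) == "true"  # assumed label value
        and labels_of(pvc).get(CLASSIFICATION_LABEL, "unclassified") == wanted
    ]


def volume_patch(pvcs: List[Dict[str, Any]], mount_root: str = "/mnt/buckets") -> Dict[str, Any]:
    """Build the volumes and volumeMounts the webhook would add to the pod."""
    volumes, mounts = [], []
    for pvc in pvcs:
        name = pvc["metadata"]["name"]
        volumes.append({"name": name, "persistentVolumeClaim": {"claimName": name}})
        mounts.append({"name": name, "mountPath": f"{mount_root}/{name}"})
    return {"volumes": volumes, "volumeMounts": mounts}
```

Because the selection is label-driven, adding a new bucket only requires the controller to create a labelled PVC; the webhook picks it up the next time the pod is created.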

Tasks

PoC

Setup

Testing

Switch-over/Teardown

S3Proxy

FDI integration

Production Deployment

RMI

Docs

UX

@blairdrummond blairdrummond added the kind/feature New feature or request label Apr 13, 2022
@Collinbrown95
Contributor

Collinbrown95 commented May 12, 2022

IMPORTANT

Deprecating old profiles controller

  • When we deprecate MinIO, we should also migrate the old profiles controller. Only a few components will be left, so we can use this opportunity to move the remaining functionality over to aaw-kubeflow-profiles-controller.

Roadmap to Deprecation


5 participants