First pass at instructions using ACM
* This is a first pass at coming up with a recipe for deploying
  Kubeflow using ACM and GitOps.

* There's lots of friction but I was able to get it to work.

* Related to GoogleCloudPlatform#4
Jeremy Lewi committed May 21, 2020
1 parent 2e22661 commit 807f54f
Showing 2 changed files with 170 additions and 5 deletions.
16 changes: 16 additions & 0 deletions kubeflow/Makefile
@@ -16,6 +16,8 @@ KF_DIR=./instance/kustomize
APP_DIR=.
MANIFESTS_DIR=./upstream/manifests

ACM_KF_REPO=acm-repo

# TODO(https://github.com/GoogleContainerTools/kpt/issues/539):
# Using a subdirectory of the current directory breaks our ability to run kpt set .
# So as a hack we use a $(BUILD_DIR)/ directory in the parent directory.
@@ -129,6 +131,20 @@ hydrate-kubeflow:
	kustomize build --load_restrictor none -o $(BUILD_DIR)/metacontroller $(KF_DIR)/metacontroller
	mkdir -p $(BUILD_DIR)/kubeflow-issuer
	kustomize build --load_restrictor none -o $(BUILD_DIR)/kubeflow-issuer $(KF_DIR)/kubeflow-issuer

#*****************************************************************************************
# Hydrate ACM repos
# These commands copy the configs to the appropriate acm repo
acm-gcp: hydrate-gcp
	cp -r $(BUILD_DIR)/gcp_config $(ACM_MGMT_REPO)/namespaces/$(PROJECT)

acm-kubeflow: hydrate-asm hydrate-kubeflow
	rm -rf $(ACM_KF_REPO)
	mkdir -p $(ACM_KF_REPO)
	find $(BUILD_DIR) -name "*.yaml" -not -path "*/gcp_config/**" -exec cp {} $(ACM_KF_REPO)/ ";"

#*****************************************************************************************

.PHONY: clean-build
clean-build:
# Delete build because we want to prune any resources which are no longer defined in the manifests
159 changes: 154 additions & 5 deletions kubeflow/README.md
@@ -39,12 +39,8 @@ one if you haven't already.
1. Fetch the blueprint

```
kpt pkg get https://github.com/jlewi/kf-templates-gcp.git/kubeflow@master ./
kpt pkg get https://github.com/kubeflow/gcp-blueprints.git/kubeflow@master ./
```

* TODO(jlewi): Change to a Kubeflow repo


1. Change to the kubeflow directory

```
Expand Down Expand Up @@ -193,3 +189,156 @@ one if you haven't already.
```

* For more info refer to the instructions about enabling services.

## GitOps (Work In Progress): Using Anthos Config Management to Install and Manage Kubeflow

### Setting up ACM to manage your project

You must set up an ACM cluster to manage your project. Typically this entails the following:

* Create a management cluster
  * Typically you will want this to be in a different project, since it will manage
    multiple projects and have admin privileges that consumers of those projects
    shouldn't have

* Follow the instructions to install ACM on that cluster

* You will also need to install Cloud Config Connector. Starting with Anthos 1.4
  you can use ACM to install Cloud Config Connector. Earlier versions of
  ACM install a version of Cloud Config Connector that is too old for Kubeflow.

* In your ACM repo, set up a namespace corresponding to the project you will install
  Kubeflow into.
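
As a rough illustration of the last bullet, the following sketch creates a namespace directory in a structured ACM repo on the management cluster. The function name `add_project_namespace`, the repo path, and the project ID are hypothetical placeholders. The `cnrm.cloud.google.com/project-id` annotation is the Config Connector annotation for binding a namespace to a project; whether your setup needs it is an assumption to verify against the Config Connector docs.

```shell
# Sketch only: add a namespace for a GCP project to a structured ACM repo.
# ASSUMPTION: the repo follows the structured layout (namespaces/<name>/),
# which matches the $(ACM_MGMT_REPO)/namespaces/$(PROJECT) path in the Makefile.
add_project_namespace() {
  repo="$1"      # path to a local clone of the management ACM repo
  project="$2"   # GCP project ID that Kubeflow will be installed into
  dir="${repo}/namespaces/${project}"
  mkdir -p "${dir}"
  # In a structured repo the namespace name must match the directory name.
  cat > "${dir}/namespace.yaml" <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: ${project}
  annotations:
    cnrm.cloud.google.com/project-id: ${project}
EOF
}
```

For example, `add_project_namespace ~/git_acm-mgmt my-kf-project` (both arguments are placeholders) writes `namespaces/my-kf-project/namespace.yaml`, which you would then commit and push.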

### Deploying Kubeflow

1. Follow the steps in the previous section to configure and hydrate the manifests but do
not **apply** the manifests.


1. Enable services

```
make apply-services
```

* TODO(jlewi): Can we use CNRM and ACM for this as well?

1. Hydrate the manifests

```
make hydrate
```

1. Copy the gcp config resources to the ACM repo that is being used to manage the project
where Kubeflow is being deployed


```
cp -r ${BUILD_DIR}/gcp_config ${ACM_MGMT_REPO}/namespaces/${PROJECT}
```

1. Wait for your KF cluster to be deployed

1. Create a context for your new cluster

```
make create-ctxt
```

1. Create a directory to use as your ACM repo

```
mkdir acm-repo
```

* **Important** We currently use an unstructured ACM repository because we don't have a good way
  of reorganizing our K8s resources and files into the layout required by structured repositories;
  e.g. some of our files contain both cluster-scoped and namespace-scoped resources.

1. Follow the ACM docs to install and configure the ACM operator on your cluster

* Use an unstructured repo (see the note in the previous step)
* Do not configure ACM to install Cloud Config Connector
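
Given the **Important** note above about relying on an unstructured repository, a `ConfigManagement` resource for the Kubeflow cluster might look roughly like the one generated below. This is a sketch, not the blueprint's actual config: the function name `write_config_management`, the branch, the secret type, and all argument values are assumptions, and the `configConnector` field is deliberately omitted so that ACM does not install Cloud Config Connector.

```shell
# Sketch only: generate a ConfigManagement resource for an unstructured repo.
# ASSUMPTIONS: syncBranch "master" and secretType "ssh" are illustrative
# defaults; adjust them to match how your Git repo is hosted and secured.
write_config_management() {
  cluster_name="$1"  # name ACM should use for this cluster
  sync_repo="$2"     # Git URL of the repo holding the hydrated configs
  policy_dir="$3"    # directory inside the repo to sync, e.g. acm-repo
  out="$4"           # file to write
  cat > "${out}" <<EOF
apiVersion: configmanagement.gke.io/v1
kind: ConfigManagement
metadata:
  name: config-management
spec:
  clusterName: ${cluster_name}
  sourceFormat: unstructured
  git:
    syncRepo: ${sync_repo}
    syncBranch: master
    secretType: ssh
    policyDir: "${policy_dir}"
EOF
}
```

You would then apply the generated file to the Kubeflow cluster with `kubectl apply -f`, as described in the ACM installation docs.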


1. Hydrate the configs to be deployed on the management cluster via ACM

```
make acm-gcp
```

* Commit and push those configs

1. Wait for the Kubernetes cluster to be created

1. Create a context

```
make create-ctxt
```

1. Set Client ID and Client secret for IAP OAuth

```
export CLIENT_ID=
export CLIENT_SECRET=
```
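
Because later steps depend on these two variables, it can help to fail fast when either is missing. The helper below is a minimal sketch; the function name `require_iap_oauth_vars` is hypothetical and not part of the blueprint.

```shell
# Sketch only: verify the IAP OAuth variables are set before hydrating.
require_iap_oauth_vars() {
  missing=0
  if [ -z "${CLIENT_ID:-}" ]; then
    echo "error: CLIENT_ID is not set or empty" >&2
    missing=1
  fi
  if [ -z "${CLIENT_SECRET:-}" ]; then
    echo "error: CLIENT_SECRET is not set or empty" >&2
    missing=1
  fi
  return "${missing}"
}
```

For example, `require_iap_oauth_vars && make acm-kubeflow` only hydrates when both values are present.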

1. Hydrate the configs to be deployed on the kubeflow cluster via ACM

```
make acm-kubeflow
```
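
Note that the `acm-kubeflow` target copies every hydrated YAML file into a single flat directory, so two files with the same name in different build subdirectories would silently overwrite each other. The check below is a sketch for spotting such collisions before they happen; the function name `duplicate_yaml_basenames` is hypothetical, and the build directory you pass in is whatever `BUILD_DIR` is set to in your Makefile.

```shell
# Sketch only: list YAML file names that occur more than once under a build
# directory, skipping gcp_config just as the acm-kubeflow target does.
duplicate_yaml_basenames() {
  find "$1" -name "*.yaml" ! -path "*/gcp_config/*" \
    | sed 's|.*/||' \
    | sort \
    | uniq -d
}
```

An empty result means the flat copy will not lose any files.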

1. TODO(fix/test these instructions) Run the custom transform to remove the namespace from cluster-scoped resources

```
~/git_kustomize/kustomize/kustomize config run --enable-exec --exec-path ~/git_kubeflow-kfctl/kustomize-fns/remove-namespace/remove-namespace ./acm-repo/ --stack-trace
```

* Relevant issues:

* https://github.com/kubeflow/gcp-blueprints/issues/27
* https://github.com/kubernetes-sigs/kustomize/issues/2498

1. Remove `acm-repo/~g_v1_service_istio-ingressgateway.yaml`
1. Open `acm-repo/IngressGateway.yaml` and add the following annotation to the `istio-ingressgateway` service

```
annotations:
  beta.cloud.google.com/backend-config: '{"ports": {"http2":"iap-backendconfig"}}'
```
* This is a workaround for https://github.com/kubeflow/gcp-blueprints/issues/22

1. In the Google Cloud console, find the backend associated with the ingress gateway and change the health check

* Set the port to the node port mapped to the `istio-ingressgateway` status port
* Set the health check path to `/healthz/ready`
* Relevant instructions: https://cloud.google.com/service-mesh/docs/iap-integration

* TODO(https://github.com/kubeflow/gcp-blueprints/issues/14) Automate this
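
To find the node port for the first bullet above, a query along these lines may help. It is a sketch that assumes the service port is named `status-port`, which is the usual name in Istio's ingress gateway but should be confirmed against your hydrated `IngressGateway.yaml`. The function name `status_node_port` is hypothetical.

```shell
# Sketch only: print the node port mapped to the ingress gateway status port.
# ASSUMPTION: the service port is named "status-port".
status_node_port() {
  kubectl --context="${KUBEFLOW_CONTEXT}" -n istio-system \
    get service istio-ingressgateway \
    -o jsonpath='{.spec.ports[?(@.name=="status-port")].nodePort}'
}

# Possible follow-up (assumes an HTTP health check; the name is a placeholder):
#   gcloud compute health-checks update http <health-check-name> \
#     --port="$(status_node_port)" --request-path=/healthz/ready
```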

1. Find the IAP audience for your ingress and update `acm-repo/authentication.istio.io_v1alpha1_policy_ingress-jwt.yaml`
with it

* TODO(https://github.com/kubeflow/gcp-blueprints/issues/14): Come up with a better solution

1. Commit and push those configs

1. Check the status of the sync

```
nomos --contexts=${KUBEFLOW_CONTEXT} status
```

1. Wait for the `istio-system` namespace to be created, then create the IAP secret

```
make iap-secret
```

* Note that this step is time sensitive: the ingress won't be created until the secret exists
* The GKE ManagedCertificate can't be provisioned if the ingress doesn't exist
* If the endpoint doesn't become available within roughly 5 minutes, the GKE ManagedCertificate will give up
  trying to provision the certificate.
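
Since the timing matters, one option is to poll for the namespace and create the secret as soon as it appears. The loop below is a minimal sketch; the function name `wait_for_namespace` and its default timeout and polling interval are assumptions, not part of the blueprint.

```shell
# Sketch only: block until a namespace exists, or give up after a timeout.
# Arguments: namespace, timeout in seconds (default 600), poll interval (default 5).
wait_for_namespace() {
  ns="$1"
  timeout_s="${2:-600}"
  interval_s="${3:-5}"
  waited=0
  until kubectl --context="${KUBEFLOW_CONTEXT}" get namespace "${ns}" >/dev/null 2>&1; do
    if [ "${waited}" -ge "${timeout_s}" ]; then
      echo "timed out waiting for namespace ${ns}" >&2
      return 1
    fi
    sleep "${interval_s}"
    waited=$((waited + interval_s))
  done
}
```

For example, `wait_for_namespace istio-system && make iap-secret` creates the secret immediately after ACM has synced the namespace.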
