Merge branch 'master' into container-network
Signed-off-by: mahao<[email protected]>
allenhaozi committed May 2, 2022
2 parents a1ec38e + 3218390 commit cae256c
Showing 6 changed files with 227 additions and 13 deletions.
2 changes: 1 addition & 1 deletion charts/fluid/fluid/Chart.yaml
@@ -18,7 +18,7 @@ version: 0.8.0

# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application.
-appVersion: 0.8.0-79d0b30
+appVersion: 0.8.0-2073b38
home: https://github.com/fluid-cloudnative/fluid
keywords:
- category:data
22 changes: 11 additions & 11 deletions charts/fluid/fluid/values.yaml
@@ -6,7 +6,7 @@ workdir: /tmp

dataset:
controller:
-image: fluidcloudnative/dataset-controller:v0.8.0-79d0b30
+image: fluidcloudnative/dataset-controller:v0.8.0-2073b38

csi:
featureGates: "FuseRecovery=false"
@@ -15,7 +15,7 @@ csi:
registrar:
image: registry.aliyuncs.com/acs/csi-node-driver-registrar:v1.2.0
plugins:
-image: fluidcloudnative/fluid-csi:v0.8.0-79d0b30
+image: fluidcloudnative/fluid-csi:v0.8.0-2073b38
kubelet:
rootDir: /var/lib/kubelet

@@ -28,9 +28,9 @@ runtime:
portRange: 20000-26000
enabled: true
init:
-image: fluidcloudnative/init-users:v0.8.0-79d0b30
+image: fluidcloudnative/init-users:v0.8.0-2073b38
controller:
-image: fluidcloudnative/alluxioruntime-controller:v0.8.0-79d0b30
+image: fluidcloudnative/alluxioruntime-controller:v0.8.0-2073b38
runtime:
image: registry.aliyuncs.com/alluxio/alluxio:release-2.7.2-SNAPSHOT-3714f2b
fuse:
@@ -44,36 +44,36 @@ runtime:
fuse:
image: registry.cn-shanghai.aliyuncs.com/jindofs/jindo-fuse:3.8.0
controller:
-image: fluidcloudnative/jindoruntime-controller:v0.8.0-79d0b30
+image: fluidcloudnative/jindoruntime-controller:v0.8.0-2073b38
init:
portCheck:
enabled: false
-image: fluidcloudnative/init-users:v0.8.0-79d0b30
+image: fluidcloudnative/init-users:v0.8.0-2073b38
goosefs:
runtimeWorkers: 3
portRange: 26000-32000
enabled: false
init:
-image: fluidcloudnative/init-users:v0.8.0-79d0b30
+image: fluidcloudnative/init-users:v0.8.0-2073b38
controller:
-image: fluidcloudnative/goosefsruntime-controller:v0.8.0-79d0b30
+image: fluidcloudnative/goosefsruntime-controller:v0.8.0-2073b38
runtime:
image: ccr.ccs.tencentyun.com/qcloud/goosefs:v1.2.0
fuse:
image: ccr.ccs.tencentyun.com/qcloud/goosefs-fuse:v1.2.0
juicefs:
enabled: false
controller:
-image: fluidcloudnative/juicefsruntime-controller:v0.8.0-79d0b30
+image: fluidcloudnative/juicefsruntime-controller:v0.8.0-2073b38
fuse:
image: registry.cn-hangzhou.aliyuncs.com/juicefs/juicefs-fuse:v1.0.0-beta2

webhook:
enabled: true
-image: fluidcloudnative/fluid-webhook:v0.8.0-79d0b30
+image: fluidcloudnative/fluid-webhook:v0.8.0-2073b38
replicas: 1

fluidapp:
enabled: true
controller:
-image: fluidcloudnative/application-controller:v0.8.0-79d0b30
+image: fluidcloudnative/application-controller:v0.8.0-2073b38
2 changes: 2 additions & 0 deletions docs/en/TOC.md
@@ -27,6 +27,8 @@
- [Machine Learning](samples/machinelearning.md)
+ Advanced
- [Alluxio Tieredstore Configuration](samples/tieredstore_config.md)
+ Serverless
- [How to ensure the completion of serverless tasks](samples/application_controller.md)
- [How to enable FUSE auto-recovery](samples/fuse_recover.md)
+ Troubleshooting
- [Collecting logs](userguide/troubleshooting.md)
108 changes: 108 additions & 0 deletions docs/en/samples/application_controller.md
@@ -0,0 +1,108 @@
# Demo - How to ensure the completion of Fluid's serverless tasks

## Background

In serverless scenarios, for workloads such as Jobs, the FUSE sidecar must exit once the Pod's user container has
finished its task and exited, so that the Job controller can correctly determine that the Pod has completed.
However, the FUSE container has no exit mechanism of its own. The Fluid Application Controller therefore watches the
Pods in the cluster that carry the Fluid label and, once the user container has exited, makes the FUSE container exit
normally so that the Job reaches the completed state.
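
The exit condition described in this section can be sketched in a few lines of Python. This is an illustrative model only, not Fluid's actual controller code: the function name and the `FUSE_CONTAINER_PREFIX` value are assumptions made for the example, and the input is a plain list of dicts shaped like the `status.containerStatuses` field of a Kubernetes Pod.

```python
# Illustrative model of the exit condition; NOT Fluid's real controller code.
# ASSUMPTION: the FUSE sidecar can be recognized by a container-name prefix.
FUSE_CONTAINER_PREFIX = "fluid-fuse"


def should_stop_fuse_sidecar(container_statuses):
    """Return True when all user containers have terminated but a FUSE sidecar still runs."""
    user = [c for c in container_statuses
            if not c["name"].startswith(FUSE_CONTAINER_PREFIX)]
    fuse = [c for c in container_statuses
            if c["name"].startswith(FUSE_CONTAINER_PREFIX)]
    # Every user container must be in the "terminated" state.
    users_done = bool(user) and all("terminated" in c["state"] for c in user)
    # At least one sidecar is still in the "running" state.
    fuse_running = any("running" in c["state"] for c in fuse)
    return users_done and fuse_running


statuses = [
    {"name": "demo", "state": {"terminated": {"exitCode": 0}}},
    {"name": "fluid-fuse-0", "state": {"running": {}}},
]
print(should_stop_fuse_sidecar(statuses))
```

The predicate is deliberately conservative: it reports nothing to do while any user container is still running or waiting, and also once the sidecar itself has already terminated.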

## Installation

You can download the latest Fluid installation package
from [Fluid Releases](https://github.com/fluid-cloudnative/fluid/releases).
Refer to the [Installation Documentation](../userguide/install.md) to complete the installation, then check that the
Fluid components are running normally (JuiceFSRuntime is used as the example here):

```shell
$ kubectl -n fluid-system get po
NAME READY STATUS RESTARTS AGE
dataset-controller-86768b56fb-4pdts 1/1 Running 0 36s
fluid-webhook-f77465869-zh8rv 1/1 Running 0 62s
fluidapp-controller-597dbd77dd-jgsbp 1/1 Running 0 81s
juicefsruntime-controller-65d54bb48f-vnzpj 1/1 Running 0 99s
```

Typically, you will see a Pod named `dataset-controller`, a Pod named `juicefsruntime-controller`, a Pod
named `fluid-webhook` and a Pod named `fluidapp-controller`.

## Demo

**Enable webhook for namespace**

The Fluid webhook injects FUSE sidecars into pods in serverless scenarios. To enable it, set the label `fluid.io/enable-injection=true` on the target namespace:

```shell
$ kubectl patch ns default -p '{"metadata": {"labels": {"fluid.io/enable-injection": "true"}}}'
namespace/default patched
$ kubectl get ns default --show-labels
NAME STATUS AGE LABELS
default Active 4d12h fluid.io/enable-injection=true,kubernetes.io/metadata.name=default
```

**Create dataset and runtime**

Create a Runtime resource of the type you need and a Dataset with the same name. JuiceFSRuntime is used as the example here; for details, refer to the [JuiceFSRuntime documentation](juicefs_runtime.md). Once both are created, they should look like this:

```shell
$ kubectl get juicefsruntime
NAME WORKER PHASE FUSE PHASE AGE
jfsdemo Ready Ready 2m58s
$ kubectl get dataset
NAME UFS TOTAL SIZE CACHED CACHE CAPACITY CACHED PERCENTAGE PHASE AGE
jfsdemo [Calculating] N/A N/A Bound 2m55s
```

**Create Job**

To use Fluid in a serverless scenario, add the `serverless.fluid.io/inject: "true"` label to the application pod, as follows:

```yaml
$ cat<<EOF >sample.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: demo-app
spec:
  template:
    metadata:
      labels:
        serverless.fluid.io/inject: "true"
    spec:
      containers:
        - name: demo
          image: busybox
          args:
            - -c
            - echo $(date -u) >> /data/out.txt
          command:
            - /bin/sh
          volumeMounts:
            - mountPath: /data
              name: demo
      restartPolicy: Never
      volumes:
        - name: demo
          persistentVolumeClaim:
            claimName: jfsdemo
  backoffLimit: 4
EOF
$ kubectl create -f sample.yaml
job.batch/demo-app created
```
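
The two labels used in this walkthrough can be summarized as a small predicate. The sketch below is a hedged illustration, not the webhook's real logic: it assumes that sidecar injection requires both the namespace label and the pod label, and the function name `injection_requested` is hypothetical. Only the label keys and values come from the steps above.

```python
# Label keys and values are taken from the walkthrough.
NAMESPACE_LABEL = ("fluid.io/enable-injection", "true")
POD_LABEL = ("serverless.fluid.io/inject", "true")


def injection_requested(namespace_labels, pod_labels):
    """ASSUMPTION: injection needs both the namespace label and the pod label."""
    ns_key, ns_value = NAMESPACE_LABEL
    pod_key, pod_value = POD_LABEL
    return (namespace_labels.get(ns_key) == ns_value
            and pod_labels.get(pod_key) == pod_value)


# The default namespace and the demo-app pod template from this demo.
ns = {"fluid.io/enable-injection": "true", "kubernetes.io/metadata.name": "default"}
pod = {"serverless.fluid.io/inject": "true"}
print(injection_requested(ns, pod))
```

If a Job's pod starts without the sidecar, checking these two label sets is a reasonable first troubleshooting step.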

**Check whether the Job has completed**

```shell
$ kubectl get job
NAME COMPLETIONS DURATION AGE
demo-app 1/1 14s 46s
$ kubectl get po
NAME READY STATUS RESTARTS AGE
demo-app-wdfr8 0/2 Completed 0 25s
jfsdemo-worker-0 1/1 Running 0 14m
```

The Job has completed. Its Pod has two containers (the user container and the injected FUSE sidecar), and both have exited, which is why the Pod shows `0/2` ready with status `Completed`.
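
The same check can be scripted. The following sketch parses one row of `kubectl get po` output and reports whether the pod has finished with no container still ready. The helper names are hypothetical, and the parsing assumes the default column order shown above (NAME, READY, STATUS first).

```python
def parse_pod_line(line):
    """Split one row of `kubectl get po` output into the fields used here."""
    # Only the first three columns are needed: NAME, READY, STATUS.
    name, ready, status = line.split()[:3]
    ready_count, total = (int(n) for n in ready.split("/"))
    return {"name": name, "ready": ready_count, "containers": total, "status": status}


def job_pod_finished(line):
    """True when the pod is Completed and none of its containers is still ready."""
    pod = parse_pod_line(line)
    return pod["status"] == "Completed" and pod["ready"] == 0


print(job_pod_finished("demo-app-wdfr8 0/2 Completed 0 25s"))
print(job_pod_finished("jfsdemo-worker-0 1/1 Running 0 14m"))
```

For the two rows from the output above, the first call prints `True` and the second prints `False`.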
4 changes: 3 additions & 1 deletion docs/zh/TOC.md
@@ -29,8 +29,10 @@
+ Advanced Usage
- [AlluxioRuntime Tiered Storage Configuration](samples/tieredstore_config.md)
- [Optimizing Pod Scheduling with the Webhook Mechanism](operation/pod_schedule_global.md)
- [How to Run in a Knative Environment](samples/knative.md)
- [How to Enable FUSE Auto-Recovery](samples/fuse_recover.md)
+ Serverless Scenarios
- [How to Run in a Knative Environment](samples/knative.md)
- [How to Ensure the Completion of Serverless Tasks](samples/application_controller.md)
+ Workloads
- [Machine Learning](samples/machinelearning.md)
+ More Runtime Implementations
102 changes: 102 additions & 0 deletions docs/zh/samples/application_controller.md
@@ -0,0 +1,102 @@
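# Demo - How to ensure the completion of Fluid's serverless tasks

## Background

In serverless scenarios, for workloads such as Jobs, the FUSE sidecar must exit once the Pod's user container has finished its task and exited,
so that the Job controller can correctly determine that the Pod has completed. However, the FUSE container has no exit mechanism of its own. The Fluid Application Controller therefore watches the Pods in the cluster that carry the Fluid label
and, once the user container has exited, makes the FUSE container exit normally so that the Job reaches the completed state.

## Installation

You can download the latest Fluid installation package from [Fluid Releases](https://github.com/fluid-cloudnative/fluid/releases).
Then refer to the [Installation Documentation](../userguide/install.md) to complete the installation, and check that the Fluid components are running normally (JuiceFSRuntime is used as the example here):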

```shell
$ kubectl -n fluid-system get po
NAME READY STATUS RESTARTS AGE
dataset-controller-86768b56fb-4pdts 1/1 Running 0 36s
fluid-webhook-f77465869-zh8rv 1/1 Running 0 62s
fluidapp-controller-597dbd77dd-jgsbp 1/1 Running 0 81s
juicefsruntime-controller-65d54bb48f-vnzpj 1/1 Running 0 99s
```

Typically, you will see a Pod named `dataset-controller`, a Pod named `juicefsruntime-controller`, a Pod named `fluid-webhook` and a Pod named `fluidapp-controller`.

## Demo

**Enable webhook for namespace**

The Fluid webhook injects FUSE sidecars into pods in serverless scenarios. To enable it, set the label `fluid.io/enable-injection=true` on the target namespace:

```shell
$ kubectl patch ns default -p '{"metadata": {"labels": {"fluid.io/enable-injection": "true"}}}'
namespace/default patched
$ kubectl get ns default --show-labels
NAME STATUS AGE LABELS
default Active 4d12h fluid.io/enable-injection=true,kubernetes.io/metadata.name=default
```

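**Create dataset and runtime**

Create a Runtime resource of the type you need and a Dataset with the same name. JuiceFSRuntime is used as the example here; for details, refer to the [JuiceFSRuntime documentation](juicefs_runtime.md). Once both are created, they should look like this: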

```shell
$ kubectl get juicefsruntime
NAME WORKER PHASE FUSE PHASE AGE
jfsdemo Ready Ready 2m58s
$ kubectl get dataset
NAME UFS TOTAL SIZE CACHED CACHE CAPACITY CACHED PERCENTAGE PHASE AGE
jfsdemo [Calculating] N/A N/A Bound 2m55s
```

**Create Job**

To use Fluid in a serverless scenario, add the `serverless.fluid.io/inject: "true"` label to the application pod, as follows:

```yaml
$ cat<<EOF >sample.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: demo-app
spec:
  template:
    metadata:
      labels:
        serverless.fluid.io/inject: "true"
    spec:
      containers:
        - name: demo
          image: busybox
          args:
            - -c
            - echo $(date -u) >> /data/out.txt
          command:
            - /bin/sh
          volumeMounts:
            - mountPath: /data
              name: demo
      restartPolicy: Never
      volumes:
        - name: demo
          persistentVolumeClaim:
            claimName: jfsdemo
  backoffLimit: 4
EOF
$ kubectl create -f sample.yaml
job.batch/demo-app created
```

**Check whether the Job has completed**

```shell
$ kubectl get job
NAME COMPLETIONS DURATION AGE
demo-app 1/1 14s 46s
$ kubectl get po
NAME READY STATUS RESTARTS AGE
demo-app-wdfr8 0/2 Completed 0 25s
jfsdemo-worker-0 1/1 Running 0 14m
```

As shown, the Job has completed; its Pod has two containers, and both have completed.
