update data_warm_up's document #1513

Merged
merged 2 commits into from
Mar 11, 2022
2 changes: 1 addition & 1 deletion docs/en/samples/data_warmup.md
@@ -99,7 +99,7 @@ EOF
```

`spec.dataset` specifies the target dataset that needs to be preloaded. In this example, our target is the Dataset named `spark` under the `default` namespace.
Feel free to change the configuration above if it doesn't match your actual environment
Feel free to change the configuration above if it doesn't match your actual environment. **Note:** The namespace of your DataLoad must match the namespace of your Dataset.

**By default, it'll preload all the data in the target dataset**. If you'd like to control the data preloading behavior in a more fine-grained way (e.g., preload only the data under a specified path),
please refer to [DataLoad Advanced Configurations](#dataload-advanced-configurations).
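
To illustrate the namespace rule, here is a minimal sketch of a DataLoad whose own namespace matches the namespace of the Dataset it targets. The name `spark-dataload` and the explicit `metadata.namespace` field are illustrative assumptions; the `spark` Dataset and the `default` namespace come from the example in this document.

```yaml
apiVersion: data.fluid.io/v1alpha1
kind: DataLoad
metadata:
  name: spark-dataload     # illustrative name (assumption)
  namespace: default       # the DataLoad's own namespace
spec:
  dataset:
    name: spark
    namespace: default     # must match metadata.namespace above
```

If the two namespaces differ, the DataLoad does not satisfy the requirement stated in the note above.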
85 changes: 85 additions & 0 deletions docs/en/samples/s3_configuration.md
@@ -0,0 +1,85 @@
## DEMO - Special configuration required for Fluid to access AWS S3

If AWS S3 is selected as the underlying storage system of Alluxio, additional configuration is required for Alluxio to properly access the mounted S3 storage system.

This document shows how to declaratively apply the special configuration that Alluxio requires in Fluid. For more information, see [Configuring Alluxio on Amazon AWS S3](https://docs.alluxio.io/os/user/stable/en/ufs/S3.html).

## Prerequisites

- [Fluid](https://github.com/fluid-cloudnative/fluid)
- An S3 bucket that has already been configured, and AWS credentials with permission to access that bucket.

Please refer to the [Fluid installation documentation](https://github.com/fluid-cloudnative/fluid/blob/master/docs/zh/userguide/install.md) to complete the installation.

## Run the example

For security, Fluid recommends using a Secret to configure sensitive information such as `aws.accessKeyId` and `aws.secretKey`. For more information about using Secrets in Fluid, see [Use Secret to configure sensitive Dataset information](https://github.com/fluid-cloudnative/fluid/blob/master/docs/en/samples/use_encryptoptions.md).

**Create Dataset Resource Object**

```yaml
$ cat << EOF > dataset.yaml
apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: my-s3
spec:
  mounts:
    - mountPoint: s3://<bucket-name>/<path-to-data>/
      name: s3
      options:
        alluxio.underfs.s3.region: <s3-bucket-region>
        alluxio.underfs.s3.endpoint: <s3-endpoint>
      encryptOptions:
        - name: aws.accessKeyId
          valueFrom:
            secretKeyRef:
              name: mysecret
              key: aws.accessKeyId
        - name: aws.secretKey
          valueFrom:
            secretKeyRef:
              name: mysecret
              key: aws.secretKey
EOF
```
Note: For object storage from other cloud vendors, replace the region option with `alluxio.underfs.s3.endpoint.region=<S3_ENDPOINT_REGION>`. For details, see [Configuring Alluxio on Amazon AWS S3](https://docs.alluxio.io/os/user/stable/en/ufs/S3.html).

```
$ kubectl create -f dataset.yaml
```
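
For the other-vendor case described in the note above, the mount options might look like the following fragment. This is a sketch only: it assumes the replacement property sits in the same `options` map and takes the place of `alluxio.underfs.s3.region`. The placeholders are unchanged from the example above, and the exact property names should be checked against the Alluxio documentation for your version.

```yaml
spec:
  mounts:
    - mountPoint: s3://<bucket-name>/<path-to-data>/
      name: s3
      options:
        alluxio.underfs.s3.endpoint: <s3-endpoint>
        # Assumption: replaces alluxio.underfs.s3.region for non-AWS object storage
        alluxio.underfs.s3.endpoint.region: <S3_ENDPOINT_REGION>
```

The `encryptOptions` section is omitted here because it stays the same as in `dataset.yaml`.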

**Create Secret**

Create the Secret that holds the sensitive values referenced by the Dataset above.

```yaml
$ cat << EOF > mysecret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: mysecret
stringData:
  aws.accessKeyId: <AWS_ACCESS_KEY_ID>
  aws.secretKey: <AWS_SECRET_KEY>
EOF
```

**Create AlluxioRuntime Resource Object**

```yaml
$ cat << EOF > runtime.yaml
apiVersion: data.fluid.io/v1alpha1
kind: AlluxioRuntime
metadata:
  name: my-s3
spec:
  ...
EOF
```

```
$ kubectl create -f runtime.yaml
```

The bucket specified in `dataset.yaml` will be mounted at the `/s3` directory in Alluxio.
2 changes: 1 addition & 1 deletion docs/zh/samples/data_warmup.md
@@ -93,7 +93,7 @@ spec:
EOF
```

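`spec.dataset` specifies the target dataset to preload. In this example, the target is the Dataset named `spark` in the `default` namespace. If this configuration does not match your actual environment, adjust it accordingly.
`spec.dataset` specifies the target dataset to preload. In this example, the target is the Dataset named `spark` in the `default` namespace. If this configuration does not match your actual environment, adjust it accordingly. **Note:** The namespace of your DataLoad must match the namespace of your Dataset.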

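**By default, the DataLoad configuration above attempts to load all the data in the dataset**. If you want finer-grained control (for example, loading only the data under a specified path), see [DataLoad Advanced Configurations](#DataLoad进阶配置).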

3 changes: 2 additions & 1 deletion docs/zh/samples/s3_configuration.md
@@ -2,7 +2,7 @@

If AWS S3 is selected as the underlying storage system of Alluxio, additional configuration is required for Alluxio to properly access the mounted S3 storage system.

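This document shows how to declaratively apply the special configuration that Alluxio requires in Fluid. For more information, see [Configuring Alluxio on Amazon AWS S3](https://docs.alluxio.io/os/user/stable/cn/ufs/S3.html)
This document shows how to declaratively apply the special configuration that Alluxio requires in Fluid. For more information, see [Configuring Alluxio on Amazon AWS S3](https://docs.alluxio.io/os/user/stable/en/ufs/S3.html)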

## Prerequisites

@@ -44,6 +44,7 @@ spec:
key: aws.secretKey
EOF
```
Note: For object storage from other cloud vendors, replace the region option with `alluxio.underfs.s3.endpoint.region=<S3_ENDPOINT_REGION>`. For details, see [Configuring Alluxio on Amazon AWS S3](https://docs.alluxio.io/os/user/stable/en/ufs/S3.html).

```
$ kubectl create -f dataset.yaml