FileManager #2115

Merged
merged 24 commits into from
May 16, 2017
Changes from 16 commits
71 changes: 71 additions & 0 deletions doc/design/file_manager/README.md
@@ -0,0 +1,71 @@
# FileManager Design Document
Contributor:
Should we keep only one of the two names, FileManager or FileServer?

Contributor Author:
Done

## Goals
This document describes the design of FileManager, a system that lets users manage the files they store on PaddlePaddle Cloud.
Contributor:
To be precise, the goal is to let users upload their own training data for distributed training.

Contributor Author:
Done

The main features are:

- Common command-line commands for managing files and directories
- The supported commands are listed [Here](./pfs/pfs.md)
Collaborator:
Here => 这里 (use the Chinese word for "here" as the link text)

Contributor Author:
A Chinese anchor text causes an error; this is a bug in Sphinx.

Contributor Author:
Done

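- Resumable upload and download of large files

## Terminology
- PFS: short for Paddlepaddle cloud File System, an abstraction of the user's file storage space, as opposed to the Local File System. It is currently built on CephFS.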
Collaborator:
Paddlepaddle => PaddlePaddle

Contributor Author:
I mainly wanted to show where the abbreviation PFS comes from.
Paddlepaddle cloud File System: the capital letters here happen to spell PFS.

Collaborator:
Local File System => local filesystem

Contributor Author:
Done

- [CephFS](http://docs.ceph.com/docs/master/cephfs/): a POSIX-compatible file system.
- Chunk: the logical unit into which a file is divided.
- [Ingress](https://kubernetes.io/docs/concepts/services-networking/ingress/): provides a layer-7 reverse proxy and load balancing based on sticky sessions.
- CA: certificate authority<sup>[tls](#tls)</sup>
Collaborator:
Just use the full names for CA, CRT, and Key; defining them as terms is more trouble.

Contributor Author:
Done

- CRT: CA signed certificate<sup>[tls](#tls)</sup>
- Key: the user's private key<sup>[tls](#tls)</sup>

## Modules

### Architecture Diagram
<img src="./src/filemanager.png" width="900">

Contributor:
Would it be clearer to add a description of the user workflow below this diagram?

### PFSClient
- Features: see [Here](./pfs/pfs.md) for details
Collaborator:
Here => 这里 (use the Chinese word for "here" as the link text)

Contributor Author:
Same as above

Contributor Author:
Done

- Provides commands for users to manage files
- Written in Golang, so it can run across platforms
Collaborator:
Golang => Go

Contributor Author:
Done


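- Mutual authentication
PFSClient and Ingress must authenticate each other<sup>[tls](#tls)</sup>, so a user must first register at `cloud.paddlepaddle.org`, apply for user space, and download the system-generated Key, CRT, and CA to the local machine before using PFSClient.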
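
A minimal sketch of how a Go client might load the downloaded Key, CRT, and CA for mutual authentication, assuming all three are PEM encoded. The function name `newMutualTLSConfig` and the file names in `main` are hypothetical; the design does not specify them.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"os"
)

// newMutualTLSConfig builds a TLS configuration from PEM-encoded material:
// the client certificate (CRT), the user's private key (Key), and the CA
// certificate used to verify the server. The name is hypothetical.
func newMutualTLSConfig(crtPEM, keyPEM, caPEM []byte) (*tls.Config, error) {
	cert, err := tls.X509KeyPair(crtPEM, keyPEM)
	if err != nil {
		return nil, fmt.Errorf("load client certificate: %w", err)
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no CA certificate found in PEM data")
	}
	return &tls.Config{
		Certificates: []tls.Certificate{cert}, // presented to the server
		RootCAs:      pool,                    // used to verify the server
	}, nil
}

func main() {
	// Hypothetical file names for the CRT, Key, and CA downloaded after registration.
	var pems [3][]byte
	for i, name := range []string{"user.crt", "user.key", "ca.crt"} {
		data, err := os.ReadFile(name)
		if err != nil {
			fmt.Println("cannot read credentials:", err)
			return
		}
		pems[i] = data
	}
	cfg, err := newMutualTLSConfig(pems[0], pems[1], pems[2])
	if err != nil {
		fmt.Println("cannot build TLS config:", err)
		return
	}
	fmt.Println("client certificates loaded:", len(cfg.Certificates))
}
```

The resulting configuration could then be placed in the `TLSClientConfig` field of an `http.Transport`, so that every request from the client presents the user's certificate.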

### Ingress
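- Features:
Provides a layer-7 reverse proxy and load balancing based on sticky sessions.

- Passing the user identity through
Ingress needs to forward PFSClient's identity header to FileServer; for the configuration, see [Here](http://www.integralist.co.uk/posts/clientcertauth.html#3)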
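
A minimal sketch of how FileServer might read the identity that Ingress forwards. It assumes the header carries the certificate subject as a comma-separated string such as `CN=alice,O=example`. The header name `X-Client-Subject-DN` and the helper names are hypothetical; the real ones depend on how Ingress is configured.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// identityHeader is a hypothetical header name chosen for this sketch.
const identityHeader = "X-Client-Subject-DN"

// commonName extracts the CN attribute from a comma-separated subject such as
// "CN=alice,O=example". It does not handle escaped commas.
func commonName(subject string) (string, bool) {
	for _, part := range strings.Split(subject, ",") {
		kv := strings.SplitN(strings.TrimSpace(part), "=", 2)
		if len(kv) == 2 && strings.EqualFold(kv[0], "CN") && kv[1] != "" {
			return kv[1], true
		}
	}
	return "", false
}

// whoami rejects requests that arrive without a usable identity header.
func whoami(w http.ResponseWriter, r *http.Request) {
	user, ok := commonName(r.Header.Get(identityHeader))
	if !ok {
		http.Error(w, "missing client identity", http.StatusUnauthorized)
		return
	}
	fmt.Fprintf(w, "hello %s", user)
}

func main() {
	// Simulate a request as it would look after passing through Ingress.
	req := httptest.NewRequest(http.MethodGet, "/file", nil)
	req.Header.Set(identityHeader, "CN=alice,O=example")
	rec := httptest.NewRecorder()
	whoami(rec, req)
	fmt.Println(rec.Code, rec.Body.String())
}
```

A server that trusts such a header should only be reachable through Ingress, since any direct client could otherwise set the header itself.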


### FileServer
FileServer is an HTTPServer written with GoRPC. It exposes a [RESTful API](./RESTAPI.md), handles file management requests from PFSClient, and returns the results to PFSClient.
Collaborator:
GoRPC => Go RPC

Contributor Author:
Done

Collaborator:
HTTPServer => HTTP server

Contributor Author:
Done

Collaborator:
A server written with the rpc package from the Go standard library is called an RPC server. It is also an HTTP server, but only because of a detail users do not need to know: Go RPC uses HTTP as its transport.

As far as I remember, Go RPC does not support exposing a RESTful API. How can this Go RPC server also be a RESTful API server?

Does FileServer here refer to the one already provided by the Go standard library? https://golang.org/pkg/net/http/#FileServer

Contributor Author (@gongweibao, May 12, 2017):
No, that FileServer is too simple.
https://golang.org/pkg/net/http/#FileServer

What is designed here is a RESTful API interface, so that it is compatible with H5. I mixed up the concepts and did not express this clearly.


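## File Transfer Optimization

### Chunked File Transfer
User files can be fairly large, so uploading them to the Cloud or downloading them to a local machine can take a long time, and the network may become unstable during the transfer. To handle these problems we introduce the concept of a Chunk, which consists of its offset within the file, the data, the data length, and a checksum. File content is uploaded and downloaded through operations on Chunks. Because a Chunk is small (256K by default), transferring one takes little time and is less likely to fail. After transferring the last Chunk, PFSClient checks whether the MD5 of the destination file matches that of the source file.

A typical Chunk looks like this: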

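```
type Chunk struct {
	fileOffset int64  // offset of this chunk within the file
	checksum   uint32 // checksum of data
	len        uint32 // number of bytes in data
	data       []byte // chunk payload
}
```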
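
The following sketch illustrates the idea: a byte slice standing in for a file is cut into chunks, reassembled by offset, and then verified with MD5. It is a sketch under assumptions, not the actual implementation. CRC-32 (IEEE) is assumed as the checksum because the design only says "checksum", the `len` field is spelled `length` here for readability, and `splitChunks` and `assemble` are hypothetical names.

```go
package main

import (
	"bytes"
	"crypto/md5"
	"fmt"
	"hash/crc32"
)

// chunk mirrors the Chunk struct above.
type chunk struct {
	fileOffset int64
	checksum   uint32
	length     uint32
	data       []byte
}

// splitChunks cuts data into chunks of at most chunkSize bytes.
// chunkSize must be positive.
func splitChunks(data []byte, chunkSize int) []chunk {
	var chunks []chunk
	for off := 0; off < len(data); off += chunkSize {
		end := off + chunkSize
		if end > len(data) {
			end = len(data)
		}
		part := data[off:end]
		chunks = append(chunks, chunk{
			fileOffset: int64(off),
			checksum:   crc32.ChecksumIEEE(part), // assumed checksum algorithm
			length:     uint32(len(part)),
			data:       part,
		})
	}
	return chunks
}

// assemble writes every chunk at its offset into a buffer of the given size.
func assemble(chunks []chunk, size int) []byte {
	out := make([]byte, size)
	for _, c := range chunks {
		copy(out[c.fileOffset:], c.data)
	}
	return out
}

func main() {
	src := bytes.Repeat([]byte("paddle"), 100000) // 600000 bytes
	chunks := splitChunks(src, 256*1024)
	dst := assemble(chunks, len(src))
	// Final check, as done after the last chunk: compare MD5 of both sides.
	fmt.Println(len(chunks), md5.Sum(src) == md5.Sum(dst))
}
```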

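### Generating a Sparse File
When the destination file does not exist or its size differs from that of the source file, [Fallocate](https://golang.org/pkg/syscall/#Fallocate) can be used to generate a sparse file, after which multiple Chunks can be written concurrently.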
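
A minimal sketch of the approach: size the destination file first, then write Chunks at their offsets from separate goroutines. Note that it uses `os.File.Truncate` in place of `Fallocate`, because `syscall.Fallocate` is only available on Linux; whether the resulting file is actually sparse depends on the file system. The names `writeChunksConcurrently` and `demo` are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

// writeChunksConcurrently sizes the file at path to size bytes and then writes
// each piece at its offset (the map key) from its own goroutine.
func writeChunksConcurrently(path string, size int64, pieces map[int64][]byte) error {
	f, err := os.OpenFile(path, os.O_RDWR|os.O_CREATE, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()
	// Truncate stands in for Fallocate in this sketch.
	if err := f.Truncate(size); err != nil {
		return err
	}
	var wg sync.WaitGroup
	errs := make(chan error, len(pieces))
	for off, data := range pieces {
		wg.Add(1)
		go func(off int64, data []byte) {
			defer wg.Done()
			// Parallel WriteAt calls are fine as long as ranges do not overlap.
			if _, err := f.WriteAt(data, off); err != nil {
				errs <- err
			}
		}(off, data)
	}
	wg.Wait()
	close(errs)
	return <-errs // nil when no write failed
}

// demo writes two pieces into a temporary file and returns the file content.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "pfs-sparse")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "dest.bin")
	pieces := map[int64][]byte{0: []byte("hello "), 6: []byte("world")}
	if err := writeChunksConcurrently(path, 11, pieces); err != nil {
		return "", err
	}
	got, err := os.ReadFile(path)
	return string(got), err
}

func main() {
	fmt.Println(demo())
}
```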

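### Overwriting the Inconsistent Parts
The key to file transfer is that PFSClient compares the checksums of the source and destination Chunks. For Chunks whose checksums differ, PFSClient downloads or uploads the Chunk again. This way, parts that have already been transferred successfully do not need to be transferred again.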
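
The comparison can be sketched as follows, assuming fixed-size Chunks and CRC-32 (IEEE) checksums; neither the checksum algorithm nor the function names `checksums` and `staleOffsets` come from the design.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// checksums returns one CRC-32 per chunkSize-sized piece of data.
// chunkSize must be positive.
func checksums(data []byte, chunkSize int) []uint32 {
	var sums []uint32
	for off := 0; off < len(data); off += chunkSize {
		end := off + chunkSize
		if end > len(data) {
			end = len(data)
		}
		sums = append(sums, crc32.ChecksumIEEE(data[off:end]))
	}
	return sums
}

// staleOffsets returns the file offsets of the chunks that must be transferred
// again: those whose checksum differs and those missing from the destination.
func staleOffsets(src, dst []uint32, chunkSize int) []int64 {
	var offsets []int64
	for i, sum := range src {
		if i >= len(dst) || dst[i] != sum {
			offsets = append(offsets, int64(i)*int64(chunkSize))
		}
	}
	return offsets
}

func main() {
	src := []byte("aaaabbbbccccdd")
	dst := []byte("aaaaXXXXcccc") // second chunk differs, last chunk is missing
	fmt.Println(staleOffsets(checksums(src, 4), checksums(dst, 4), 4)) // [4 12]
}
```

Only the Chunks at the returned offsets need to be sent; the first and third Chunks in the example are left untouched.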

## Framework Generation
Use [swagger-api](https://github.com/swagger-api/swagger-codegen) to generate the framework code for Client and FileServer, so that we can spend more effort on the logic itself.

## References
- <a name=tls></a>[TLS complete guide](https://github.com/k8sp/tls/blob/master/tls.md)
- [aws.s3](http://docs.aws.amazon.com/cli/latest/reference/s3/)
- [linux man document](https://linux.die.net/man/)
23 changes: 23 additions & 0 deletions doc/design/file_manager/RESTAPI.md
@@ -0,0 +1,23 @@
# REST API Interface
- file

```
GET /file: Get attributes of files
POST /file: Create a file
DELETE /file: Delete a file
```

- chunk

```
GET /file/chunk: Get a chunk info
POST /file/chunk: Update a chunk
```

- dir

```
GET /dir: List all files in a directory
POST /dir: Create a directory
DELETE /dir: Delete a directory
```
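
A minimal sketch of how these routes could be registered with Go's `net/http`, using placeholder handlers. The helper names (`methods`, `stub`, `newMux`, `statusFor`) are hypothetical, and the real server skeleton is meant to be generated rather than written by hand.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// methods dispatches on the HTTP method and answers 405 for anything else.
func methods(handlers map[string]http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if h, ok := handlers[r.Method]; ok {
			h(w, r)
			return
		}
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
	}
}

// stub returns a placeholder handler that only reports what it would do.
func stub(action string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, action)
	}
}

// newMux registers the routes listed above.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/file", methods(map[string]http.HandlerFunc{
		http.MethodGet:    stub("get file attributes"),
		http.MethodPost:   stub("create file"),
		http.MethodDelete: stub("delete file"),
	}))
	mux.HandleFunc("/file/chunk", methods(map[string]http.HandlerFunc{
		http.MethodGet:  stub("get chunk info"),
		http.MethodPost: stub("update chunk"),
	}))
	mux.HandleFunc("/dir", methods(map[string]http.HandlerFunc{
		http.MethodGet:    stub("list directory"),
		http.MethodPost:   stub("create directory"),
		http.MethodDelete: stub("delete directory"),
	}))
	return mux
}

// statusFor sends one request through the mux and returns the status code.
func statusFor(method, path string) int {
	rec := httptest.NewRecorder()
	newMux().ServeHTTP(rec, httptest.NewRequest(method, path, nil))
	return rec.Code
}

func main() {
	fmt.Println(statusFor("GET", "/file/chunk"), statusFor("DELETE", "/file/chunk"))
}
```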
45 changes: 45 additions & 0 deletions doc/design/file_manager/pfs/cp.md
@@ -0,0 +1,45 @@
# Name
cp - copy files

# Synopsis
```
cp [-r] [-f | -n] [-v] [--preserve--links] <LocalPath> <PFSPath>
cp [-r] [-f | -n] [-v] [--preserve--links] <LocalPath> ... <PFSPath>
cp [-r] [-f | -n] [-v] [--preserve--links] <PFSPath> <LocalPath>
cp [-r] [-f | -n] [-v] [--preserve--links] <PFSPath> ... <LocalPath>
cp [-r] [-f | -n] [-v] [--preserve--links] <PFSPath> <PFSPath>
cp [-r] [-f | -n] [-v] [--preserve--links] <PFSPath> ... <PFSPath>
```

# Description
```
The following options are available:

-r
Copy directories recursively

-f
Do not prompt for confirmation before overwriting the destination path. (The -f option overrides previous -n options.)

-n
Do not overwrite an existing file. (The -n option overrides previous -f options.)

-v
Cause cp to be verbose, showing files after they are copied.

--preserve--links
Preserve links when copying links
```
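
The rule that a later `-f` or `-n` overrides an earlier one can be sketched as a single pass over the arguments. This is an illustration only: the names below are hypothetical, and prompting as the default when neither flag is given is inferred from the description of `-f`.

```go
package main

import "fmt"

type overwriteMode int

const (
	prompt    overwriteMode = iota // neither flag: ask before overwriting (assumed default)
	force                          // -f: overwrite without asking
	noClobber                      // -n: never overwrite an existing file
)

// resolveOverwrite scans the arguments from left to right so that the last
// -f or -n seen decides the mode.
func resolveOverwrite(args []string) overwriteMode {
	mode := prompt
	for _, a := range args {
		switch a {
		case "-f":
			mode = force
		case "-n":
			mode = noClobber
		}
	}
	return mode
}

func main() {
	// -f comes after -n, so it wins.
	fmt.Println(resolveOverwrite([]string{"-r", "-n", "-f"}) == force)
}
```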

# Examples
- The following command copies a single file to pfs

```
paddle pfs cp ./text1.txt /pfs/$DATACENTER/home/$USER/text1.txt
```

- The following command copies a pfs file to a local file

```
paddle pfs cp /pfs/$DATACENTER/home/$USER/text1.txt ./text1.txt
```
27 changes: 27 additions & 0 deletions doc/design/file_manager/pfs/ls.md
@@ -0,0 +1,27 @@
# Name
ls - list directory contents or file attributes

# Synopsis
`ls [-r] <PFSPath> ...`

# Description

```
The following options are available:

-r
List directory(ies) recursively
```

# Examples
- The following command lists a single file

```
paddle pfs ls /pfs/$DATACENTER/home/$USER/text1.txt
```

- The following command lists directory contents

```
paddle pfs ls /pfs/$DATACENTER/home/$USER/folder
```
13 changes: 13 additions & 0 deletions doc/design/file_manager/pfs/mkdir.md
@@ -0,0 +1,13 @@
# Name
mkdir - make directory(ies)

# Synopsis
`mkdir <PFSPath> ...`

# Description
Create the pfs directories if they do not already exist, creating intermediate directories as required.

# Examples
```
paddle pfs mkdir /pfs/$DATACENTER/home/$USER/folder
```
34 changes: 34 additions & 0 deletions doc/design/file_manager/pfs/mv.md
@@ -0,0 +1,34 @@
# Name
mv - move (rename) files


# Synopsis
```
mv [-f | -n] [-v] <LocalPath> <PFSPath>
mv [-f | -n] [-v] <LocalPath> ... <PFSPath>
mv [-f | -n] [-v] <PFSPath> <LocalPath>
mv [-f | -n] [-v] <PFSPath> ... <LocalPath>
mv [-f | -n] [-v] <PFSPath> <PFSPath>
mv [-f | -n] [-v] <PFSPath> ... <PFSPath>
```

# Description
```
The following options are available:

-f
Do not prompt for confirmation before overwriting the destination path. (The -f option overrides previous -n options.)

-n
Do not overwrite an existing file. (The -n option overrides previous -f options.)

-v
Cause mv to be verbose, showing files after they are moved.
```

# Examples
- The following command moves a single file to pfs

```
paddle pfs mv ./text1.txt /pfs/$DATACENTER/home/$USER/text1.txt
```
42 changes: 42 additions & 0 deletions doc/design/file_manager/pfs/pfs.md
@@ -0,0 +1,42 @@
# PFS Client

## Description
The `pfs` command is a Command Line Interface to manage your files on PaddlePaddle Cloud

## Synopsis
```
paddle [options] pfs <subcommand> [parameters]
```

## Options
```
--profile (string)
Use a specific profile from your credential file.

--help (string)
Display more information about command

--version
Output version information and exit

--debug
Show detailed debugging log

--only-show-errors (boolean)
Only errors and warnings are displayed. All other output is suppressed.
```

## Path Arguments
When using a command, we need to specify path arguments. There are two path argument types: `localpath` and `pfspath`.
A `pfspath` begins with `/pfs`, e.g. `/pfs/$DATACENTER/home/$USER/folder`.
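
As a small illustration, a client could tell the two path types apart as follows. The function name `isPFSPath` is hypothetical, and treating a path such as `/pfsdata` as a local path is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// isPFSPath reports whether p is a pfspath, that is, "/pfs" itself or a path
// below "/pfs/". Everything else is treated as a localpath.
func isPFSPath(p string) bool {
	return p == "/pfs" || strings.HasPrefix(p, "/pfs/")
}

func main() {
	for _, p := range []string{"/pfs/dc1/home/alice/folder", "./text1.txt", "/pfsdata"} {
		fmt.Println(p, isPFSPath(p))
	}
}
```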

## Order of Path Arguments
Commonly, if there are two path arguments, the first is the source, and the second is the destination.

## Subcommands
- [rm](rm.md)
- [mv](mv.md)
- [cp](cp.md)
- [ls](ls.md)
- [mkdir](mkdir.md)
- [sync](sync.md)
32 changes: 32 additions & 0 deletions doc/design/file_manager/pfs/rm.md
@@ -0,0 +1,32 @@
# Name
rm - remove files or directories

# Synopsis
```
rm [-r] [-v] <PFSPath> ...
```

# Description

```
The following options are available:

-r
Remove directories and their contents recursively

-v
Cause rm to be verbose, showing files after they are removed.
```

# Examples
- The following command deletes a single file:

```
paddle pfs rm /pfs/$DATACENTER/home/$USER/test1.txt
```

- The following command deletes a directory recursively:

```
paddle pfs rm -r /pfs/$DATACENTER/home/$USER/folder
```
34 changes: 34 additions & 0 deletions doc/design/file_manager/pfs/sync.md
@@ -0,0 +1,34 @@
# Name
sync - sync directories. Recursively copies new and updated files from the source directory to the destination.

# Synopsis
```
sync [--preserve--links] [-v] <LocalPath> <PFSPath>
sync [--preserve--links] [-v] <PFSPath> <LocalPath>
sync [--preserve--links] [-v] <PFSPath> <PFSPath>
```

# Description

```
The following options are available:

--preserve--links
Preserve links when copying links.

-v
Cause sync to be verbose, showing files after their synchronization is complete.
```
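
One way to decide which files are "new and updated" is sketched below. The criterion used here (missing from the destination, different size, or newer modification time in the source) is an assumption, since the document does not define it, and `fileMeta` and `filesToCopy` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// fileMeta is a hypothetical summary of one file, keyed elsewhere by its
// path relative to the directory being synchronized.
type fileMeta struct {
	size    int64
	modTime time.Time
}

// filesToCopy returns, in sorted order, the relative paths that are new or
// updated in src compared with dst.
func filesToCopy(src, dst map[string]fileMeta) []string {
	var out []string
	for path, s := range src {
		d, ok := dst[path]
		if !ok || s.size != d.size || s.modTime.After(d.modTime) {
			out = append(out, path)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	now := time.Now()
	src := map[string]fileMeta{
		"a.txt": {size: 10, modTime: now},
		"b.txt": {size: 20, modTime: now},
		"c.txt": {size: 30, modTime: now},
	}
	dst := map[string]fileMeta{
		"a.txt": {size: 10, modTime: now},                 // unchanged
		"b.txt": {size: 20, modTime: now.Add(-time.Hour)}, // older copy
	}
	fmt.Println(filesToCopy(src, dst)) // [b.txt c.txt]
}
```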

# Examples
- The following command syncs a local directory to pfs.

```
paddle pfs sync ./dir1 /pfs/$DATACENTER/home/$USER/mydir1
```

- The following command syncs a pfs directory to a local directory.

```
paddle pfs sync /pfs/$DATACENTER/home/$USER/mydir1 .
```
Binary file added doc/design/file_manager/src/filemanager.graffle
Binary file not shown.
Binary file added doc/design/file_manager/src/filemanager.png