add working directory to handle the data in different directories #184

Merged: 5 commits, May 31, 2022
4 changes: 3 additions & 1 deletion README.md
@@ -136,12 +136,14 @@ clientSettings:

### Files

The following two configurations are related to the log and data files:
The following three configurations are related to the log and data files:

* `workingDir`: **Optional**. If you have multiple directories containing data with the same file structure, you can use this parameter to switch between them. For example, with the configuration below, the values of `path` and `failDataPath` are automatically resolved to `./data/student.csv` and `./data/err/student.csv`. If you change `workingDir` to `./data1`, the paths change accordingly. This parameter can be either an absolute or a relative path.
* `logPath`: **Optional**. Specifies the log path when importing data. The default path is `/tmp/nebula-importer-{timestamp}.log`.
* `files`: **Required**. An array used to configure the different data files. You can also import data from an HTTP link by specifying the link as the file path.

```yaml
workingDir: ./data/
logPath: ./err/test.log
files:
- path: ./student.csv
4 changes: 3 additions & 1 deletion README_zh-CN.md
@@ -111,12 +111,14 @@ clientSettings:

### Files

The configuration related to the log and data files involves the following two options
The configuration related to the log and data files involves the following three options

- `workingDir`: **Optional**. If you have multiple directories containing data with the same file structure, you can use this parameter to switch between them. For the configuration in the code block below, for example, the values of `path` and `failDataPath` are automatically replaced with `./data/student.csv` and `./data/err/student.csv`; if you change `workingDir` to `./data1`, both values change accordingly. This parameter can be either an absolute or a relative path.
- `logPath`: **Optional**. Specifies the file path for errors and other log messages produced during import. By default they are written to `/tmp/nebula-importer-{timestamp}.log`.
- `files`: **Required**. An array used to configure the different data files. You can also import data from an HTTP link by entering the link as the file path.

```yaml
workingDir: ./data/
logPath: ./err/test.log
files:
- path: ./student.csv
7 changes: 7 additions & 0 deletions examples/v2/data/course.csv
@@ -0,0 +1,7 @@
x101,Math,3,No5
y102,English,6,No11
"z103",Chinese,1,No1
0test,Test,2,No2
00test,Test2,4,No3
"000test",中国(  ),5,No10
"0000test",中国( ),7,No10
44 changes: 44 additions & 0 deletions examples/v2/example_with_working_dir.yaml
@@ -0,0 +1,44 @@
version: v2
description: example
removeTempFiles: false
clientSettings:
  retry: 3
  concurrency: 2 # number of graph clients
  channelBufferSize: 1
  space: importer_test_working_dir
  connection:
    user: root
    password: nebula
    address: graphd1:9669,graphd2:9669
  postStart:
    commands: |
      DROP SPACE IF EXISTS importer_test_working_dir;
      CREATE SPACE IF NOT EXISTS importer_test_working_dir(partition_num=1, replica_factor=1, vid_type=FIXED_STRING(10));
      USE importer_test_working_dir;
      CREATE TAG course(name string, credits int);
    afterPeriod: 8s
workingDir: ./data/
logPath: ./err/test.log
files:
  - path: ./course.csv
    failDataPath: ./err/course.csv
    batchSize: 2
    inOrder: true
    type: csv
    csv:
      withHeader: false
      withLabel: false
    schema:
      type: vertex
      vertex:
        tags:
          - name: course
            props:
              - name: name
                type: string
              - name: credits
                type: int
          - name: building
            props:
              - name: name
                type: string
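
As a quick check of how this example resolves (an illustration only; the absolute repository path below is hypothetical), the relative `workingDir: ./data/` is joined to the directory that contains `example_with_working_dir.yaml`, so `./course.csv` and `./err/course.csv` end up under `examples/v2/data/`, matching the `course.csv` file added above:

```go
package main

import (
	"fmt"
	"path/filepath"
)

func main() {
	// Hypothetical absolute location of the example config from this PR.
	configFile := "/repo/examples/v2/example_with_working_dir.yaml"

	base := filepath.Dir(configFile)      // /repo/examples/v2
	base = filepath.Join(base, "./data/") // relative workingDir extends the config directory

	fmt.Println(filepath.Join(base, "./course.csv"))     // /repo/examples/v2/data/course.csv
	fmt.Println(filepath.Join(base, "./err/course.csv")) // /repo/examples/v2/data/err/course.csv
}
```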
10 changes: 10 additions & 0 deletions pkg/config/config.go
@@ -107,6 +107,7 @@ type YAMLConfig struct {
    RemoveTempFiles      *bool                 `json:"removeTempFiles" yaml:"removeTempFiles"` // from v1
    NebulaClientSettings *NebulaClientSettings `json:"clientSettings" yaml:"clientSettings"`
    LogPath              *string               `json:"logPath" yaml:"logPath"`
    WorkingDirectory     *string               `json:"workingDir" yaml:"workingDir"`
    Files                []*File               `json:"files" yaml:"files"`
}

@@ -148,6 +149,15 @@ func Parse(filename string, runnerLogger *logger.RunnerLogger) (*YAMLConfig, err
        return nil, ierrors.Wrap(ierrors.InvalidConfigPathOrFormat, err)
    }
    path := filepath.Dir(abs)

    // A relative workingDir is resolved against the directory of the config file;
    // an absolute workingDir replaces it as the base path.
    if workingDir := conf.WorkingDirectory; workingDir != nil && len(*workingDir) > 0 {
        if !filepath.IsAbs(*workingDir) {
            path = filepath.Join(path, *workingDir)
        } else {
            path = *workingDir
        }
    }

    if err = conf.ValidateAndReset(path, runnerLogger); err != nil {
        return nil, ierrors.Wrap(ierrors.ConfigError, err)
    }
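
Read on its own, the branch added above implements a simple rule: start from the config file's directory, then let a non-empty `workingDir` either extend it (relative) or replace it (absolute). The sketch below restates that rule outside the importer; `resolveBase` and the paths are illustrative names, not part of the importer's API, and the remaining per-file path handling is assumed to stay in `ValidateAndReset`:

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolveBase mirrors the added branch: a relative workingDir is joined onto
// the config file's directory, an absolute one replaces it, and an unset or
// empty one leaves the directory unchanged.
func resolveBase(configDir string, workingDir *string) string {
	path := configDir
	if workingDir != nil && len(*workingDir) > 0 {
		if !filepath.IsAbs(*workingDir) {
			path = filepath.Join(path, *workingDir)
		} else {
			path = *workingDir
		}
	}
	return path
}

func main() {
	rel, abs := "./data", "/mnt/dataset1"
	fmt.Println(resolveBase("/opt/importer", &rel)) // /opt/importer/data
	fmt.Println(resolveBase("/opt/importer", &abs)) // /mnt/dataset1
	fmt.Println(resolveBase("/opt/importer", nil))  // /opt/importer
}
```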