add exchange ent (#932)
* add exchange ent

* add exchange ent docs

* Update ex-ug-compile.md

* add parameters and fix details

* Update ex-ug-what-is-exchange.md
izhuxiaoqing authored Nov 19, 2021
1 parent c2f32d5 commit c600409
Showing 6 changed files with 215 additions and 10 deletions.
4 changes: 2 additions & 2 deletions docs-2.0/nebula-exchange/about-exchange/ex-ug-limitations.md
@@ -2,9 +2,9 @@

This topic describes some of the limitations of using Exchange 2.x.

## Nebula Graph releases
## Version compatibility

The correspondence between the Nebula Exchange release (the JAR version) and the Nebula Graph release is as follows.
The correspondence between the Nebula Exchange release (the JAR version) and the Nebula Graph core release is as follows.

|Exchange client|Nebula Graph|
|:---|:---|
18 changes: 16 additions & 2 deletions docs-2.0/nebula-exchange/about-exchange/ex-ug-what-is-exchange.md
@@ -6,6 +6,10 @@ Exchange consists of Reader, Processor, and Writer. After Reader reads data from

![Nebula Graph® Exchange consists of Reader, Processor, and Writer that can migrate data from a variety of formats and sources to Nebula Graph](../figs/ex-ug-003.png)

## Editions

Exchange has two editions, the Community Edition and the Enterprise Edition. The Community Edition is open source and developed on [GitHub](https://github.com/vesoft-inc/nebula-exchange). The Enterprise Edition supports all functions of the Community Edition and adds additional features. For details, see [Comparisons](https://nebula-graph.com.cn/pricing/).

## Scenarios

Exchange applies to the following scenarios:
@@ -16,6 +20,12 @@ Exchange applies to the following scenarios:

- A large volume of data needs to be generated into SST files that Nebula Graph can recognize and then imported into the Nebula Graph database.

- The data saved in Nebula Graph needs to be exported.

!!! enterpriseonly

    Exporting the data saved in Nebula Graph is supported by Exchange Enterprise Edition only.

## Advantages

Exchange has the following advantages:
@@ -24,6 +34,8 @@ Exchange has the following advantages:

- SST import: It supports converting data from different sources into SST files for data import.

- SSL encryption: It supports SSL-encrypted connections between Exchange and Nebula Graph to ensure data security.

- Resumable data import: It supports resumable data import to save time and improve data import efficiency.

!!! note
@@ -40,7 +52,7 @@ Exchange has the following advantages:

## Data source

Exchange {{exchange.release}} supports converting data from the following formats or sources into vertexes and edges that Nebula Graph can recognize, and then importing them into Nebula Graph in the form of **nGQL** statements:
Exchange {{exchange.release}} supports converting data from the following formats or sources into vertexes and edges that Nebula Graph can recognize, and then importing them into Nebula Graph in the form of nGQL statements:

- Data stored in HDFS or locally:
- [Apache Parquet](../use-exchange/ex-ug-import-from-parquet.md)
@@ -65,4 +77,6 @@ Exchange {{exchange.release}} supports converting data from the following format

- Publish/Subscribe messaging platform: [Apache Pulsar 2.4.5](../use-exchange/ex-ug-import-from-pulsar.md)

In addition to importing data as nGQL statements, Exchange supports generating **SST files** for data sources and then [importing SST](../use-exchange/ex-ug-import-from-sst.md) files via Console.
In addition to importing data as nGQL statements, Exchange supports generating SST files for data sources and then [importing SST](../use-exchange/ex-ug-import-from-sst.md) files via Console.

In addition, Exchange Enterprise Edition supports [exporting data to a CSV file](../use-exchange/ex-ug-export-from-nebula.md) with Nebula Graph as the data source.
22 changes: 18 additions & 4 deletions docs-2.0/nebula-exchange/ex-ug-compile.md
@@ -1,8 +1,22 @@
# Compile Exchange
# Get Exchange

This topic describes how to compile Nebula Exchange. Users can also [download](https://repo1.maven.org/maven2/com/vesoft/nebula-exchange/) the compiled `.jar` file directly.
This topic describes how to get the JAR file of Nebula Exchange.

## Prerequisites
## Download the JAR file directly

The JAR file of Exchange Community Edition can be [downloaded](https://repo1.maven.org/maven2/com/vesoft/nebula-exchange/) directly.
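
The direct download can also be scripted. The following is a minimal sketch, assuming that the repository linked above follows the standard Maven layout (`<version>/nebula-exchange-<version>.jar`) and that `curl` is installed. The function names `exchange_jar_url` and `download_exchange_jar` are hypothetical helpers, not part of Exchange.

```bash
# Build the download URL for a given Exchange Community Edition release.
# Assumption: standard Maven repository layout under the linked directory.
exchange_jar_url() {
  local version="$1"
  printf '%s/%s/nebula-exchange-%s.jar\n' \
    "https://repo1.maven.org/maven2/com/vesoft/nebula-exchange" "$version" "$version"
}

# Download the JAR file into the current directory (requires curl and network access).
download_exchange_jar() {
  curl -fLO "$(exchange_jar_url "$1")"
}
```

Call `download_exchange_jar <version>` with a release that matches your Nebula Graph version, and confirm in the repository listing that the release exists before relying on the generated URL.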

To download Exchange Enterprise Edition, [get the Nebula Graph Enterprise Edition Package](https://nebula-graph.com.cn/pricing/) first.

## Get the JAR file by compiling the source code

You can get the JAR file of Exchange Community Edition by compiling the source code. The following sections describe how to compile it.

!!! enterpriseonly

    Exchange Enterprise Edition is available only in the Nebula Graph Enterprise Edition Package.

### Prerequisites

- Install [Maven](https://maven.apache.org/download.cgi).

@@ -58,7 +72,7 @@ In the `target` directory, users can find the `exchange-2.x.y.jar` file.

When migrating data, you can refer to configuration file [`target/classes/application.conf`](https://github.com/vesoft-inc/nebula-exchange/blob/master/nebula-exchange/src/main/resources/application.conf).

## Failed to download the dependency package
### Failed to download the dependency package

If downloading dependencies fails when compiling:

30 changes: 29 additions & 1 deletion docs-2.0/nebula-exchange/parameter-reference/ex-ug-parameter.md
@@ -49,6 +49,14 @@ Users only need to configure parameters for connecting to Hive if Spark and Hive
|`nebula.user`|string|-|Yes|The username with write permissions for Nebula Graph.|
|`nebula.pswd`|string|-|Yes|The account password.|
|`nebula.space`|string|-|Yes|The name of the graph space where data needs to be imported.|
|`nebula.ssl.enable.graph`|bool|`false`|Yes|Enables the [SSL encryption](https://en.wikipedia.org/wiki/Transport_Layer_Security) between Exchange and Graph services. If the value is `true`, the SSL encryption is enabled and the following SSL parameters take effect. If Exchange is run on a multi-machine cluster, you need to store the corresponding files in the same path on each machine when setting the following SSL-related paths.|
|`nebula.ssl.sign`|string|`ca`|Yes|Specifies the SSL signature type. Valid values are `ca` (CA-signed) and `self` (self-signed).|
|`nebula.ssl.ca.param.caCrtFilePath`|string|`"/path/caCrtFilePath"`|Yes|Specifies the storage path of the CA certificate. It takes effect when the value of `nebula.ssl.sign` is `ca`.|
|`nebula.ssl.ca.param.crtFilePath`|string|`"/path/crtFilePath"`|Yes|Specifies the storage path of the CRT certificate. It takes effect when the value of `nebula.ssl.sign` is `ca`.|
|`nebula.ssl.ca.param.keyFilePath`|string|`"/path/keyFilePath"`|Yes|Specifies the storage path of the key file. It takes effect when the value of `nebula.ssl.sign` is `ca`.|
|`nebula.ssl.self.param.crtFilePath`|string|`"/path/crtFilePath"`|Yes|Specifies the storage path of the CRT certificate. It takes effect when the value of `nebula.ssl.sign` is `self`.|
|`nebula.ssl.self.param.keyFilePath`|string|`"/path/keyFilePath"`|Yes|Specifies the storage path of the key file. It takes effect when the value of `nebula.ssl.sign` is `self`.|
|`nebula.ssl.self.param.password`|string|`"nebula"`|Yes|Specifies the password. It takes effect when the value of `nebula.ssl.sign` is `self`.|
|`nebula.path.local`|string|`"/tmp"`|No|The local SST file path which needs to be set when users import SST files.|
|`nebula.path.remote`|string|`"/sst"`|No|The remote SST file path which needs to be set when users import SST files.|
|`nebula.path.hdfs.namenode`|string|`"hdfs://name_node:9000"`|No|The NameNode path which needs to be set when users import SST files.|
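
As a rough sketch of how the SSL parameters above combine for the CA-signed case, the following shell function prints a configuration fragment. It assumes that the dotted parameter names in the table can be written directly as HOCON paths in `application.conf`. The function name `print_ssl_ca_fragment` is hypothetical, and the file paths are placeholders, so check the fragment against the configuration template shipped with your Exchange release.

```bash
# Print an SSL fragment for the CA-signed case.
# Assumption: the dotted names from the parameter table are valid HOCON paths.
# All three file paths are placeholders; replace them with real locations that
# exist at the same path on every machine that runs Exchange.
print_ssl_ca_fragment() {
  cat <<'EOF'
nebula.ssl.enable.graph = true
nebula.ssl.sign = ca
nebula.ssl.ca.param.caCrtFilePath = "/path/caCrtFilePath"
nebula.ssl.ca.param.crtFilePath = "/path/crtFilePath"
nebula.ssl.ca.param.keyFilePath = "/path/keyFilePath"
EOF
}
```

Running `print_ssl_ca_fragment` writes the five lines to standard output so that they can be reviewed before being copied into the configuration file. For the self-signed case, the `nebula.ssl.self.param.*` parameters would take the place of the `nebula.ssl.ca.param.*` lines.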
@@ -150,7 +158,7 @@ For different data sources, the vertex configurations are different. There are m
|`tags.host`|string|`127.0.0.1`|Yes|The HBase server address.|
|`tags.port`|string|`2181`|Yes|The HBase server port.|
|`tags.table`|string|-|Yes|The name of a table used as a data source.|
|`tags.columnFamily`|string|-|Yes|The column family which a table belongs to.|
|`tags.columnFamily`|string|-|Yes|The column family to which a table belongs.|

### Specific parameters of Pulsar data sources

@@ -175,6 +183,18 @@ For different data sources, the vertex configurations are different. There are m
|:---|:---|:---|:---|:---|
|`tags.path`|string|-|Yes|The path of the source file specified to generate SST files.|

### Specific parameters of Nebula Graph

!!! enterpriseonly

    Specific parameters of Nebula Graph are used for exporting Nebula Graph data, which is supported by Exchange Enterprise Edition only.

|Parameter|Data type|Default value|Required|Description|
|:---|:---|:---|:---|:---|
|`tags.path`|string|`"hdfs://namenode:9000/path/vertex"`|Yes|Specifies the storage path of the CSV file. You need to set a new path, and Exchange will automatically create it. To store the data on an HDFS server, use the same path format as the default value, such as `"hdfs://192.168.8.177:9000/vertex/player"`. To store the data locally, use the format `"file:///path/vertex"`, such as `"file:///home/nebula/vertex/player"`. If there are multiple Tags, a different directory must be set for each Tag.|
|`tags.noField`|bool|`false`|Yes|If the value is `true`, only VIDs will be exported, not the property data. If the value is `false`, VIDs and the property data will be exported.|
|`tags.return.fields`|list|`[]`|Yes|Specifies the properties to be exported. For example, to export `name` and `age`, set the parameter value to `["name","age"]`. This parameter only takes effect when the value of `tags.noField` is `false`.|
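
To illustrate how the three parameters interact, the following sketch prints a tag entry that exports the VIDs together with only the `name` and `age` properties of the `player` Tag to a local directory. The entry layout mirrors the export configuration template. The function name `print_player_export_entry` is hypothetical, and the `partition` value is an illustrative assumption.

```bash
# Print a tag entry that exports VIDs plus the `name` and `age` properties.
# The local path is the example path from the parameter table; it must not exist yet.
# Assumption: `partition: 10` is only an illustrative value.
print_player_export_entry() {
  cat <<'EOF'
{
  name: player
  type: {
    source: Nebula
    sink: CSV
  }
  path: "file:///home/nebula/vertex/player"
  noField: false
  return.fields: ["name", "age"]
  partition: 10
}
EOF
}
```

Setting `noField: true` in the same entry would export VIDs only, in which case `return.fields` is ignored.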

## Edge configurations

For different data sources, configurations of edges are also different. There are general parameters and some specific parameters. General parameters and specific parameters of different data sources need to be configured when users configure edges.
@@ -195,3 +215,11 @@ For the specific parameters of different data sources for edge configurations, p
|`edges.ranking`|int|-|No|The column of rank values. If not specified, all rank values are `0` by default.|
|`edges.batch`|int|`256`|Yes|The maximum number of edges written into Nebula Graph in a single batch.|
|`edges.partition`|int|`32`|Yes|The number of Spark partitions.|

### Specific parameters of Nebula Graph

|Parameter|Data type|Default value|Required|Description|
|:---|:---|:---|:---|:---|
|`edges.path`|string|`"hdfs://namenode:9000/path/edge"`|Yes|Specifies the storage path of the CSV file. You need to set a new path, and Exchange will automatically create it. To store the data on an HDFS server, use the same path format as the default value, such as `"hdfs://192.168.8.177:9000/edge/follow"`. To store the data locally, use the format `"file:///path/edge"`, such as `"file:///home/nebula/edge/follow"`. If there are multiple Edges, a different directory must be set for each Edge.|
|`edges.noField`|bool|`false`|Yes|If the value is `true`, only source vertex IDs, destination vertex IDs, and ranks will be exported, not the property data. If the value is `false`, source vertex IDs, destination vertex IDs, ranks, and the property data will be exported.|
|`edges.return.fields`|list|`[]`|Yes|Specifies the properties to be exported. For example, to export `start_year` and `end_year`, you need to set the parameter value to `["start_year","end_year"]`. This parameter only takes effect when the value of `edges.noField` is `false`.|
148 changes: 148 additions & 0 deletions docs-2.0/nebula-exchange/use-exchange/ex-ug-export-from-nebula.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,148 @@
# Export data from Nebula Graph

This topic uses an example to illustrate how to use Exchange to export data from Nebula Graph to a CSV file.

!!! enterpriseonly

    Only Exchange Enterprise Edition supports exporting data from Nebula Graph to a CSV file.

!!! note

    SSL encryption is not supported when exporting data from Nebula Graph.

## Preparation

This example runs on a Linux virtual machine. Prepare the following hardware and software before exporting data.

### Hardware

| Type | Information |
| - | - |
| CPU | 4 Intel(R) Xeon(R) Platinum 8260 CPU @ 2.30GHz |
| Memory | 16 GB |
| Hard disk | 50 GB |

### System

CentOS 7.9.2009

### Software

| Name | Version |
| - | - |
| JDK | 1.8.0 |
| Hadoop | 2.10.1 |
| Scala | 2.12.11 |
| Spark | 2.4.7 |
| Nebula Graph | {{nebula.release}} |

### Dataset

In this example, Nebula Graph serves as the data source and stores the [basketballplayer dataset](https://docs.nebula-graph.io/2.0/basketballplayer-2.X.ngql), whose schema elements are as follows.

| Element | Name | Property |
| :--- | :--- | :--- |
| Tag | `player` | `name string, age int` |
| Tag | `team` | `name string` |
| Edge type | `follow` | `degree int` |
| Edge type | `serve` | `start_year int, end_year int` |

## Steps

1. Get the JAR file of Exchange Enterprise Edition from the [Nebula Graph Enterprise Edition Package](https://nebula-graph.com.cn/pricing/).

2. Modify the configuration file.

Exchange Enterprise Edition provides the configuration template `export_application.conf` for exporting Nebula Graph data. For details, see [Exchange parameters](../parameter-reference/ex-ug-parameter.md). The core content of the configuration file used in this example is as follows:

```conf
...
  # Processing tags
  # There are tag config examples for different dataSources.
  tags: [
    # Export NebulaGraph tag data to CSV. Only export to CSV is supported for now.
    {
      name: player
      type: {
        source: Nebula
        sink: CSV
      }
      # The path to save the NebulaGraph data. Make sure the path doesn't exist.
      path: "hdfs://192.168.8.177:9000/vertex/player"
      # Whether to skip all properties when exporting NebulaGraph tag data.
      # If noField is set to true, only the vertex ID is exported.
      noField: false
      # Define the properties to export from NebulaGraph tag data.
      # If return.fields is an empty list, all properties are exported.
      return.fields: []
      # nebula space partition number
      partition: 10
    }
    ...
  ]
  # Processing edges
  # There are edge config examples for different dataSources.
  edges: [
    # Export NebulaGraph edge data to CSV. Only export to CSV is supported for now.
    {
      name: follow
      type: {
        source: Nebula
        sink: CSV
      }
      # The path to save the NebulaGraph data. Make sure the path doesn't exist.
      path: "hdfs://192.168.8.177:9000/edge/follow"
      # Whether to skip all properties when exporting NebulaGraph edge data.
      # If noField is set to true, only src, dst, and rank are exported.
      noField: false
      # Define the properties to export from NebulaGraph edge data.
      # If return.fields is an empty list, all properties are exported.
      return.fields: []
      # nebula space partition number
      partition: 10
    }
    ...
  ]
}
```

3. Export data from Nebula Graph with the following command.

```bash
<spark_install_path>/bin/spark-submit --master "local" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange-x.y.z.jar_path> -c <export_application.conf_path>
```

The command used in this example is as follows.

```bash
$ ./spark-submit --master "local" --class com.vesoft.nebula.exchange.Exchange \
~/exchange-ent/nebula-exchange-ent-{{exchange.release}}.jar -c ~/exchange-ent/export_application.conf
```

4. Check the exported data.

1. Check whether the CSV file is successfully generated under the target path.

```bash
$ hadoop fs -ls /vertex/player
Found 11 items
-rw-r--r-- 3 nebula supergroup 0 2021-11-05 07:36 /vertex/player/_SUCCESS
-rw-r--r-- 3 nebula supergroup 160 2021-11-05 07:36 /vertex/player/ part-00000-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
-rw-r--r-- 3 nebula supergroup 163 2021-11-05 07:36 /vertex/player/ part-00001-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
-rw-r--r-- 3 nebula supergroup 172 2021-11-05 07:36 /vertex/player/ part-00002-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
-rw-r--r-- 3 nebula supergroup 172 2021-11-05 07:36 /vertex/player/ part-00003-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
-rw-r--r-- 3 nebula supergroup 144 2021-11-05 07:36 /vertex/player/ part-00004-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
-rw-r--r-- 3 nebula supergroup 173 2021-11-05 07:36 /vertex/player/ part-00005-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
-rw-r--r-- 3 nebula supergroup 160 2021-11-05 07:36 /vertex/player/ part-00006-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
-rw-r--r-- 3 nebula supergroup 148 2021-11-05 07:36 /vertex/player/ part-00007-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
-rw-r--r-- 3 nebula supergroup 125 2021-11-05 07:36 /vertex/player/ part-00008-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
-rw-r--r-- 3 nebula supergroup 119 2021-11-05 07:36 /vertex/player/ part-00009-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
```

2. Check the contents of the CSV file to ensure that the data export is successful.
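
One way to perform the content check in the last step is to print the first few lines of the exported part files. The following is a minimal sketch, assuming that the `hadoop` command is on the `PATH` and that the data was exported to HDFS. The function name `preview_export` is a hypothetical helper.

```bash
# Print the first lines of the CSV part files under an HDFS export directory.
# $1: export directory, for example /vertex/player
# $2: number of lines to print (defaults to 10)
preview_export() {
  # The glob is quoted so that HDFS, not the local shell, expands it.
  hadoop fs -cat "$1/part-*.csv" | head -n "${2:-10}"
}
```

For example, `preview_export /vertex/player 5` prints five lines from the exported `player` data. If the data was exported to a local `file:///` path, running `head` on the part files directly serves the same purpose.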
3 changes: 2 additions & 1 deletion mkdocs.yml
@@ -433,7 +433,7 @@ nav:
- Introduction:
- What is Nebula Exchange: nebula-exchange/about-exchange/ex-ug-what-is-exchange.md
- Limitations: nebula-exchange/about-exchange/ex-ug-limitations.md
- Compile Exchange: nebula-exchange/ex-ug-compile.md
- Get Exchange: nebula-exchange/ex-ug-compile.md
- Exchange configurations:
- Options for import: nebula-exchange/parameter-reference/ex-ug-para-import-command.md
- Parameters in the configuration file: nebula-exchange/parameter-reference/ex-ug-parameter.md
@@ -451,6 +451,7 @@ nav:
- Import data from Pulsar: nebula-exchange/use-exchange/ex-ug-import-from-pulsar.md
- Import data from Kafka: nebula-exchange/use-exchange/ex-ug-import-from-kafka.md
- Import data from SST files: nebula-exchange/use-exchange/ex-ug-import-from-sst.md
- Export data from Nebula Graph: nebula-exchange/use-exchange/ex-ug-export-from-nebula.md
- Exchange FAQ: nebula-exchange/ex-ug-FAQ.md

- Nebula Operator: