diff --git a/docs-2.0/nebula-exchange/about-exchange/ex-ug-limitations.md b/docs-2.0/nebula-exchange/about-exchange/ex-ug-limitations.md index f7674275c7c..a2762c6507a 100644 --- a/docs-2.0/nebula-exchange/about-exchange/ex-ug-limitations.md +++ b/docs-2.0/nebula-exchange/about-exchange/ex-ug-limitations.md @@ -2,9 +2,9 @@ This topic describes some of the limitations of using Exchange 2.x. -## Nebula Graph releases +## Version compatibility -The correspondence between the Nebula Exchange release (the JAR version) and the Nebula Graph release is as follows. +The correspondence between the Nebula Exchange release (the JAR version) and the Nebula Graph core release is as follows. |Exchange client|Nebula Graph| |:---|:---| diff --git a/docs-2.0/nebula-exchange/about-exchange/ex-ug-what-is-exchange.md b/docs-2.0/nebula-exchange/about-exchange/ex-ug-what-is-exchange.md index 3d3b49b02f3..dd7db4b2bb8 100644 --- a/docs-2.0/nebula-exchange/about-exchange/ex-ug-what-is-exchange.md +++ b/docs-2.0/nebula-exchange/about-exchange/ex-ug-what-is-exchange.md @@ -6,6 +6,10 @@ Exchange consists of Reader, Processor, and Writer. After Reader reads data from ![Nebula Graph® Exchange consists of Reader, Processor, and Writer that can migrate data from a variety of formats and sources to Nebula Graph](../figs/ex-ug-003.png) +## Editions + +Exchange has two editions, the Community Edition and the Enterprise Edition. The Community Edition is open source and developed on [GitHub](https://github.com/vesoft-inc/nebula-exchange). The Enterprise Edition provides all the functions of the Community Edition and adds more features. For details, see [Comparisons](https://nebula-graph.com.cn/pricing/). + ## Scenarios Exchange applies to the following scenarios: @@ -16,6 +20,12 @@ Exchange applies to the following scenarios: - A large volume of data needs to be generated into SST files that Nebula Graph can recognize and then imported into the Nebula Graph database. 
+- The data saved in Nebula Graph needs to be exported. + + !!! enterpriseonly + + Exporting the data saved in Nebula Graph is supported by Exchange Enterprise Edition only. + ## Advantages Exchange has the following advantages: @@ -24,6 +34,8 @@ Exchange has the following advantages: - SST import: It supports converting data from different sources into SST files for data import. +- SSL encryption: It supports establishing the SSL encryption between Exchange and Nebula Graph to ensure data security. + - Resumable data import: It supports resumable data import to save time and improve data import efficiency. !!! note @@ -40,7 +52,7 @@ Exchange has the following advantages: ## Data source -Exchange {{exchange.release}} supports converting data from the following formats or sources into vertexes and edges that Nebula Graph can recognize, and then importing them into Nebula Graph in the form of **nGQL** statements: +Exchange {{exchange.release}} supports converting data from the following formats or sources into vertexes and edges that Nebula Graph can recognize, and then importing them into Nebula Graph in the form of nGQL statements: - Data stored in HDFS or locally: - [Apache Parquet](../use-exchange/ex-ug-import-from-parquet.md) @@ -65,4 +77,6 @@ Exchange {{exchange.release}} supports converting data from the following format - Publish/Subscribe messaging platform: [Apache Pulsar 2.4.5](../use-exchange/ex-ug-import-from-pulsar.md) -In addition to importing data as nGQL statements, Exchange supports generating **SST files** for data sources and then [importing SST](../use-exchange/ex-ug-import-from-sst.md) files via Console. +In addition to importing data as nGQL statements, Exchange supports generating SST files for data sources and then [importing SST](../use-exchange/ex-ug-import-from-sst.md) files via Console. 
+ +Exchange Enterprise Edition also supports [exporting data to a CSV file](../use-exchange/ex-ug-export-from-nebula.md) with Nebula Graph as the data source. diff --git a/docs-2.0/nebula-exchange/ex-ug-compile.md b/docs-2.0/nebula-exchange/ex-ug-compile.md index 7606fa67df8..a8b4233561b 100644 --- a/docs-2.0/nebula-exchange/ex-ug-compile.md +++ b/docs-2.0/nebula-exchange/ex-ug-compile.md @@ -1,8 +1,22 @@ -# Compile Exchange +# Get Exchange -This topic describes how to compile Nebula Exchange. Users can also [download](https://repo1.maven.org/maven2/com/vesoft/nebula-exchange/) the compiled `.jar` file directly. +This topic describes how to get the JAR file of Nebula Exchange. -## Prerequisites +## Download the JAR file directly + +The JAR file of Exchange Community Edition can be [downloaded](https://repo1.maven.org/maven2/com/vesoft/nebula-exchange/) directly. + +To download Exchange Enterprise Edition, [get the Nebula Graph Enterprise Edition Package](https://nebula-graph.com.cn/pricing/) first. + +## Get the JAR file by compiling the source code + +You can get the JAR file of Exchange Community Edition by compiling the source code. The following sections describe how to compile the source code of Exchange. + +!!! enterpriseonly + + Exchange Enterprise Edition is available only in the Nebula Graph Enterprise Edition Package. + +### Prerequisites - Install [Maven](https://maven.apache.org/download.cgi). @@ -58,7 +72,7 @@ In the `target` directory, users can find the `exchange-2.x.y.jar` file. When migrating data, you can refer to configuration file [`target/classes/application.conf`](https://github.com/vesoft-inc/nebula-exchange/blob/master/nebula-exchange/src/main/resources/application.conf). 
-## Failed to download the dependency package +### Failed to download the dependency package If downloading dependencies fails when compiling: diff --git a/docs-2.0/nebula-exchange/parameter-reference/ex-ug-parameter.md b/docs-2.0/nebula-exchange/parameter-reference/ex-ug-parameter.md index c07fab42565..20505348af2 100644 --- a/docs-2.0/nebula-exchange/parameter-reference/ex-ug-parameter.md +++ b/docs-2.0/nebula-exchange/parameter-reference/ex-ug-parameter.md @@ -49,6 +49,14 @@ Users only need to configure parameters for connecting to Hive if Spark and Hive |`nebula.user`|string|-|Yes|The username with write permissions for Nebula Graph.| |`nebula.pswd`|string|-|Yes|The account password.| |`nebula.space`|string|-|Yes|The name of the graph space where data needs to be imported.| +|`nebula.ssl.enable.graph`|bool|`false`|Yes|Enables the [SSL encryption](https://en.wikipedia.org/wiki/Transport_Layer_Security) between Exchange and Graph services. If the value is `true`, the SSL encryption is enabled and the following SSL parameters take effect. If Exchange is run on a multi-machine cluster, you need to store the corresponding files in the same path on each machine when setting the following SSL-related paths.| +|`nebula.ssl.sign`|string|`ca`|Yes|Specifies how the certificate is signed. Valid values are `ca` (CA-signed) and `self` (self-signed).| +|`nebula.ssl.ca.param.caCrtFilePath`|string|`"/path/caCrtFilePath"`|Yes|Specifies the storage path of the CA certificate. It takes effect when the value of `nebula.ssl.sign` is `ca`.| +|`nebula.ssl.ca.param.crtFilePath`|string|`"/path/crtFilePath"`|Yes|Specifies the storage path of the CRT certificate. It takes effect when the value of `nebula.ssl.sign` is `ca`.| +|`nebula.ssl.ca.param.keyFilePath`|string|`"/path/keyFilePath"`|Yes|Specifies the storage path of the key file. It takes effect when the value of `nebula.ssl.sign` is `ca`.| +|`nebula.ssl.self.param.crtFilePath`|string|`"/path/crtFilePath"`|Yes|Specifies the storage path of the CRT certificate. 
It takes effect when the value of `nebula.ssl.sign` is `self`.| +|`nebula.ssl.self.param.keyFilePath`|string|`"/path/keyFilePath"`|Yes|Specifies the storage path of the key file. It takes effect when the value of `nebula.ssl.sign` is `self`.| +|`nebula.ssl.self.param.password`|string|`"nebula"`|Yes|Specifies the password. It takes effect when the value of `nebula.ssl.sign` is `self`.| |`nebula.path.local`|string|`"/tmp"`|No|The local SST file path which needs to be set when users import SST files.| |`nebula.path.remote`|string|`"/sst"`|No|The remote SST file path which needs to be set when users import SST files.| |`nebula.path.hdfs.namenode`|string|`"hdfs://name_node:9000"`|No|The NameNode path which needs to be set when users import SST files.| @@ -150,7 +158,7 @@ For different data sources, the vertex configurations are different. There are m |`tags.host`|string|`127.0.0.1`|Yes|The Hbase server address.| |`tags.port`|string|`2181`|Yes|The Hbase server port. |`tags.table`|string|-|Yes|The name of a table used as a data source.| -|`tags.columnFamily`|string|-|Yes|The column family which a table belongs to.| +|`tags.columnFamily`|string|-|Yes|The column family to which a table belongs.| ### Specific parameters of Pulsar data sources @@ -175,6 +183,18 @@ For different data sources, the vertex configurations are different. There are m |:---|:---|:---|:---|:---| |`tags.path`|string|-|Yes|The path of the source file specified to generate SST files.| +### Specific parameters of Nebula Graph + +!!! enterpriseonly + + Specific parameters of Nebula Graph are used for exporting Nebula Graph data, which is supported by Exchange Enterprise Edition only. + +|Parameter|Data type|Default value|Required|Description| +|:---|:---|:---|:---|:---| +|`tags.path`|string|`"hdfs://namenode:9000/path/vertex"`|Yes|Specifies the storage path of the CSV file. Set a path that does not exist yet; Exchange creates it automatically. 
If you store the data on the HDFS server, the path format is the same as the default value, such as `"hdfs://192.168.8.177:9000/vertex/player"`. If you store the data locally, the path format is `"file:///path/vertex"`, such as `"file:///home/nebula/vertex/player"`. If there are multiple Tags, different directories must be set for each Tag.| +|`tags.noField`|bool|`false`|Yes|If the value is `true`, only VIDs will be exported, not the property data. If the value is `false`, VIDs and the property data will be exported.| +|`tags.return.fields`|list|`[]`|Yes|Specifies the properties to be exported. For example, to export the `name` and `age`, you need to set the parameter value to `["name","age"]`. This parameter only takes effect when the value of `tags.noField` is `false`.| + ## Edge configurations For different data sources, configurations of edges are also different. There are general parameters and some specific parameters. General parameters and specific parameters of different data sources need to be configured when users configure edges. @@ -195,3 +215,11 @@ For the specific parameters of different data sources for edge configurations, p |`edges.ranking`|int|-|No|The column of rank values. If not specified, all rank values are `0` by default.| |`edges.batch`|int|`256`|Yes|The maximum number of edges written into Nebula Graph in a single batch.| |`edges.partition`|int|`32`|Yes|The number of Spark partitions.| + +### Specific parameters of Nebula Graph + +|Parameter|Type|Default value|Required|Description| +|:---|:---|:---|:---|:---| +|`edges.path`|string|`"hdfs://namenode:9000/path/edge"`|Yes|Specifies the storage path of the CSV file. Set a path that does not exist yet; Exchange creates it automatically. If you store the data on the HDFS server, the path format is the same as the default value, such as `"hdfs://192.168.8.177:9000/edge/follow"`. 
If you store the data locally, the path format is `"file:///path/edge"`, such as `"file:///home/nebula/edge/follow"`. If there are multiple Edges, different directories must be set for each Edge.| +|`edges.noField`|bool|`false`|Yes|If the value is `true`, source vertex IDs, destination vertex IDs, and ranks will be exported, not the property data. If the value is `false`, source vertex IDs, destination vertex IDs, ranks, and the property data will be exported.| +|`edges.return.fields`|list|`[]`|Yes|Specifies the properties to be exported. For example, to export `start_year` and `end_year`, you need to set the parameter value to `["start_year","end_year"]`. This parameter only takes effect when the value of `edges.noField` is `false`.| diff --git a/docs-2.0/nebula-exchange/use-exchange/ex-ug-export-from-nebula.md b/docs-2.0/nebula-exchange/use-exchange/ex-ug-export-from-nebula.md new file mode 100644 index 00000000000..f1ffcc2b114 --- /dev/null +++ b/docs-2.0/nebula-exchange/use-exchange/ex-ug-export-from-nebula.md @@ -0,0 +1,148 @@ +# Export data from Nebula Graph + +This topic uses an example to illustrate how to use Exchange to export data from Nebula Graph to a CSV file. + +!!! enterpriseonly + + Only Exchange Enterprise Edition supports exporting data from Nebula Graph to a CSV file. + +!!! note + + SSL encryption is not supported when exporting data from Nebula Graph. + +## Preparation + +This example is completed on a virtual machine running Linux. The hardware and software you need to prepare before exporting data are as follows. 
+ +### Hardware + +| Type | Information | +| - | - | +| CPU | 4 Intel(R) Xeon(R) Platinum 8260 CPU @ 2.30GHz | +| Memory | 16G | +| Hard disk | 50G | + +### System + +CentOS 7.9.2009 + +### Software + +| Name | Version | +| - | - | +| JDK | 1.8.0 | +| Hadoop | 2.10.1 | +| Scala | 2.12.11 | +| Spark | 2.4.7 | +| Nebula Graph | {{nebula.release}} | + +### Dataset + +As the data source, Nebula Graph stores the [basketballplayer dataset](https://docs.nebula-graph.io/2.0/basketballplayer-2.X.ngql) in this example, the Schema elements of which are shown as follows. + +| Element | Name | Property | +| :--- | :--- | :--- | +| Tag | `player` | `name string, age int` | +| Tag | `team` | `name string` | +| Edge type | `follow` | `degree int` | +| Edge type | `serve` | `start_year int, end_year int` | + +## Steps + +1. Get the JAR file of Exchange Enterprise Edition from the [Nebula Graph Enterprise Edition Package](https://nebula-graph.com.cn/pricing/). + +2. Modify the configuration file. + + Exchange Enterprise Edition provides the configuration template `export_application.conf` for exporting Nebula Graph data. For details, see [Exchange parameters](../parameter-reference/ex-ug-parameter.md). The core content of the configuration file used in this example is as follows: + + ```conf + ... + + # Processing tags + # There are tag config examples for different dataSources. + tags: [ + # export NebulaGraph tag data to csv, only support export to CSV for now. + { + name: player + type: { + source: Nebula + sink: CSV + } + # the path to save the NebulaGraph data, make sure the path doesn't exist. 
+ path:"hdfs://192.168.8.177:9000/vertex/player" + # if no need to export any properties when export NebulaGraph tag data + # if noField is configured true, just export vertexId + noField:false + # define properties to export from NebulaGraph tag data + # if return.fields is configured as empty list, then export all properties + return.fields:[] + # nebula space partition number + partition:10 + } + + ... + + ] + + # Processing edges + # There are edge config examples for different dataSources. + edges: [ + # export NebulaGraph edge data to csv, only support export to CSV for now. + { + name: follow + type: { + source: Nebula + sink: CSV + } + # the path to save the NebulaGraph data, make sure the path doesn't exist. + path:"hdfs://192.168.8.177:9000/edge/follow" + # if no need to export any properties when export NebulaGraph edge data + # if noField is configured true, just export src,dst,rank + noField:false + # define properties to export from NebulaGraph edge data + # if return.fields is configured as empty list, then export all properties + return.fields:[] + # nebula space partition number + partition:10 + } + + ... + + ] + } + ``` + +3. Export data from Nebula Graph with the following command. + + ```bash + <spark_install_path>/bin/spark-submit --master "local" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange-x.y.z.jar_path> -c <export_application.conf_path> + ``` + + The command used in this example is as follows. + + ```bash + $ ./spark-submit --master "local" --class com.vesoft.nebula.exchange.Exchange \ + ~/exchange-ent/nebula-exchange-ent-{{exchange.release}}.jar -c ~/exchange-ent/export_application.conf + ``` + +4. Check the exported data. + + 1. Check whether the CSV file is successfully generated under the target path. 
+ + ```bash + $ hadoop fs -ls /vertex/player + Found 11 items + -rw-r--r-- 3 nebula supergroup 0 2021-11-05 07:36 /vertex/player/_SUCCESS + -rw-r--r-- 3 nebula supergroup 160 2021-11-05 07:36 /vertex/player/part-00000-17293020-ba2e-4243-b834-34495c0536b3-c000.csv + -rw-r--r-- 3 nebula supergroup 163 2021-11-05 07:36 /vertex/player/part-00001-17293020-ba2e-4243-b834-34495c0536b3-c000.csv + -rw-r--r-- 3 nebula supergroup 172 2021-11-05 07:36 /vertex/player/part-00002-17293020-ba2e-4243-b834-34495c0536b3-c000.csv + -rw-r--r-- 3 nebula supergroup 172 2021-11-05 07:36 /vertex/player/part-00003-17293020-ba2e-4243-b834-34495c0536b3-c000.csv + -rw-r--r-- 3 nebula supergroup 144 2021-11-05 07:36 /vertex/player/part-00004-17293020-ba2e-4243-b834-34495c0536b3-c000.csv + -rw-r--r-- 3 nebula supergroup 173 2021-11-05 07:36 /vertex/player/part-00005-17293020-ba2e-4243-b834-34495c0536b3-c000.csv + -rw-r--r-- 3 nebula supergroup 160 2021-11-05 07:36 /vertex/player/part-00006-17293020-ba2e-4243-b834-34495c0536b3-c000.csv + -rw-r--r-- 3 nebula supergroup 148 2021-11-05 07:36 /vertex/player/part-00007-17293020-ba2e-4243-b834-34495c0536b3-c000.csv + -rw-r--r-- 3 nebula supergroup 125 2021-11-05 07:36 /vertex/player/part-00008-17293020-ba2e-4243-b834-34495c0536b3-c000.csv + -rw-r--r-- 3 nebula supergroup 119 2021-11-05 07:36 /vertex/player/part-00009-17293020-ba2e-4243-b834-34495c0536b3-c000.csv + ``` + + 2. Check the contents of the CSV file to ensure that the data export is successful. 
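Editorial note on step 4.2: the step names no command for inspecting the exported files. The following is a minimal sketch, not part of Exchange, assuming the data was exported to a local directory (a `file:///...` path). The helper name `preview_csv` is hypothetical. For the HDFS path used in this example, the equivalent check would go through the `hadoop fs -cat` command, as noted in the comment.

```bash
# Hypothetical helper (not shipped with Exchange): print the first N lines
# (default 5) of every part-*.csv file in a local export directory.
# For an HDFS export, a comparable check would be:
#   hadoop fs -cat '/vertex/player/part-*.csv' | head -n 5
preview_csv() {
  for f in "$1"/part-*.csv; do
    # When nothing matches, the glob stays literal, so test for existence.
    [ -e "$f" ] || { echo "no part-*.csv files in $1" >&2; return 1; }
    echo "== $f"
    head -n "${2:-5}" "$f"
  done
}
```

For the local path format shown in the parameter reference, a call might look like `preview_csv /home/nebula/vertex/player 3`. Each part file should contain one row per exported vertex, with the VID followed by the properties selected through `return.fields`.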
diff --git a/mkdocs.yml b/mkdocs.yml index 8c3c07eef1b..a86e3e3e290 100755 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -433,7 +433,7 @@ nav: - Introduction: - What is Nebula Exchange: nebula-exchange/about-exchange/ex-ug-what-is-exchange.md - Limitations: nebula-exchange/about-exchange/ex-ug-limitations.md - - Compile Exchange: nebula-exchange/ex-ug-compile.md + - Get Exchange: nebula-exchange/ex-ug-compile.md - Exchange configurations: - Options for import: nebula-exchange/parameter-reference/ex-ug-para-import-command.md - Parameters in the configuration file: nebula-exchange/parameter-reference/ex-ug-parameter.md @@ -451,6 +451,7 @@ nav: - Import data from Pulsar: nebula-exchange/use-exchange/ex-ug-import-from-pulsar.md - Import data from Kafka: nebula-exchange/use-exchange/ex-ug-import-from-kafka.md - Import data from SST files: nebula-exchange/use-exchange/ex-ug-import-from-sst.md + - Export data from Nebula Graph: nebula-exchange/use-exchange/ex-ug-export-from-nebula.md - Exchange FAQ: nebula-exchange/ex-ug-FAQ.md - Nebula Operator: