Spark related products add utf8 encoding hints (#2395)

#2391
vesoft-inc · Dec 18, 2023 · b4fc696
1 parent dc95bc6
Showing 8 changed files with 58 additions and 4 deletions.
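
For orientation, here is a minimal sketch of where the two `--conf` options from this commit sit in a full `spark-submit` invocation. It reuses the Exchange command shape from the import-command page. The `SPARK_HOME`, `EXCHANGE_JAR`, and `APP_CONF` defaults are hypothetical placeholders, and the script only prints the command rather than running it.

```shell
#!/bin/sh
# Sketch only: prints the spark-submit command instead of executing it.
# The three defaults below are hypothetical placeholders, not documented values.
SPARK_HOME="${SPARK_HOME:-/opt/spark}"
EXCHANGE_JAR="${EXCHANGE_JAR:-/path/to/nebula-exchange.jar}"
APP_CONF="${APP_CONF:-/path/to/application.conf}"

# Ask the driver and executor JVMs to use UTF-8 as their default encoding.
ENCODING_OPTS="--conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8 --conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8"

print_submit_command() {
  # The --conf options go before the JAR path: arguments after the JAR
  # are handed to the application, not to spark-submit.
  # ENCODING_OPTS is intentionally unquoted so it splits into four words.
  echo "$SPARK_HOME/bin/spark-submit" \
    --master "spark://HOST:PORT" \
    $ENCODING_OPTS \
    --class com.vesoft.nebula.exchange.Exchange \
    "$EXCHANGE_JAR" -c "$APP_CONF"
}

print_submit_command
```

Printing the command first makes it easy to confirm that both options precede the JAR path before submitting a real job.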
diff --git a/docs-2.0-en/connector/nebula-spark-connector.md b/docs-2.0-en/connector/nebula-spark-connector.md
@@ -125,6 +125,15 @@ dataframe.write.nebula().writeEdges()
 
 `nebula()` receives two configuration parameters, including connection configuration and read-write configuration.
 
+!!! note
+
+    If property values contain Chinese characters, encoding errors may occur. To avoid them, add the following options when submitting the Spark task:
+
+    ```
+    --conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8
+    --conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8
+    ```
+
 ### Reading data from NebulaGraph
 
 ```scala

diff --git a/docs-2.0-en/graph-computing/nebula-algorithm.md b/docs-2.0-en/graph-computing/nebula-algorithm.md
@@ -105,6 +105,15 @@ After the compilation, a similar file `nebula-algorithm-3.x.x.jar` is generated
 
 ## How to use
 
+!!! note
+
+    If property values contain Chinese characters, encoding errors may occur. To avoid them, add the following options when submitting the Spark task:
+
+    ```
+    --conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8
+    --conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8
+    ```
+
 ### Use algorithm interface (recommended)
 
 The `lib` repository provides 10 common graph algorithms.

diff --git a/docs-2.0-en/import-export/nebula-exchange/ex-ug-FAQ.md b/docs-2.0-en/import-export/nebula-exchange/ex-ug-FAQ.md
@@ -107,9 +107,9 @@ Check that the NebulaGraph service port is configured correctly.
 
 Check whether the version of Exchange is the same as that of NebulaGraph. For more information, see [Limitations](about-exchange/ex-ug-limitations.md).
 
-### Q: How to correct the messy code when importing Hive data into NebulaGraph?
+### Q: How to correct the encoding error when importing data in a Spark environment?
 
-It may happen if the property value of the data in Hive contains Chinese characters. The solution is to add the following options before the JAR package path in the import command:
+This can happen when property values in the data contain Chinese characters. The solution is to add the following options before the JAR package path in the import command:
 
 ```bash
 --conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8

diff --git a/.../import-export/nebula-exchange/parameter-reference/ex-ug-para-import-command.md b/.../import-export/nebula-exchange/parameter-reference/ex-ug-para-import-command.md
@@ -8,6 +8,15 @@ After editing the configuration file, run the following commands to import speci
 <spark_install_path>/bin/spark-submit --master "spark://HOST:PORT" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange-2.x.y.jar_path> -c <application.conf_path> 
 ```
 
+!!! note
+
+    If property values contain Chinese characters, encoding errors may occur. To avoid them, add the following options when submitting the Spark task:
+
+    ```
+    --conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8
+    --conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8
+    ```
+
 The following table lists command parameters.
 
 | Parameter | Required | Default value | Description |

diff --git a/docs-2.0-zh/connector/nebula-spark-connector.md b/docs-2.0-zh/connector/nebula-spark-connector.md
@@ -126,6 +126,15 @@ dataframe.write.nebula().writeEdges()
 
 `nebula()`接收两个配置参数，包括连接配置和读写配置。
 
+!!! note
+
+    如果数据的属性值包含中文字符，可能出现乱码。请在提交 Spark 任务时加上以下选项：
+
+    ```
+    --conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8
+    --conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8
+    ```
+
 ### 从 {{nebula.name}} 读取数据
 
 ```scala

diff --git a/docs-2.0-zh/graph-computing/nebula-algorithm.md b/docs-2.0-zh/graph-computing/nebula-algorithm.md
@@ -106,6 +106,15 @@ NebulaGraph Algorithm 实现图计算的流程如下：
 
 ## 使用方法
 
+!!! note
+
+    如果数据的属性值包含中文字符，可能出现乱码。请在提交 Spark 任务时加上以下选项：
+
+    ```
+    --conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8
+    --conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8
+    ```
+
 ### 调用算法接口（推荐）
 
 `lib`库中提供了 10 种常用图计算算法，用户可以通过编程调用的形式调用算法。

diff --git a/docs-2.0-zh/import-export/nebula-exchange/ex-ug-FAQ.md b/docs-2.0-zh/import-export/nebula-exchange/ex-ug-FAQ.md
@@ -107,9 +107,9 @@ nebula-exchange-3.0.0.jar \
 
 检查 Exchange 版本与 {{nebula.name}} 版本是否匹配，详细信息可参考[使用限制](about-exchange/ex-ug-limitations.md)。
 
-### Q：将 Hive 中的数据导入 {{nebula.name}} 时出现乱码如何解决？
+### Q：Spark 环境中导入数据时出现乱码如何解决？
 
-如果 Hive 中数据的属性值包含中文字符，可能出现该情况。解决方案是在导入命令中的 JAR 包路径前加上以下选项：
+如果数据的属性值包含中文字符，可能出现乱码。解决方案是在导入命令中的 JAR 包路径前加上以下选项：
 
 ```bash
 --conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8

diff --git a/.../import-export/nebula-exchange/parameter-reference/ex-ug-para-import-command.md b/.../import-export/nebula-exchange/parameter-reference/ex-ug-para-import-command.md
@@ -8,6 +8,15 @@
 <spark_install_path>/bin/spark-submit --master "spark://HOST:PORT" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange-2.x.y.jar_path> -c <application.conf_path> 
 ```
 
+!!! note
+
+    如果数据的属性值包含中文字符，可能出现乱码。请在提交 Spark 任务时加上以下选项：
+
+    ```
+    --conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8
+    --conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8
+    ```
+
 参数说明如下。
 
 | 参数 | 是否必需 | 默认值 | 说明 |