Spark related products add utf8 encoding hints (#2395)
cooper-lzy authored Dec 18, 2023
1 parent dc95bc6 commit b4fc696
Showing 8 changed files with 58 additions and 4 deletions.
9 changes: 9 additions & 0 deletions docs-2.0-en/connector/nebula-spark-connector.md
@@ -125,6 +125,15 @@ dataframe.write.nebula().writeEdges()

`nebula()` receives two configuration parameters: the connection configuration and the read-write configuration.
!!! note

If property values contain Chinese characters, encoding errors may occur. To avoid them, add the following options when submitting the Spark task:

```
--conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8
--conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8
```
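
As an alternative to repeating the flags on every submission, the same two settings can be kept in a properties file and passed with `spark-submit --properties-file`. The following is a minimal sketch under assumptions: the file location `PROPS_FILE` is hypothetical, and the script only writes the file and prints a hint; it does not submit a job.

```bash
# Sketch: store the two encoding options in a Spark properties file.
# PROPS_FILE is a hypothetical location; adjust it to your environment.
PROPS_FILE="${PROPS_FILE:-${TMPDIR:-/tmp}/spark-utf8.conf}"

# Same keys and values as the --conf options in the note above,
# written in the whitespace-separated form used by spark-defaults.conf.
cat > "${PROPS_FILE}" <<'EOF'
spark.driver.extraJavaOptions    -Dfile.encoding=utf-8
spark.executor.extraJavaOptions  -Dfile.encoding=utf-8
EOF

# Nothing is submitted here; this only shows how the file would be passed.
echo "spark-submit --properties-file ${PROPS_FILE} <other options> <your-app.jar>"
```

Spark's documentation describes `--properties-file` as taking the place of the default lookup of `conf/spark-defaults.conf`, so any defaults you rely on may need to be copied into the same file.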

### Reading data from NebulaGraph

```scala
9 changes: 9 additions & 0 deletions docs-2.0-en/graph-computing/nebula-algorithm.md
@@ -105,6 +105,15 @@ After the compilation, a similar file `nebula-algorithm-3.x.x.jar` is generated

## How to use

!!! note

If property values contain Chinese characters, encoding errors may occur. To avoid them, add the following options when submitting the Spark task:

```
--conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8
--conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8
```
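
When several jobs need the same options, a small shell wrapper can keep them in one place. This is only a sketch: `spark_submit_utf8` and the `SPARK_SUBMIT` override are hypothetical names introduced here and are not part of Spark or NebulaGraph Algorithm.

```bash
# Sketch: a wrapper that always adds the two UTF-8 options.
# spark_submit_utf8 and SPARK_SUBMIT are hypothetical names.
spark_submit_utf8() {
  # SPARK_SUBMIT may point at a specific binary; defaults to the one on PATH.
  "${SPARK_SUBMIT:-spark-submit}" \
    --conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8 \
    --conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8 \
    "$@"   # all remaining arguments are passed through unchanged
}
```

The function is called like `spark-submit` itself, for example `spark_submit_utf8 --master <mode> --class <main_class> <jar_path>`, with the placeholders filled in for your job.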

### Use algorithm interface (recommended)

The `lib` repository provides 10 common graph algorithms.
4 changes: 2 additions & 2 deletions docs-2.0-en/import-export/nebula-exchange/ex-ug-FAQ.md
@@ -107,9 +107,9 @@ Check that the NebulaGraph service port is configured correctly.

Check whether the version of Exchange is the same as that of NebulaGraph. For more information, see [Limitations](about-exchange/ex-ug-limitations.md).

-### Q: How to correct the messy code when importing Hive data into NebulaGraph?
+### Q: How to correct the encoding error when importing data in a Spark environment?

-It may happen if the property value of the data in Hive contains Chinese characters. The solution is to add the following options before the JAR package path in the import command:
+It may happen if the property value of the data contains Chinese characters. The solution is to add the following options before the JAR package path in the import command:

```bash
--conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8
@@ -8,6 +8,15 @@ After editing the configuration file, run the following commands to import speci
<spark_install_path>/bin/spark-submit --master "spark://HOST:PORT" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange-2.x.y.jar_path> -c <application.conf_path>
```

!!! note

If property values contain Chinese characters, encoding errors may occur. To avoid them, add the following options when submitting the Spark task:

```
--conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8
--conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8
```
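
Put together with the import command above, the options would sit before the JAR path. The sketch below composes and prints such a command instead of running it. The variable names and their default paths are placeholders introduced for illustration, and `HOST:PORT` is left unfilled as in the original command.

```bash
# Sketch: compose the Exchange import command with the encoding options
# placed before the JAR path. The command is printed, not executed.
# The default values below are placeholders, not real paths.
SPARK_INSTALL_PATH="${SPARK_INSTALL_PATH:-/path/to/spark}"
EXCHANGE_JAR="${EXCHANGE_JAR:-/path/to/nebula-exchange-2.x.y.jar}"
APP_CONF="${APP_CONF:-/path/to/application.conf}"

# The two options from the note above, joined into one string.
UTF8_OPTS="--conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8 \
--conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8"

SUBMIT_CMD="${SPARK_INSTALL_PATH}/bin/spark-submit --master \"spark://HOST:PORT\" \
${UTF8_OPTS} \
--class com.vesoft.nebula.exchange.Exchange ${EXCHANGE_JAR} -c ${APP_CONF}"

# Review the printed command before running it yourself.
echo "${SUBMIT_CMD}"
```

Printing first makes it easy to confirm that both `--conf` options appear ahead of the JAR path, since anything placed after the JAR path is handed to the application rather than to Spark.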

The following table lists command parameters.

| Parameter | Required | Default value | Description |
9 changes: 9 additions & 0 deletions docs-2.0-zh/connector/nebula-spark-connector.md
@@ -126,6 +126,15 @@ dataframe.write.nebula().writeEdges()

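`nebula()` receives two configuration parameters: the connection configuration and the read-write configuration.
!!! note

If property values contain Chinese characters, garbled text may appear. Add the following options when submitting the Spark task:

```
--conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8
--conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8
```

### Reading data from {{nebula.name}}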

```scala
9 changes: 9 additions & 0 deletions docs-2.0-zh/graph-computing/nebula-algorithm.md
@@ -106,6 +106,15 @@ The NebulaGraph Algorithm graph computing workflow is as follows:

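## How to use

!!! note

If property values contain Chinese characters, garbled text may appear. Add the following options when submitting the Spark task:

```
--conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8
--conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8
```

### Use algorithm interface (recommended)

The `lib` repository provides 10 common graph algorithms, which users can invoke programmatically.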
4 changes: 2 additions & 2 deletions docs-2.0-zh/import-export/nebula-exchange/ex-ug-FAQ.md
@@ -107,9 +107,9 @@ nebula-exchange-3.0.0.jar \

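Check whether the Exchange version matches the {{nebula.name}} version. For details, see [Limitations](about-exchange/ex-ug-limitations.md).

-### Q: How to fix garbled text when importing Hive data into {{nebula.name}}?
+### Q: How to fix garbled text when importing data in a Spark environment?

-This may happen if the property values of the data in Hive contain Chinese characters. The solution is to add the following options before the JAR package path in the import command:
+If the property values of the data contain Chinese characters, garbled text may appear. The solution is to add the following options before the JAR package path in the import command: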

```bash
--conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8
@@ -8,6 +8,15 @@
<spark_install_path>/bin/spark-submit --master "spark://HOST:PORT" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange-2.x.y.jar_path> -c <application.conf_path>
```

!!! note

If property values contain Chinese characters, garbled text may appear. Add the following options when submitting the Spark task:

```
--conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8
--conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8
```

The parameters are described as follows.

| Parameter | Required | Default value | Description |