Commit

Update tidb-lightning/tidb-lightning-data-source.md
Co-authored-by: xixirangrang <[email protected]>
lichunzhu and hfxsd authored Nov 24, 2022
1 parent 5173d1e commit 71828bb
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion tidb-lightning/tidb-lightning-data-source.md
@@ -22,7 +22,7 @@ When TiDB Lightning runs, it looks for all files in `data-source-dir` that match the naming rules
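|Schema file|A file that contains the `CREATE DATABASE` DDL statement|`${db_name}-schema-create.sql`|
|Data file|A data file that contains the data of an entire table; it is imported into `${db_name}.${table_name}`| <code>\${db_name}.\${table_name}.\${csv\|sql\|parquet}</code>|
|Data file| If a table is spread across multiple data files, each file name needs a file-number suffix | <code>\${db_name}.\${table_name}.001.\${csv\|sql\|parquet}</code> |
-|Compressed file| If any of the file types above has a compression suffix in its file name, TiDB Lightning decompresses it in streaming mode and then imports it | <code>\${db_name}.\${table_name}.\${csv\|sql\|parquet}.{compress}</code> |
+|Compressed file| If any of the file types above has a compression suffix in its file name, such as `gzip`, `snappy`, or `zstd`, TiDB Lightning decompresses it in streaming mode and then imports it | <code>\${db_name}.\${table_name}.\${csv\|sql\|parquet}.{compress}</code> |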

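TiDB Lightning processes data in parallel as much as possible. Because each file must be read sequentially, the data-processing goroutines run concurrently at the file level (controlled by the `region-concurrency` configuration). As a result, import performance is poor for large files. A single-file size of 256 MiB is generally recommended for the best performance.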
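
The file-size recommendation and the numbered-suffix naming rule can be combined in a small pre-processing step. The following is a minimal sketch, not part of TiDB Lightning: `split_csv` and its parameters are hypothetical names chosen for illustration. It assumes a CSV source with no header row and no line breaks inside quoted fields, so that cutting on line boundaries is safe, and it uses a three-digit suffix to mirror the `001` in the naming example.

```python
import os


def split_csv(src_path, out_dir, db_name, table_name, max_bytes=256 * 1024 * 1024):
    """Split a CSV file into line-aligned chunks named db.table.NNN.csv.

    Hypothetical helper for illustration only. Each chunk holds at most
    max_bytes, unless a single line is itself larger than max_bytes.
    Returns the chunk paths in order.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    out = None
    written = 0
    with open(src_path, "rb") as src:
        for line in src:
            # Open a new chunk for the first line, or when this line would
            # push the current (non-empty) chunk over the size limit.
            if out is None or (written > 0 and written + len(line) > max_bytes):
                if out is not None:
                    out.close()
                name = f"{db_name}.{table_name}.{len(paths) + 1:03d}.csv"
                path = os.path.join(out_dir, name)
                out = open(path, "wb")
                paths.append(path)
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return paths
```

Calling `split_csv("big.csv", "data-source", "mydb", "mytable")` (all four arguments are placeholder values) would write `mydb.mytable.001.csv`, `mydb.mytable.002.csv`, and so on into the output directory, so that each file can be handled by a separate worker.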
