Traceback (most recent call last):
  File "/data/miniconda3/envs/env-3.8.8/lib/python3.8/site-packages/datasets/builder.py", line 1874, in _prepare_split_single
    writer.write_table(table)
  File "/data/miniconda3/envs/env-3.8.8/lib/python3.8/site-packages/datasets/arrow_writer.py", line 567, in write_table
    pa_table = pa_table.combine_chunks()
  File "pyarrow/table.pxi", line 3315, in pyarrow.lib.Table.combine_chunks
  File "pyarrow/error.pxi", line 144, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 100, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: offset overflow while concatenating arrays

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "run_uie_pretrain.py", line 509, in <module>
    main()
  File "run_uie_pretrain.py", line 148, in main
    datasets = load_dataset(
  File "/data/miniconda3/envs/env-3.8.8/lib/python3.8/site-packages/datasets/load.py", line 1782, in load_dataset
    builder_instance.download_and_prepare(
  File "/data/miniconda3/envs/env-3.8.8/lib/python3.8/site-packages/datasets/builder.py", line 872, in download_and_prepare
    self._download_and_prepare(
  File "/data/miniconda3/envs/env-3.8.8/lib/python3.8/site-packages/datasets/builder.py", line 967, in _download_and_prepare
    self._prepare_split(split_generator, **prepare_split_kwargs)
  File "/data/miniconda3/envs/env-3.8.8/lib/python3.8/site-packages/datasets/builder.py", line 1749, in _prepare_split
    for job_id, done, content in self._prepare_split_single(
  File "/data/miniconda3/envs/env-3.8.8/lib/python3.8/site-packages/datasets/builder.py", line 1892, in _prepare_split_single
    raise DatasetGenerationError("An error occurred while generating the dataset") from e
datasets.builder.DatasetGenerationError: An error occurred while generating the dataset
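Hello Dr. Lu, thank you very much for open-sourcing the UIE model code!
While loading the pretraining data I constructed, the program raised the error shown in the traceback above.
The error occurs when the dataset contains 5 million examples, but the program runs normally when the dataset is reduced to 1 million examples. Judging from the error, loading fails because the dataset is too large, even though memory was not exhausted at the time.
I therefore have a few questions I would like to ask you: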
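One possible explanation, which is my own guess and not confirmed here, is that Arrow's default `string` type stores 32-bit offsets, so a single contiguous string array can address at most 2**31 - 1 bytes. If `combine_chunks()` merges the chunks of a text column whose total size exceeds that limit, an "offset overflow" would follow even with free memory. The sketch below only does the arithmetic; the helper names and the 500-bytes-per-example figure are hypothetical and not measured from my data.

```python
# Assumption: pyarrow's default `string` type uses int32 offsets, so one
# contiguous array can hold at most 2**31 - 1 bytes of character data.
INT32_MAX = 2**31 - 1


def total_utf8_bytes(texts):
    """Return the summed UTF-8 byte length of an iterable of strings."""
    return sum(len(t.encode("utf-8")) for t in texts)


def would_overflow_int32_offsets(num_rows, avg_bytes_per_row):
    """Rough check: would one combined string column exceed the int32 range?"""
    return num_rows * avg_bytes_per_row > INT32_MAX


def max_rows_per_chunk(avg_bytes_per_row):
    """Largest row count whose combined text still fits in int32 offsets."""
    return INT32_MAX // avg_bytes_per_row


if __name__ == "__main__":
    avg = 500  # hypothetical average bytes per example, not a measured value
    print(would_overflow_int32_offsets(5_000_000, avg))  # 5 million examples
    print(would_overflow_int32_offsets(1_000_000, avg))  # 1 million examples
    print(max_rows_per_chunk(avg))
```

If this guess is right, the limit depends on the total bytes in one column rather than on available RAM, which would be consistent with the failure at 5 million examples and the success at 1 million.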