Installation steps on the host
conda create -n llm python=3.11
conda activate llm
# The command below installs intel_extension_for_pytorch==2.1.10+xpu by default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install transformers==4.37.0
pip install oneccl_bind_pt==2.1.100 --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
# Configure the oneAPI environment variables
source /opt/intel/oneapi/setvars.sh
pip install git+https://github.com/microsoft/DeepSpeed.git@ed8aed5
pip install git+https://github.com/intel/intel-extension-for-deepspeed.git@0eb734b
pip install mpi4py
conda install -c conda-forge -y gperftools=2.10 # to enable tcmalloc
Installed pip packages
(llm-deepspeed) root@test-server:~/test/ipex-llm/python/llm/example/GPU/Deepspeed-AutoTP# pip3 freeze
accelerate==0.23.0
annotated-types==0.7.0
bigdl-core-xe-21==2.5.0b20240620
bigdl-core-xe-addons-21==2.5.0b20240620
bigdl-core-xe-batch-21==2.5.0b20240620
certifi==2024.6.2
charset-normalizer==3.3.2
deepspeed @ git+https://github.com/microsoft/DeepSpeed.git@ed8aed5703d97b6e52d0fca3e4be285e21c005f2
filelock==3.15.3
fsspec==2024.6.0
hjson==3.1.0
huggingface-hub==0.23.4
idna==3.7
intel-cmplr-lib-ur==2024.2.0
intel-extension-for-pytorch==2.1.10+xpu
intel-openmp==2024.2.0
intel_extension_for_deepspeed @ file:///root/intel-extension-for-deepspeed
ipex-llm==2.1.0b20240620
Jinja2==3.1.4
MarkupSafe==2.1.5
mpi4py==3.1.6
mpmath==1.3.0
networkx==3.3
ninja==1.11.1.1
numpy==1.26.4
oneccl-bind-pt==2.1.100+xpu
packaging==24.1
pillow==10.3.0
protobuf==5.27.1
psutil==6.0.0
py-cpuinfo==9.0.0
pydantic==2.7.4
pydantic_core==2.18.4
pynvml==11.5.0
PyYAML==6.0.2rc1
regex==2024.5.15
requests==2.32.3
safetensors==0.4.3
sentencepiece==0.2.0
sympy==1.13.0rc2
tabulate==0.9.0
tokenizers==0.15.2
torch==2.1.0a0+cxx11.abi
torchvision==0.16.0a0+cxx11.abi
tqdm==4.66.4
transformers==4.37.0
typing_extensions==4.12.2
urllib3==2.2.2
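To confirm that the environment matches the pinned install commands, the `pip freeze` output can be checked mechanically. The following is a minimal sketch, not part of the original report: the helper names `normalize`, `parse_freeze`, and `check_pins` are hypothetical, the sample lines are copied from the listing above, and the comparison ignores a local version suffix such as `+xpu` when the pin does not specify one.

```python
import re

# Pins taken from the install commands in this report.
PINS = {
    "transformers": "4.37.0",
    "oneccl_bind_pt": "2.1.100",
    "intel_extension_for_pytorch": "2.1.10+xpu",
}

# A few lines copied from the `pip freeze` listing in this report.
FREEZE = """\
deepspeed @ git+https://github.com/microsoft/DeepSpeed.git@ed8aed5703d97b6e52d0fca3e4be285e21c005f2
intel-extension-for-pytorch==2.1.10+xpu
oneccl-bind-pt==2.1.100+xpu
transformers==4.37.0
"""


def normalize(name: str) -> str:
    """Normalize a project name so `oneccl_bind_pt` equals `oneccl-bind-pt`."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_freeze(text: str) -> dict:
    """Map normalized package name to version for `name==version` lines only."""
    installed = {}
    for line in text.splitlines():
        line = line.strip()
        if "==" not in line:
            continue  # skips direct references such as `name @ git+https://...`
        name, version = line.split("==", 1)
        installed[normalize(name)] = version
    return installed


def check_pins(pins: dict, installed: dict) -> list:
    """Return human-readable problems; an empty list means every pin is met."""
    problems = []
    for name, wanted in pins.items():
        found = installed.get(normalize(name))
        if found is None:
            problems.append(f"{name}: not installed")
            continue
        # Compare the local suffix (`+xpu`) only when the pin includes one.
        comparable = found if "+" in wanted else found.split("+", 1)[0]
        if comparable != wanted:
            problems.append(f"{name}: wanted {wanted}, found {found}")
    return problems


if __name__ == "__main__":
    print(check_pins(PINS, parse_freeze(FREEZE)))
```

For the sampled lines the check reports no problems, which suggests the failure in the run that follows is not caused by a mismatch in these three pins.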
(llm-deepspeed) root@test-server:~/test/ipex-llm/python/llm/example/GPU/Deepspeed-AutoTP# bash run_qwen_14b_arc_2_card.sh
:: initializing oneAPI environment ...
run_qwen_14b_arc_2_card.sh: BASH_VERSION = 5.1.16(1)-release
args: Using "$@" for setvars.sh arguments: --force
:: ccl -- latest
:: compiler -- latest
:: dal -- latest
:: debugger -- latest
:: dev-utilities -- latest
:: dnnl -- latest
:: dpcpp-ct -- latest
:: dpl -- latest
:: ipp -- latest
:: ippcp -- latest
:: mkl -- latest
:: mpi -- latest
:: tbb -- latest
:: oneAPI environment initialized ::
[0] /root/miniforge3/envs/llm-deepspeed/lib/python3.11/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: ''If you don't plan on using image functionality from torchvision.io, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have libjpeg or libpng installed before building torchvision from source?
[0] warn(
[1] /root/miniforge3/envs/llm-deepspeed/lib/python3.11/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: ''If you don't plan on using image functionality from torchvision.io, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have libjpeg or libpng installed before building torchvision from source?
[1] warn(
[0] [2024-06-21 23:13:11,872] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to xpu (auto detect)
[1] [2024-06-21 23:13:11,951] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to xpu (auto detect)
[0] [2024-06-21 23:13:12,241] [INFO] [real_accelerator.py:211:set_accelerator] Setting ds_accelerator to cpu (model specified)
[1] [2024-06-21 23:13:12,325] [INFO] [real_accelerator.py:211:set_accelerator] Setting ds_accelerator to cpu (model specified)
Loading checkpoint shards: 100%|██████████| 8/8 [00:16<00:00, 2.04s/it][1]
[1] [2024-06-21 23:13:29,421] [INFO] [logging.py:96:log_dist] [Rank -1] DeepSpeed info: version=0.14.1+ed8aed57, git-hash=ed8aed57, git-branch=HEAD
[1] [2024-06-21 23:13:29,422] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter replace_method is deprecated. This parameter is no longer needed, please remove from your call to DeepSpeed-inference
[1] [2024-06-21 23:13:29,422] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter mp_size is deprecated use tensor_parallel.tp_size instead
[1] [2024-06-21 23:13:29,422] [INFO] [logging.py:96:log_dist] [Rank -1] quantize_bits = 8 mlp_extra_grouping = False, quantize_groups = 1
Loading checkpoint shards: 100%|██████████| 8/8 [00:17<00:00, 2.21s/it][0]
[0] [2024-06-21 23:13:30,640] [INFO] [logging.py:96:log_dist] [Rank -1] DeepSpeed info: version=0.14.1+ed8aed57, git-hash=ed8aed57, git-branch=HEAD
[0] [2024-06-21 23:13:30,640] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter replace_method is deprecated. This parameter is no longer needed, please remove from your call to DeepSpeed-inference
[0] [2024-06-21 23:13:30,640] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter mp_size is deprecated use tensor_parallel.tp_size instead
[0] [2024-06-21 23:13:30,640] [INFO] [logging.py:96:log_dist] [Rank -1] quantize_bits = 8 mlp_extra_grouping = False, quantize_groups = 1
[1] Using /root/.cache/torch_extensions/py311_cpu as PyTorch extensions root...
[1] Emitting ninja build file /root/.cache/torch_extensions/py311_cpu/deepspeed_ccl_comm/build.ninja...
[1] Building extension module deepspeed_ccl_comm...
[1] Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1] ninja: no work to do.
[1] Loading extension module deepspeed_ccl_comm...
[0] Using /root/.cache/torch_extensions/py311_cpu as PyTorch extensions root...
[0] Emitting ninja build file /root/.cache/torch_extensions/py311_cpu/deepspeed_ccl_comm/build.ninja...
[0] Building extension module deepspeed_ccl_comm...
[0] Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[0] ninja: no work to do.
[0] Loading extension module deepspeed_ccl_comm...
[0] My guessed rank = 0
[0] 2024:06:21-23:13:40:(1676750) |CCL_WARN| sockets exchange mode is set. It may cause potential problem of 'Too many open file descriptors'
[1] My guessed rank = 1
[1] 2024:06:21-23:13:40:(1676751) |CCL_WARN| sockets exchange mode is set. It may cause potential problem of 'Too many open file descriptors'
[0] Time to load deepspeed_ccl_comm op: 0.11093568801879883 seconds
[0] DeepSpeed deepspeed.ops.comm.deepspeed_ccl_comm_op built successfully
[0] [2024-06-21 23:13:41,150] [INFO] [comm.py:161:init_deepspeed_backend] Initialize ccl backend
[1] Time to load deepspeed_ccl_comm op: 0.10797476768493652 seconds
[1] DeepSpeed deepspeed.ops.comm.deepspeed_ccl_comm_op built successfully
[1] [2024-06-21 23:13:41,150] [INFO] [comm.py:161:init_deepspeed_backend] Initialize ccl backend
[1] [2024-06-21 23:13:41,150] [INFO] [comm.py:637:init_distributed] cdb=<deepspeed.comm.ccl.CCLBackend object at 0x7fa20a3d3d90>
[0] [2024-06-21 23:13:41,150] [INFO] [comm.py:637:init_distributed] cdb=<deepspeed.comm.ccl.CCLBackend object at 0x7b8ba0ce0510>
[1] [2024-06-21 23:13:41,150] [INFO] [comm.py:652:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment...
[0] [2024-06-21 23:13:41,150] [INFO] [comm.py:652:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment...
[1] [2024-06-21 23:13:41,485] [INFO] [comm.py:702:mpi_discovery] Discovered MPI settings of world_rank=1, local_rank=1, world_size=2, master_addr=172.16.182.230, master_port=29500
[0] [2024-06-21 23:13:41,485] [INFO] [comm.py:702:mpi_discovery] Discovered MPI settings of world_rank=0, local_rank=0, world_size=2, master_addr=172.16.182.230, master_port=29500
[0] [2024-06-21 23:13:41,485] [INFO] [comm.py:662:init_distributed] Distributed backend already initialized
[0] 2024-06-21 23:13:44,774 - ipex_llm.transformers.utils - INFO - Converting the current model to sym_int4 format......
[1] 2024-06-21 23:13:44,774 - ipex_llm.transformers.utils - INFO - Converting the current model to sym_int4 format......
[1] /root/miniforge3/envs/llm-deepspeed/lib/python3.11/site-packages/torch/nn/init.py:412: UserWarning: Initializing zero-element tensors is a no-op
[1] warnings.warn("Initializing zero-element tensors is a no-op")
[0] /root/miniforge3/envs/llm-deepspeed/lib/python3.11/site-packages/torch/nn/init.py:412: UserWarning: Initializing zero-element tensors is a no-op
[0] warnings.warn("Initializing zero-element tensors is a no-op")
[1] AutoTP: [(<class 'transformers.models.qwen2.modeling_qwen2.Qwen2DecoderLayer'>, ['self_attn.o_proj', 'mlp.down_proj'])]
[1] Traceback (most recent call last):
[1] File "/root/test/ipex-llm/python/llm/example/GPU/Deepspeed-AutoTP/deepspeed_autotp.py", line 85, in
[1] model = optimize_model(model.module.to(f'cpu'), low_bit=low_bit).to(torch.float16)
[1] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[1] File "/root/miniforge3/envs/llm-deepspeed/lib/python3.11/site-packages/ipex_llm/optimize.py", line 253, in optimize_model
[1] model = ggml_convert_low_bit(model,
[1] ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[1] File "/root/miniforge3/envs/llm-deepspeed/lib/python3.11/site-packages/ipex_llm/transformers/convert.py", line 790, in ggml_convert_low_bit
[1] model = _optimize_pre(model)
[1] ^^^^^^^^^^^^^^^^^^^^
[1] File "/root/miniforge3/envs/llm-deepspeed/lib/python3.11/site-packages/ipex_llm/transformers/convert.py", line 739, in _optimize_pre
[1] model.apply(padding_mlp)
[1] File "/root/miniforge3/envs/llm-deepspeed/lib/python3.11/site-packages/torch/nn/modules/module.py", line 897, in apply
[1] module.apply(fn)
[1] File "/root/miniforge3/envs/llm-deepspeed/lib/python3.11/site-packages/torch/nn/modules/module.py", line 897, in apply
[1] module.apply(fn)
[1] File "/root/miniforge3/envs/llm-deepspeed/lib/python3.11/site-packages/torch/nn/modules/module.py", line 897, in apply
[1] module.apply(fn)
[1] [Previous line repeated 1 more time]
[1] File "/root/miniforge3/envs/llm-deepspeed/lib/python3.11/site-packages/torch/nn/modules/module.py", line 898, in apply
[1] fn(self)
[1] File "/root/miniforge3/envs/llm-deepspeed/lib/python3.11/site-packages/ipex_llm/transformers/models/qwen2.py", line 304, in padding_mlp
[1] new_gate_weight[:intermediate_size, :] = gate_weight
[1] ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
[1] RuntimeError: The expanded size of the tensor (2560) must match the existing size (5120) at non-singleton dimension 1. Target sizes: [13696, 2560]. Tensor sizes: [6848, 5120]
[0] AutoTP: [(<class 'transformers.models.qwen2.modeling_qwen2.Qwen2DecoderLayer'>, ['self_attn.o_proj', 'mlp.down_proj'])]
[0] Traceback (most recent call last):
[0] File "/root/test/ipex-llm/python/llm/example/GPU/Deepspeed-AutoTP/deepspeed_autotp.py", line 85, in
[0] model = optimize_model(model.module.to(f'cpu'), low_bit=low_bit).to(torch.float16)
[0] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[0] File "/root/miniforge3/envs/llm-deepspeed/lib/python3.11/site-packages/ipex_llm/optimize.py", line 253, in optimize_model
[0] model = ggml_convert_low_bit(model,
[0] ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[0] File "/root/miniforge3/envs/llm-deepspeed/lib/python3.11/site-packages/ipex_llm/transformers/convert.py", line 790, in ggml_convert_low_bit
[0] model = _optimize_pre(model)
[0] ^^^^^^^^^^^^^^^^^^^^
[0] File "/root/miniforge3/envs/llm-deepspeed/lib/python3.11/site-packages/ipex_llm/transformers/convert.py", line 739, in _optimize_pre
[0] model.apply(padding_mlp)
[0] File "/root/miniforge3/envs/llm-deepspeed/lib/python3.11/site-packages/torch/nn/modules/module.py", line 897, in apply
[0] module.apply(fn)
[0] File "/root/miniforge3/envs/llm-deepspeed/lib/python3.11/site-packages/torch/nn/modules/module.py", line 897, in apply
[0] module.apply(fn)
[0] File "/root/miniforge3/envs/llm-deepspeed/lib/python3.11/site-packages/torch/nn/modules/module.py", line 897, in apply
[0] module.apply(fn)
[0] [Previous line repeated 1 more time]
[0] File "/root/miniforge3/envs/llm-deepspeed/lib/python3.11/site-packages/torch/nn/modules/module.py", line 898, in apply
[0] fn(self)
[0] File "/root/miniforge3/envs/llm-deepspeed/lib/python3.11/site-packages/ipex_llm/transformers/models/qwen2.py", line 304, in padding_mlp
[0] new_gate_weight[:intermediate_size, :] = gate_weight
[0] ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
[0] RuntimeError: The expanded size of the tensor (2560) must match the existing size (5120) at non-singleton dimension 1. Target sizes: [13696, 2560]. Tensor sizes: [6848, 5120]
[0] free(): invalid pointer
[0]
[0] LIBXSMM_VERSION: main_stable-1.17-3651 (25693763)
[0] LIBXSMM_TARGET: spr [Intel(R) Xeon(R) Gold 6438N]
[0] Registry and code: 13 MB
[0] Command: python deepspeed_autotp.py --repo-id-or-model-path /root/ipex-llm/Qwen1.5-14B-Chat --low-bit sym_int4
[0] Uptime: 35.240733 s
[1] free(): invalid size
[1]
[1] LIBXSMM_VERSION: main_stable-1.17-3651 (25693763)
[1] LIBXSMM_TARGET: spr [Intel(R) Xeon(R) Gold 6438N]
[1] Registry and code: 13 MB
[1] Command: python deepspeed_autotp.py --repo-id-or-model-path /root/ipex-llm/Qwen1.5-14B-Chat --low-bit sym_int4[1]
[1] Uptime: 35.150173 s
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 0 PID 1676750 RUNNING AT test-server
= KILLED BY SIGNAL: 6 (Aborted)
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 1 PID 1676751 RUNNING AT test-server
= KILLED BY SIGNAL: 6 (Aborted)
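The shapes in the `RuntimeError` can be checked with plain arithmetic. The sketch below is an interpretation, not a confirmed diagnosis. It assumes the model has a hidden size of 5120 and an MLP intermediate size of 13696 (values consistent with the error message but not stated in the log), and that tensor parallelism over `world_size=2` splits `gate_proj` along its output dimension. The helper names are hypothetical.

```python
# Assumed model dimensions (not stated in the log; chosen to match the error).
HIDDEN_SIZE = 5120
INTERMEDIATE_SIZE = 13696
WORLD_SIZE = 2  # from "world_size=2" in the MPI discovery lines

# Shapes quoted in the RuntimeError.
TARGET_SHAPE = (13696, 2560)  # "Target sizes"
SOURCE_SHAPE = (6848, 5120)   # "Tensor sizes" (the existing gate_proj weight)


def column_parallel_shape(out_features: int, in_features: int, world_size: int) -> tuple:
    """Per-rank weight shape when the output dimension is split evenly."""
    if out_features % world_size != 0:
        raise ValueError("out_features must be divisible by world_size")
    return (out_features // world_size, in_features)


def mismatched_dims(target: tuple, source: tuple) -> list:
    """List every dimension where a slice assignment target[...] = source fails."""
    return [
        f"dim {i}: target {t} != source {s}"
        for i, (t, s) in enumerate(zip(target, source))
        if t != s
    ]


if __name__ == "__main__":
    shard = column_parallel_shape(INTERMEDIATE_SIZE, HIDDEN_SIZE, WORLD_SIZE)
    print(shard)  # (6848, 5120)
    print(mismatched_dims(TARGET_SHAPE, SOURCE_SHAPE))
```

Under these assumptions the existing weight `[6848, 5120]` is exactly one rank's shard of a `[13696, 5120]` matrix, while the destination slice `[13696, 2560]` pairs an unsharded intermediate size with a halved hidden size. This suggests that `padding_mlp` sizes its buffer from module attributes that no longer describe the sharded weight, though the log alone does not confirm which attributes are involved.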