Unable to run on dGPU #10515

dyedd · 2024-03-24T09:31:25Z

hello, I tried to run the code from https://gitee.com/Pauntech/chat-glm3/blob/master/chatglm3_web_demo.py,
but I face a problem.

LIBXSMM_VERSION: main_stable-1.17-3651 (25693763)
LIBXSMM_TARGET: adl [13th Gen Intel(R) Core(TM) i5-13600KF]
Registry and code: 13 MB
Uptime: 6.540896 s
Segmentation fault (core dumped)

you can see that it run on the cpu? but the code clearly offload to the xpu.
the result of sycl-ls：

[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2  [2023.16.12.0.12_195853.xmain-hotfix]
[opencl:cpu:1] Intel(R) OpenCL, 13th Gen Intel(R) Core(TM) i5-13600KF OpenCL 3.0 (Build 0) [2023.16.12.0.12_195853.xmain-hotfix]
[opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) A770 Graphics OpenCL 3.0 NEO  [23.35.27191.42]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Arc(TM) A770 Graphics 1.3 [1.3.27191]

I used the method from https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/KeyFeatures/multi_gpus_selection.html，but it still fail.

I also run the code from https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/chatglm3/streamchat.py.
but I face the problem:

Traceback (most recent call last):
  File "/home/dyedd/projects/agent/test/streamchat.py", line 59, in <module>
    output = model.generate(input_ids,
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/transformers/generation/utils.py", line 1538, in generate
    return self.greedy_search(
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/transformers/generation/utils.py", line 2362, in greedy_search
    outputs = self(
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/dyedd/.cache/huggingface/modules/transformers_modules/chatglm3-6b-base/modeling_chatglm.py", line 938, in forward
    transformer_outputs = self.transformer(
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/bigdl/llm/transformers/models/chatglm2.py", line 167, in chatglm2_model_forward
    hidden_states, presents, all_hidden_states, all_self_attentions = self.encoder(
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/dyedd/.cache/huggingface/modules/transformers_modules/chatglm3-6b-base/modeling_chatglm.py", line 641, in forward
    layer_ret = layer(
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/dyedd/.cache/huggingface/modules/transformers_modules/chatglm3-6b-base/modeling_chatglm.py", line 545, in forward
    attention_output, kv_cache = self.self_attention(
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/bigdl/llm/transformers/models/chatglm2.py", line 191, in chatglm2_attention_forward
    return forward_function(
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/bigdl/llm/transformers/models/chatglm2.py", line 437, in chatglm2_attention_forward_8eb45c
    key_layer, value_layer = append_kv_cache(cache_k, cache_v, key_layer, value_layer)
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/bigdl/llm/transformers/models/utils.py", line 66, in append_kv_cache
    new_cache_k[:, :, cache_k.size(2):cache_k.size(2) + key_states.size(2), :] = key_states
RuntimeError: The expanded size of the tensor (2) must match the existing size (32) at non-singleton dimension 1.  Target sizes: [1, 2, 1, 128].  Tensor sizes: [32, 1, 128]

so，how to run the code in dGPU.?

The text was updated successfully, but these errors were encountered:

jason-dai · 2024-03-24T13:27:23Z

Can you try the linux or windows quickstart to verify the installation?

dyedd · 2024-03-24T13:32:40Z

Can you try the linux or windows quickstart to verify the installation?

The installation is no problem。because the code can run well.

Oscilloscope98 · 2024-03-26T05:52:56Z

This bug will be fixed in #10540 :)

You could try again tomorrow with ipex-llm>=2.1.0b20240326. Refer to here for more installation details regarding ipex-llm. And please let us know for any further questions :)

dyedd · 2024-03-27T06:43:20Z

This bug will be fixed in #10540 :)

You could try again tomorrow with ipex-llm>=2.1.0b20240326. Refer to here for more installation details regarding ipex-llm. And please let us know for any further questions :)

NO，it still have new problem.

chatglm3/streamchat.py: --disable-stream:ipex_llm/transformers/models/chatglm2.py", line 432, in chatglm2_attention_forward_8eb45c new_cache_k[:] = cache_k RuntimeError: Native API failed. Native API returns: -6 (PI_ERROR_OUT_OF_HOST_MEMORY) -6 (PI_ERROR_OUT_OF_HOST_MEMORY);
chatglm3/streamchat.py: please notice it no disable-stream:

-------------------- Stream Chat Output --------------------
Traceback (most recent call last):
  File "/home/dyedd/projects/agent/./test/streamchat.py", line 57, in <module>
    for response, history in model.stream_chat(tokenizer, args.question, history=[]):
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
  File "/home/dyedd/.cache/huggingface/modules/transformers_modules/chatglm3-6b-base/modeling_chatglm.py", line 1078, in stream_chat
    response, new_history = self.process_response(response, history)
  File "/home/dyedd/.cache/huggingface/modules/transformers_modules/chatglm3-6b-base/modeling_chatglm.py", line 1004, in process_response
    metadata, content = response.split("\n", maxsplit=1)
ValueError: not enough values to unpack (expected 2, got 1)

dyedd · 2024-03-27T06:54:44Z

@Oscilloscope98

I also find ipex-llm can't support streamlit now.notice: this code still need replace new api

error

PackageNotFoundError: No package metadata was found for bitsandbytes
Traceback:
File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 542, in _run_script
    exec(code, module.__dict__)
File "/home/dyedd/projects/agent/test/web.py", line 31, in <module>
    tokenizer, model = get_model()
File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/streamlit/runtime/caching/cache_utils.py", line 210, in wrapper
    return cached_func(*args, **kwargs)
File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/streamlit/runtime/caching/cache_utils.py", line 239, in __call__
    return self._get_or_create_cached_value(args, kwargs)
File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/streamlit/runtime/caching/cache_utils.py", line 266, in _get_or_create_cached_value
    return self._handle_cache_miss(cache, value_key, func_args, func_kwargs)
File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/streamlit/runtime/caching/cache_utils.py", line 322, in _handle_cache_miss
    computed_value = self._info.func(*func_args, **func_kwargs)
File "/home/dyedd/projects/agent/test/web.py", line 18, in get_model
    model = AutoModel.from_pretrained(model_path,
File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 488, in from_pretrained
    return model_class.from_pretrained(
File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2256, in from_pretrained
    quantization_config, kwargs = BitsAndBytesConfig.from_dict(
File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/transformers/utils/quantization_config.py", line 189, in from_dict
    config = cls(**config_dict)
File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/transformers/utils/quantization_config.py", line 118, in __init__
    self.post_init()
File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/transformers/utils/quantization_config.py", line 144, in post_init
    if self.load_in_4bit and not version.parse(importlib.metadata.version("bitsandbytes")) >= version.parse(
File "/home/dyedd/.conda/envs/pt/lib/python3.10/importlib/metadata/__init__.py", line 996, in version
    return distribution(distribution_name).version
File "/home/dyedd/.conda/envs/pt/lib/python3.10/importlib/metadata/__init__.py", line 969, in distribution
    return Distribution.from_name(distribution_name)
File "/home/dyedd/.conda/envs/pt/lib/python3.10/importlib/metadata/__init__.py", line 548, in from_name
    raise PackageNotFoundError(name)

Zhangky11 · 2024-03-29T02:28:25Z

This bug will be fixed in #10540 :)
You could try again tomorrow with ipex-llm>=2.1.0b20240326. Refer to here for more installation details regarding ipex-llm. And please let us know for any further questions :)

NO，it still have new problem.

chatglm3/streamchat.py: --disable-stream:ipex_llm/transformers/models/chatglm2.py", line 432, in chatglm2_attention_forward_8eb45c new_cache_k[:] = cache_k RuntimeError: Native API failed. Native API returns: -6 (PI_ERROR_OUT_OF_HOST_MEMORY) -6 (PI_ERROR_OUT_OF_HOST_MEMORY);

chatglm3/streamchat.py: please notice it no disable-stream:
-------------------- Stream Chat Output --------------------
Traceback (most recent call last):
  File "/home/dyedd/projects/agent/./test/streamchat.py", line 57, in <module>
    for response, history in model.stream_chat(tokenizer, args.question, history=[]):
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
  File "/home/dyedd/.cache/huggingface/modules/transformers_modules/chatglm3-6b-base/modeling_chatglm.py", line 1078, in stream_chat
    response, new_history = self.process_response(response, history)
  File "/home/dyedd/.cache/huggingface/modules/transformers_modules/chatglm3-6b-base/modeling_chatglm.py", line 1004, in process_response
    metadata, content = response.split("\n", maxsplit=1)
ValueError: not enough values to unpack (expected 2, got 1)

We haven't been able to reproduce this issue yet on our Arc A770. Would you mind running the python/llm/scripts/env-check.sh script and paste the output here so that we can have more information regarding your environment?

dyedd · 2024-03-29T04:29:08Z

This bug will be fixed in #10540 :)
You could try again tomorrow with ipex-llm>=2.1.0b20240326. Refer to here for more installation details regarding ipex-llm. And please let us know for any further questions :)

NO，it still have new problem.

chatglm3/streamchat.py: --disable-stream:ipex_llm/transformers/models/chatglm2.py", line 432, in chatglm2_attention_forward_8eb45c new_cache_k[:] = cache_k RuntimeError: Native API failed. Native API returns: -6 (PI_ERROR_OUT_OF_HOST_MEMORY) -6 (PI_ERROR_OUT_OF_HOST_MEMORY);

chatglm3/streamchat.py: please notice it no disable-stream:
-------------------- Stream Chat Output --------------------
Traceback (most recent call last):
  File "/home/dyedd/projects/agent/./test/streamchat.py", line 57, in <module>
    for response, history in model.stream_chat(tokenizer, args.question, history=[]):
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
  File "/home/dyedd/.cache/huggingface/modules/transformers_modules/chatglm3-6b-base/modeling_chatglm.py", line 1078, in stream_chat
    response, new_history = self.process_response(response, history)
  File "/home/dyedd/.cache/huggingface/modules/transformers_modules/chatglm3-6b-base/modeling_chatglm.py", line 1004, in process_response
    metadata, content = response.split("\n", maxsplit=1)
ValueError: not enough values to unpack (expected 2, got 1)
We haven't been able to reproduce this issue yet on our Arc A770. Would you mind running the python/llm/scripts/env-check.sh script and paste the output here so that we can have more information regarding your environment?

NO problem.

env-check.sh

-----------------------------------------------------------------
PYTHON_VERSION=3.10.13
-----------------------------------------------------------------
transformers=4.31.0
-----------------------------------------------------------------
PyTorch is not installed. 
-----------------------------------------------------------------
ipex-llm Version: 2.1.0b20240326
-----------------------------------------------------------------
IPEX is not installed. 
-----------------------------------------------------------------
CPU Information: 
Architecture:                       x86_64
CPU op-mode(s):                     32-bit, 64-bit
Address sizes:                      39 bits physical, 48 bits virtual
Byte Order:                         Little Endian
CPU(s):                             20
On-line CPU(s) list:                0-19
Vendor ID:                          GenuineIntel
Model name:                         13th Gen Intel(R) Core(TM) i5-13600KF
CPU family:                         6
Model:                              183
Thread(s) per core:                 2
Core(s) per socket:                 14
Socket(s):                          1
Stepping:                           1
CPU max MHz:                        5100.0000
CPU min MHz:                        800.0000
BogoMIPS:                           6988.80
-----------------------------------------------------------------
MemTotal:       65679356 kB
-----------------------------------------------------------------
ulimit: 
real-time non-blocking time  (microseconds, -R) unlimited
core file size              (blocks, -c) 0
data seg size               (kbytes, -d) unlimited
scheduling priority                 (-e) 0
file size                   (blocks, -f) unlimited
pending signals                     (-i) 256260
max locked memory           (kbytes, -l) 8209916
max memory size             (kbytes, -m) unlimited
open files                          (-n) 1024
pipe size                (512 bytes, -p) 8
POSIX message queues         (bytes, -q) 819200
real-time priority                  (-r) 0
stack size                  (kbytes, -s) 8192
cpu time                   (seconds, -t) unlimited
max user processes                  (-u) 256260
virtual memory              (kbytes, -v) unlimited
file locks                          (-x) unlimited
-----------------------------------------------------------------
Operating System: 
Ubuntu 22.04.4 LTS \n \l

-----------------------------------------------------------------
Environment Variable: 
SHELL=/bin/bash
CONDA_EXE=/usr/local/miniconda3/bin/conda
_CE_M=
LC_ADDRESS=zh_CN.UTF-8
LC_NAME=zh_CN.UTF-8
LC_MONETARY=zh_CN.UTF-8
PWD=/home/dyedd/projects/agent
LOGNAME=dyedd
XDG_SESSION_TYPE=tty
CONDA_PREFIX=/home/dyedd/.conda/envs/pt
MOTD_SHOWN=pam
HOME=/home/dyedd
LANG=en_US.UTF-8
LC_PAPER=zh_CN.UTF-8
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.webp=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:
CONDA_PROMPT_MODIFIER=(pt) 
SSH_CONNECTION=127.0.0.1 57220 127.0.0.1 22
LESSCLOSE=/usr/bin/lesspipe %s %s
XDG_SESSION_CLASS=user
TERM=xterm-256color
LC_IDENTIFICATION=zh_CN.UTF-8
_CE_CONDA=
LESSOPEN=| /usr/bin/lesspipe %s
USER=dyedd
CONDA_SHLVL=2
SHLVL=2
LC_TELEPHONE=zh_CN.UTF-8
LC_MEASUREMENT=zh_CN.UTF-8
XDG_SESSION_ID=702
CONDA_PYTHON_EXE=/usr/local/miniconda3/bin/python
XDG_RUNTIME_DIR=/run/user/1000
SSH_CLIENT=127.0.0.1 57220 22
CONDA_DEFAULT_ENV=pt
LC_TIME=zh_CN.UTF-8
XDG_DATA_DIRS=/usr/share/gnome:/usr/local/share:/usr/share:/var/lib/snapd/desktop
PATH=/home/dyedd/.conda/envs/pt/bin:/usr/local/miniconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
SSH_TTY=/dev/pts/4
CONDA_PREFIX_1=/usr/local/miniconda3
LC_NUMERIC=zh_CN.UTF-8
OLDPWD=/home/dyedd/projects
_=/usr/bin/printenv
-----------------------------------------------------------------
xpu-smi is properly installed. 
-----------------------------------------------------------------
+-----------+--------------------------------------------------------------------------------------+
| Device ID | Device Information                                                                   |
+-----------+--------------------------------------------------------------------------------------+
| 0         | Device Name: Intel(R) Arc(TM) A770 Graphics                                          |
|           | Vendor Name: Intel(R) Corporation                                                    |
|           | SOC UUID: 00000000-0000-0003-0000-000856a08086                                       |
|           | PCI BDF Address: 0000:03:00.0                                                        |
|           | DRM Device: /dev/dri/card0                                                           |
|           | Function Type: physical                                                              |
+-----------+--------------------------------------------------------------------------------------+
-----------------------------------------------------------------

Zhangky11 · 2024-03-29T06:33:56Z

Based on the provided environment information, it seems that PyTorch and IPEX are not installed. Could you please set up the correct environment and then run the shell script?

Zhangky11 · 2024-03-29T06:35:32Z

You can follow the guide below to set up the environment or check if the environment is correct: https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#linux

dyedd · 2024-03-29T06:43:25Z

You can follow the guide below to set up the environment or check if the environment is correct: https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#linux

I forgot to source oneAPI environment，so the bash don't check truely.

env-check.sh

-----------------------------------------------------------------
PYTHON_VERSION=3.10.13
-----------------------------------------------------------------
transformers=4.31.0
-----------------------------------------------------------------
torch=2.1.0a0+cxx11.abi
-----------------------------------------------------------------
ipex-llm Version: 2.1.0b20240326
-----------------------------------------------------------------
/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: ''If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
ipex=2.1.10+xpu
-----------------------------------------------------------------
CPU Information: 
Architecture:                       x86_64
CPU op-mode(s):                     32-bit, 64-bit
Address sizes:                      39 bits physical, 48 bits virtual
Byte Order:                         Little Endian
CPU(s):                             20
On-line CPU(s) list:                0-19
Vendor ID:                          GenuineIntel
Model name:                         13th Gen Intel(R) Core(TM) i5-13600KF
CPU family:                         6
Model:                              183
Thread(s) per core:                 2
Core(s) per socket:                 14
Socket(s):                          1
Stepping:                           1
CPU max MHz:                        5100.0000
CPU min MHz:                        800.0000
BogoMIPS:                           6988.80
-----------------------------------------------------------------
MemTotal:       65679356 kB
-----------------------------------------------------------------
ulimit: 
real-time non-blocking time  (microseconds, -R) unlimited
core file size              (blocks, -c) 0
data seg size               (kbytes, -d) unlimited
scheduling priority                 (-e) 0
file size                   (blocks, -f) unlimited
pending signals                     (-i) 256260
max locked memory           (kbytes, -l) 8209916
max memory size             (kbytes, -m) unlimited
open files                          (-n) 1024
pipe size                (512 bytes, -p) 8
POSIX message queues         (bytes, -q) 819200
real-time priority                  (-r) 0
stack size                  (kbytes, -s) 8192
cpu time                   (seconds, -t) unlimited
max user processes                  (-u) 256260
virtual memory              (kbytes, -v) unlimited
file locks                          (-x) unlimited
-----------------------------------------------------------------
Operating System: 
Ubuntu 22.04.4 LTS \n \l

-----------------------------------------------------------------
Environment Variable: 
SHELL=/bin/bash
TBBROOT=/opt/intel/oneapi/tbb/2021.11/env/..
ONEAPI_ROOT=/opt/intel/oneapi
CONDA_EXE=/usr/local/miniconda3/bin/conda
_CE_M=
PKG_CONFIG_PATH=/opt/intel/oneapi/vtune/2024.0/include/pkgconfig/lib64:/opt/intel/oneapi/tbb/2021.11/env/../lib/pkgconfig:/opt/intel/oneapi/mpi/2021.11/lib/pkgconfig:/opt/intel/oneapi/mkl/2024.0/lib/pkgconfig:/opt/intel/oneapi/ippcp/2021.9/lib/pkgconfig:/opt/intel/oneapi/dpl/2022.3/lib/pkgconfig:/opt/intel/oneapi/dnnl/2024.0/lib/pkgconfig:/opt/intel/oneapi/dal/2024.0/lib/pkgconfig:/opt/intel/oneapi/compiler/2024.0/lib/pkgconfig:/opt/intel/oneapi/ccl/2021.11/lib/pkgconfig/:/opt/intel/oneapi/advisor/2024.0/include/pkgconfig/lib64:
USE_XETLA=OFF
LC_ADDRESS=zh_CN.UTF-8
ACL_BOARD_VENDOR_PATH=/opt/Intel/OpenCLFPGA/oneAPI/Boards
LC_NAME=zh_CN.UTF-8
FPGA_VARS_DIR=/opt/intel/oneapi/compiler/2024.0/opt/oclfpga
CCL_ROOT=/opt/intel/oneapi/ccl/2021.11
I_MPI_ROOT=/opt/intel/oneapi/mpi/2021.11
LC_MONETARY=zh_CN.UTF-8
FI_PROVIDER_PATH=/opt/intel/oneapi/mpi/2021.11/opt/mpi/libfabric/lib/prov:/usr/lib/x86_64-linux-gnu/libfabric
DNNLROOT=/opt/intel/oneapi/dnnl/2024.0
DIAGUTIL_PATH=/opt/intel/oneapi/dpcpp-ct/2024.0/etc/dpct/sys_check/sys_check.sh:/opt/intel/oneapi/debugger/2024.0/etc/debugger/sys_check/sys_check.py:/opt/intel/oneapi/compiler/2024.0/etc/compiler/sys_check/sys_check.sh
ADVISOR_2024_DIR=/opt/intel/oneapi/advisor/2024.0
PWD=/home/dyedd/projects/agent
CCL_CONFIGURATION=cpu_gpu_dpcpp
LOGNAME=dyedd
DPL_ROOT=/opt/intel/oneapi/dpl/2022.3
XDG_SESSION_TYPE=tty
CONDA_PREFIX=/home/dyedd/.conda/envs/pt
MANPATH=/opt/intel/oneapi/mpi/2021.11/share/man:/opt/intel/oneapi/debugger/2024.0/share/man:/opt/intel/oneapi/compiler/2024.0/documentation/en/man/common:
MOTD_SHOWN=pam
HOME=/home/dyedd
GDB_INFO=/opt/intel/oneapi/debugger/2024.0/share/info/
CCL_CONFIGURATION_PATH=
LANG=en_US.UTF-8
LC_PAPER=zh_CN.UTF-8
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.webp=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:
SETVARS_COMPLETED=1
CONDA_PROMPT_MODIFIER=(pt) 
APM=/opt/intel/oneapi/advisor/2024.0/perfmodels
CMAKE_PREFIX_PATH=/opt/intel/oneapi/tbb/2021.11/env/..:/opt/intel/oneapi/mkl/2024.0/lib/cmake:/opt/intel/oneapi/ipp/2021.10/lib/cmake/ipp:/opt/intel/oneapi/dpl/2022.3/lib/cmake/oneDPL:/opt/intel/oneapi/dnnl/2024.0/lib/cmake:/opt/intel/oneapi/dal/2024.0:/opt/intel/oneapi/compiler/2024.0
SSH_CONNECTION=127.0.0.1 56490 127.0.0.1 22
CMPLR_ROOT=/opt/intel/oneapi/compiler/2024.0
FPGA_VARS_ARGS=
INFOPATH=/opt/intel/oneapi/debugger/2024.0/opt/debugger/lib
IPPROOT=/opt/intel/oneapi/ipp/2021.10
IPP_TARGET_ARCH=intel64
LESSCLOSE=/usr/bin/lesspipe %s %s
XDG_SESSION_CLASS=user
PYTHONPATH=/opt/intel/oneapi/advisor/2024.0/pythonapi
TERM=xterm-256color
LC_IDENTIFICATION=zh_CN.UTF-8
_CE_CONDA=
DALROOT=/opt/intel/oneapi/dal/2024.0
LESSOPEN=| /usr/bin/lesspipe %s
USER=dyedd
LIBRARY_PATH=/opt/intel/oneapi/tbb/2021.11/env/../lib/intel64/gcc4.8:/opt/intel/oneapi/mpi/2021.11/lib:/opt/intel/oneapi/mkl/2024.0/lib/:/opt/intel/oneapi/ippcp/2021.9/lib/:/opt/intel/oneapi/ipp/2021.10/lib:/opt/intel/oneapi/dpl/2022.3/lib:/opt/intel/oneapi/dnnl/2024.0/lib:/opt/intel/oneapi/dal/2024.0/lib:/opt/intel/oneapi/compiler/2024.0/lib:/opt/intel/oneapi/ccl/2021.11/lib/
DAL_MAJOR_BINARY=2
SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
CONDA_SHLVL=2
IPPCRYPTOROOT=/opt/intel/oneapi/ippcp/2021.9
IPPCP_TARGET_ARCH=intel64
SHLVL=2
LC_TELEPHONE=zh_CN.UTF-8
VTUNE_PROFILER_2024_DIR=/opt/intel/oneapi/vtune/2024.0
OCL_ICD_FILENAMES=libintelocl_emu.so:libalteracl.so:/opt/intel/oneapi/compiler/2024.0/lib/libintelocl.so
LC_MEASUREMENT=zh_CN.UTF-8
XDG_SESSION_ID=708
CONDA_PYTHON_EXE=/usr/local/miniconda3/bin/python
CLASSPATH=/opt/intel/oneapi/mpi/2021.11/share/java/mpi.jar
INTELFPGAOCLSDKROOT=/opt/intel/oneapi/compiler/2024.0/opt/oclfpga
LD_LIBRARY_PATH=/opt/intel/oneapi/tbb/2021.11/env/../lib/intel64/gcc4.8:/opt/intel/oneapi/mpi/2021.11/opt/mpi/libfabric/lib:/opt/intel/oneapi/mpi/2021.11/lib:/opt/intel/oneapi/mkl/2024.0/lib:/opt/intel/oneapi/ippcp/2021.9/lib/:/opt/intel/oneapi/ipp/2021.10/lib:/opt/intel/oneapi/dpl/2022.3/lib:/opt/intel/oneapi/dnnl/2024.0/lib:/opt/intel/oneapi/debugger/2024.0/opt/debugger/lib:/opt/intel/oneapi/dal/2024.0/lib:/opt/intel/oneapi/compiler/2024.0/opt/oclfpga/host/linux64/lib:/opt/intel/oneapi/compiler/2024.0/opt/compiler/lib:/opt/intel/oneapi/compiler/2024.0/lib:/opt/intel/oneapi/ccl/2021.11/lib/
VTUNE_PROFILER_DIR=/opt/intel/oneapi/vtune/2024.0
XDG_RUNTIME_DIR=/run/user/1000
SSH_CLIENT=127.0.0.1 56490 22
CONDA_DEFAULT_ENV=pt
MKLROOT=/opt/intel/oneapi/mkl/2024.0
LC_TIME=zh_CN.UTF-8
DAL_MINOR_BINARY=0
XDG_DATA_DIRS=/usr/share/gnome:/usr/local/share:/usr/share:/var/lib/snapd/desktop
NLSPATH=/opt/intel/oneapi/mkl/2024.0/share/locale/%l_%t/%N:/opt/intel/oneapi/compiler/2024.0/lib/locale/%l_%t/%N
PATH=/opt/intel/oneapi/vtune/2024.0/bin64:/opt/intel/oneapi/mpi/2021.11/opt/mpi/libfabric/bin:/opt/intel/oneapi/mpi/2021.11/bin:/opt/intel/oneapi/mkl/2024.0/bin/:/opt/intel/oneapi/dpcpp-ct/2024.0/bin:/opt/intel/oneapi/dev-utilities/2024.0/bin:/opt/intel/oneapi/debugger/2024.0/opt/debugger/bin:/opt/intel/oneapi/compiler/2024.0/opt/oclfpga/bin:/opt/intel/oneapi/compiler/2024.0/bin:/opt/intel/oneapi/advisor/2024.0/bin64:/home/dyedd/.conda/envs/pt/bin:/usr/local/miniconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
SYCL_CACHE_PERSISTENT=1
INTEL_PYTHONHOME=/opt/intel/oneapi/debugger/2024.0/opt/debugger
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
SSH_TTY=/dev/pts/4
CONDA_PREFIX_1=/usr/local/miniconda3
CPATH=/opt/intel/oneapi/tbb/2021.11/env/../include:/opt/intel/oneapi/mpi/2021.11/include:/opt/intel/oneapi/mkl/2024.0/include:/opt/intel/oneapi/ippcp/2021.9/include:/opt/intel/oneapi/ipp/2021.10/include:/opt/intel/oneapi/dpl/2022.3/include:/opt/intel/oneapi/dpcpp-ct/2024.0/include:/opt/intel/oneapi/dnnl/2024.0/include:/opt/intel/oneapi/dev-utilities/2024.0/include:/opt/intel/oneapi/dal/2024.0/include/dal:/opt/intel/oneapi/compiler/2024.0/opt/oclfpga/include:/opt/intel/oneapi/ccl/2021.11/include
LC_NUMERIC=zh_CN.UTF-8
OLDPWD=/home/dyedd
_=/usr/bin/printenv
-----------------------------------------------------------------
xpu-smi is properly installed. 
-----------------------------------------------------------------
+-----------+--------------------------------------------------------------------------------------+
| Device ID | Device Information                                                                   |
+-----------+--------------------------------------------------------------------------------------+
| 0         | Device Name: Intel(R) Arc(TM) A770 Graphics                                          |
|           | Vendor Name: Intel(R) Corporation                                                    |
|           | SOC UUID: 00000000-0000-0003-0000-000856a08086                                       |
|           | PCI BDF Address: 0000:03:00.0                                                        |
|           | DRM Device: /dev/dri/card0                                                           |
|           | Function Type: physical                                                              |
+-----------+--------------------------------------------------------------------------------------+
-----------------------------------------------------------------

Zhangky11 · 2024-04-01T01:30:08Z

Sorry, we still can't reproduce the error you encountered while running chatglm3/streamchat.py. The error you mentioned is OUT_OF_HOST_MEMORY, indicating a memory overflow on the CPU, but you are actually running model inference on XPU. Therefore, could you please provide further details on the input parameters you used when running chatglm3/streamchat.py, including the question prompt, max_new_token, etc., so that we can further replicate the issue?

dyedd · 2024-04-01T02:05:30Z

Sorry, we still can't reproduce the error you encountered while running chatglm3/streamchat.py. The error you mentioned is OUT_OF_HOST_MEMORY, indicating a memory overflow on the CPU, but you are actually running model inference on XPU. Therefore, could you please provide further details on the input parameters you used when running chatglm3/streamchat.py, including the question prompt, max_new_token, etc., so that we can further replicate the issue?

My config is default，if you can‘t reproduce the error，I can provide ssh information to you？

Zhangky11 · 2024-04-01T02:30:18Z

Sure, you could leave your email address and I'll contact you.

dyedd · 2024-04-01T02:32:12Z

Sure, you could leave your email address and I'll contact you.

Zhangky11 · 2024-04-01T07:20:43Z

Please ensure that the modeling file for chatglm3 is downloaded from the official repository. You could go to ModelScope to download the corresponding file.

dyedd · 2024-04-01T07:24:44Z

Please ensure that the modeling file for chatglm3 is downloaded from the official repository. You could go to ModelScope to download the corresponding file.

Thanks, maybe I downloaded the basic model instead of the chat model

jason-dai added the user issue label Mar 24, 2024

qiuxin2012 assigned JinBridger Mar 25, 2024

glorysdj assigned sgwhat Mar 26, 2024

Oscilloscope98 mentioned this issue Mar 26, 2024

fix chatglm #10540

Merged

Oscilloscope98 unassigned JinBridger Mar 29, 2024

Oscilloscope98 assigned Zhangky11 Mar 29, 2024

dyedd closed this as completed Apr 1, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unable to run on dGPU #10515

Unable to run on dGPU #10515

dyedd commented Mar 24, 2024 •

edited

Loading

jason-dai commented Mar 24, 2024

dyedd commented Mar 24, 2024

Oscilloscope98 commented Mar 26, 2024

dyedd commented Mar 27, 2024

dyedd commented Mar 27, 2024 •

edited

Loading

Zhangky11 commented Mar 29, 2024 •

edited by Oscilloscope98

Loading

dyedd commented Mar 29, 2024

Zhangky11 commented Mar 29, 2024

Zhangky11 commented Mar 29, 2024

dyedd commented Mar 29, 2024

Zhangky11 commented Apr 1, 2024

dyedd commented Apr 1, 2024

Zhangky11 commented Apr 1, 2024

dyedd commented Apr 1, 2024 •

edited

Loading

Zhangky11 commented Apr 1, 2024

dyedd commented Apr 1, 2024

Unable to run on dGPU #10515

Unable to run on dGPU #10515

Comments

dyedd commented Mar 24, 2024 • edited Loading

jason-dai commented Mar 24, 2024

dyedd commented Mar 24, 2024

Oscilloscope98 commented Mar 26, 2024

dyedd commented Mar 27, 2024

dyedd commented Mar 27, 2024 • edited Loading

Zhangky11 commented Mar 29, 2024 • edited by Oscilloscope98 Loading

dyedd commented Mar 29, 2024

Zhangky11 commented Mar 29, 2024

Zhangky11 commented Mar 29, 2024

dyedd commented Mar 29, 2024

Zhangky11 commented Apr 1, 2024

dyedd commented Apr 1, 2024

Zhangky11 commented Apr 1, 2024

dyedd commented Apr 1, 2024 • edited Loading

Zhangky11 commented Apr 1, 2024

dyedd commented Apr 1, 2024

dyedd commented Mar 24, 2024 •

edited

Loading

dyedd commented Mar 27, 2024 •

edited

Loading

Zhangky11 commented Mar 29, 2024 •

edited by Oscilloscope98

Loading

dyedd commented Apr 1, 2024 •

edited

Loading