
add initial support for intel npu acceleration library #11347

Merged

Conversation


@MeouSker77 MeouSker77 commented Jun 18, 2024

Description

Add initial support for the Intel NPU acceleration library. This change keeps ipex-llm's existing API and calls intel_npu_acceleration_library's compile function directly.
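
A minimal sketch of this approach, assuming intel_npu_acceleration_library exposes a top-level compile(model, dtype=...) entry point as shown in that library's README; load_for_npu and the fixed dtype here are illustrative, not the actual ipex-llm code:

import intel_npu_acceleration_library
import torch
from transformers import AutoModelForCausalLM


def load_for_npu(model_path, **kwargs):
    # load the checkpoint with plain transformers first
    model = AutoModelForCausalLM.from_pretrained(model_path, **kwargs)
    # then hand the model to the NPU library's compile function directly,
    # which rewrites the supported layers (mainly Linear) to run on the NPU
    return intel_npu_acceleration_library.compile(model, dtype=torch.int8)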

User API changes

from ipex_llm.transformers.npu_model import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True, load_in_low_bit='sym_int4')

ipex_llm.transformers.npu_model.AutoModelForCausalLM accepts all of the arguments of ipex_llm.transformers.AutoModelForCausalLM, but most of them are not supported yet and are simply ignored.

load_in_low_bit currently only supports sym_int4, sym_int8 and fp32.

Note: importing ipex prevents intel_npu_acceleration_library from finding the NPU, so ipex auto-import must be disabled when using this API by setting the environment variable BIGDL_IMPORT_IPEX=0 (e.g. set BIGDL_IMPORT_IPEX=0 on Windows), as in the sketch below.
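
A minimal usage sketch combining the API above with the required environment variable; model_path is a placeholder, and setting the variable from Python is just one option (it can equally be set in the shell before launching):

import os

# ipex auto-import must be disabled before ipex_llm is imported,
# otherwise intel_npu_acceleration_library cannot find the NPU
os.environ['BIGDL_IMPORT_IPEX'] = '0'

from ipex_llm.transformers.npu_model import AutoModelForCausalLM

model_path = 'path/to/model'  # placeholder
model = AutoModelForCausalLM.from_pretrained(
    model_path, trust_remote_code=True, load_in_low_bit='sym_int4'
)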

@MeouSker77 MeouSker77 requested a review from rnwang04 June 18, 2024 07:32
import torch
from intel_npu_acceleration_library.dtypes import int4, int8  # assumed import path
# map ipex-llm low-bit strings to NPU library dtypes
low_bit_to_dtype_map = {
    'sym_int4': int4,
    'sym_int8': int8,
    'fp32': torch.float,
}

Contributor:

how about fp16?

MeouSker77 (Contributor, Author):

It only moves the Linear computation to the NPU; other ops still run on the CPU, and the CPU doesn't support most fp16 operations.

@rnwang04 rnwang04 left a comment

others LGTM as an initial PR.

@jason-dai commented:

> Note: importing ipex prevents intel_npu_acceleration_library from finding the NPU, so ipex auto-import must be disabled by setting BIGDL_IMPORT_IPEX=0

Maybe add another ENV "BIGDL_USE_NPU" to disable auto import

@MeouSker77 commented:

> Maybe add another ENV "BIGDL_USE_NPU" to disable auto import

added it
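
A sketch of the resulting workflow; the exact value checked for BIGDL_USE_NPU is an assumption, not something spelled out in this thread:

import os

# assumption: setting BIGDL_USE_NPU tells ipex-llm to skip the ipex auto-import,
# so BIGDL_IMPORT_IPEX=0 no longer needs to be set by hand
os.environ['BIGDL_USE_NPU'] = '1'

from ipex_llm.transformers.npu_model import AutoModelForCausalLM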


@jason-dai jason-dai left a comment

LGTM

@MeouSker77 MeouSker77 merged commit 83082e5 into intel-analytics:main Jun 18, 2024
17 of 18 checks passed
@MeouSker77 MeouSker77 deleted the add-initial-support-for-npu-lib branch June 18, 2024 08:07