NPU Baichuan2 Multi-Process example #11928
Conversation
parser.add_argument(
    "--repo-id-or-model-path",
    type=str,
    default="meta-llama/Llama-2-7b-chat-hf",
Please modify the default to a Baichuan model.
    help='Prompt to infer')
parser.add_argument("--n-predict", type=int, default=32, help="Max tokens to predict")
parser.add_argument("--max-output-len", type=int, default=1024)
parser.add_argument("--max-prompt-len", type=int, default=768)
`max-prompt-len` is better set to 512 by default.
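Putting both suggestions together, the argument parser could look like the sketch below. The Baichuan2 repo id (`baichuan-inc/Baichuan2-7B-Chat`) is an assumption standing in for whatever default the PR settles on, and `max-prompt-len` is lowered to the suggested 512:

```python
import argparse

# Sketch of the review suggestions applied: a Baichuan default repo id
# (assumed: "baichuan-inc/Baichuan2-7B-Chat") and max-prompt-len of 512.
parser = argparse.ArgumentParser(description="Baichuan2 NPU multi-process example")
parser.add_argument("--repo-id-or-model-path", type=str,
                    default="baichuan-inc/Baichuan2-7B-Chat")
parser.add_argument("--prompt", type=str, default="What is AI?",
                    help="Prompt to infer")
parser.add_argument("--n-predict", type=int, default=32,
                    help="Max tokens to predict")
parser.add_argument("--max-output-len", type=int, default=1024)
parser.add_argument("--max-prompt-len", type=int, default=512)

args = parser.parse_args([])  # parse with defaults only
print(args.max_prompt_len)  # 512
```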
    trust_remote_code=True,
    attn_implementation="eager",
    load_in_low_bit="sym_int4",
    enable_mp=True,
We have just updated the API; please change `enable_mp` to `optimize_model`.
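The rename is a straight keyword swap in the `from_pretrained` call: every other loading option stays the same. A minimal self-contained sketch of the change (using a plain dict in place of the real call, so no ipex-llm install is needed to follow it):

```python
# Hedged sketch of the reviewer's API rename: the keyword argument passed
# when loading the model changes from enable_mp=True to optimize_model=True,
# with all other options unchanged.
old_kwargs = {
    "trust_remote_code": True,
    "attn_implementation": "eager",
    "load_in_low_bit": "sym_int4",
    "enable_mp": True,
}

# Rename the deprecated key while preserving every other option.
new_kwargs = {("optimize_model" if k == "enable_mp" else k): v
              for k, v in old_kwargs.items()}
print(new_kwargs)
```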
@@ -108,3 +108,25 @@ def optimize_llm(
        prefill_runner=prefill_runner, decode_runner=decode_runner
    )
    convert_forward(model, module.MiniCPMModel, minicpm_model_forward)
elif model.config.model_type == "baichuan":
Maybe we need to make the check stricter, to avoid applying the optimization to Baichuan-13B.
Merge it first as initial support; will open another PR to fix.
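One way the stricter guard could look: since both Baichuan2 variants report `model_type == "baichuan"`, the check could also inspect the config's `hidden_size`. The sizes used here (4096 for 7B, 5120 for 13B, which also uses ALiBi rather than RoPE attention) are assumptions drawn from the published Baichuan2 configs, not from this PR:

```python
# Hedged sketch of a stricter guard: only apply the NPU optimization to
# 7B-sized Baichuan models. The hidden_size values (4096 for Baichuan2-7B,
# 5120 for Baichuan2-13B) are assumptions, not taken from this PR.
class DummyConfig:
    """Stand-in for a transformers model config."""
    def __init__(self, model_type, hidden_size):
        self.model_type = model_type
        self.hidden_size = hidden_size

def should_optimize(config):
    # Exclude Baichuan2-13B (larger hidden size, ALiBi attention) until it
    # is supported in a follow-up PR.
    return config.model_type == "baichuan" and config.hidden_size == 4096

print(should_optimize(DummyConfig("baichuan", 4096)))  # True  (7B-sized)
print(should_optimize(DummyConfig("baichuan", 5120)))  # False (13B-sized)
```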
NPU Baichuan2 Multi-Process example.
How to test?