Support baichuan2 for level0 pipeline #12289

plusbang · 2024-10-29T07:46:08Z

Description

Background: https://github.com/analytics-zoo/nano/issues/1706#issuecomment-2443913860

3. Summary of the change

Fix embedding when the padding index (padding_idx) is not None
Move embedding and lmhead to common
Add a pipeline input parameter for different model types
Add baichuan-related changes (change node to int64, generate IR and blob, convert)
Add an example
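
For readers unfamiliar with the padding index mentioned in the first item, here is a minimal illustrative sketch in plain Python. It is not the code from this PR. It only shows the general convention (as in `torch.nn.Embedding`) that the row at the padding index is held at zero, while the lookup itself stays a plain row gather. The helper names `make_embedding_table` and `embed` are hypothetical.

```python
from typing import List, Optional


def make_embedding_table(
    vocab_size: int, dim: int, padding_idx: Optional[int] = None
) -> List[List[float]]:
    """Create a toy embedding table (hypothetical helper, not from the PR).

    When padding_idx is given, that row is zeroed, following the usual
    convention for padding tokens.
    """
    table = [[float(r * dim + c + 1) for c in range(dim)] for r in range(vocab_size)]
    if padding_idx is not None:
        table[padding_idx] = [0.0] * dim
    return table


def embed(table: List[List[float]], token_ids: List[int]) -> List[List[float]]:
    """Embedding lookup is a row gather, whether or not padding_idx is set."""
    return [table[t] for t in token_ids]


if __name__ == "__main__":
    table = make_embedding_table(vocab_size=4, dim=2, padding_idx=0)
    print(embed(table, [0, 2]))
```

The point of the sketch is that a padding index changes the contents of one row, not the lookup path, so a conversion step has to handle both the `None` and non-`None` cases.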

4. How to test?

Application test

rnwang04 · 2024-10-29T11:13:06Z

python/llm/example/NPU/HF-Transformers-AutoModels/LLM/Pipeline-Models/baichuan2.py

+                        help='Prompt to infer')
+    parser.add_argument("--n-predict", type=int, default=32, help="Max tokens to predict")
+    parser.add_argument("--max-context-len", type=int, default=1024)
+    parser.add_argument("--quantization_group_size", type=int, default=0)


Maybe remove this argument for now, as we do not yet support GW for baichuan2?

Sure, I have removed it.
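
As a rough sketch of the outcome of this review thread, the parser below keeps only the two options that are fully visible in the diff excerpt and omits `--quantization_group_size`. The function name `build_parser` and the description string are assumptions for illustration. The real example script defines further arguments (such as the prompt) that are not fully shown here, so they are left out.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with only the options visible in the reviewed diff.

    build_parser and the description text are hypothetical; the flags and
    defaults are copied from the diff excerpt.
    """
    parser = argparse.ArgumentParser(description="baichuan2 NPU pipeline example (sketch)")
    parser.add_argument("--n-predict", type=int, default=32, help="Max tokens to predict")
    parser.add_argument("--max-context-len", type=int, default=1024)
    # --quantization_group_size is intentionally absent, following the review.
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args([])
    print(args.n_predict, args.max_context_len)
```

Passing `--quantization_group_size` to this parser would make argparse exit with an "unrecognized arguments" error, which makes the unsupported option visible to users instead of silently ignoring it.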

rnwang04

Everything else LGTM.

plusbang · 2024-10-29T11:24:10Z

Merging this first to unblock other model integrations. Please let me know if you have any concerns or comments, @jason-dai :)

plusbang added 3 commits October 29, 2024 15:12

change node dtype

c87d368

generate ir and blob, fix embedding

57d2366

refactor baichuan

8407f1a

plusbang changed the title ~~[NPU L0] Support baichuan2 for level0 pipeline~~ [WIP] Support baichuan2 for level0 pipeline Oct 29, 2024

plusbang added 6 commits October 29, 2024 15:50

fix

a62a77e

fix norm as const

a530aff

add cpp parameter

386d04b

fix

3333b6e

fix code style

2a2dc4c

add example

68fb9da

plusbang changed the title ~~[WIP] Support baichuan2 for level0 pipeline~~ Support baichuan2 for level0 pipeline Oct 29, 2024

plusbang requested review from rnwang04, hkvision and jason-dai October 29, 2024 10:35

plusbang marked this pull request as ready for review October 29, 2024 10:35

rnwang04 reviewed Oct 29, 2024


rnwang04 approved these changes Oct 29, 2024


rm

d702140

plusbang merged commit 3feb58d into intel-analytics:main Oct 29, 2024
1 check passed
