support Llama2-7B / Llama3-8B for NPU C++ #12431

rnwang04 · 2024-11-22T08:51:22Z

Description

Unit test: Please manually trigger the PR Validation here by entering the PR number (e.g., 1234), and paste your action link here once it has finished successfully.

Oscilloscope98

LGTM

rnwang04 added 2 commits November 22, 2024 16:51

support llama2

a5ed4b1

update

f301b94

rnwang04 requested a review from jason-dai November 22, 2024 09:12

rnwang04 changed the title ~~support llama2 for NPU C++~~ support Llama2-7B / Llama3-8B for NPU C++ Nov 22, 2024

Oscilloscope98 approved these changes Nov 22, 2024

support fused_layers=4 for Llama2-7B

1624418

rnwang04 merged commit 0819fad into intel-analytics:main Nov 22, 2024
1 check passed

rnwang04 deleted the support_llama_npu_cpp branch November 22, 2024 10:47