[NPU] Support split lm_head for Qwen2 #12491

Merged

@Oscilloscope98 commented Dec 3, 2024

Description

https://github.com/analytics-zoo/nano/issues/1763#issuecomment-2514266370

  • Support split lm_head for Qwen2 with Python (cpp backend)
  • Fit with Python (acc lib/L0 backend)
  • Fit with C++ API
  • Remove the default mixed_precision=True in all-in-one and related examples

TODO:

  • Verify on Qwen2-7B (Python cpp backend)
  • Verify on MiniCPM-V-2_6 (Python acc lib backend)
  • Verify mixed_precision=True
  • Verify GW
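
For readers unfamiliar with the idea, a split lm_head generally means evaluating the final vocabulary projection in several smaller pieces rather than as one large matrix multiply. The sketch below is not taken from this PR. It assumes the split is along the vocabulary (output) dimension, which may differ from what the PR actually does, and every name in it (`matvec`, `split_rows`, `split_lm_head`) is hypothetical. It only illustrates that the split projection gives the same logits as the unsplit one.

```python
# Illustrative sketch only: NOT the implementation from this PR.
# Assumption: the lm_head weight is split along the vocabulary (row) dimension.
# All function names below are hypothetical.

def matvec(weight, hidden):
    """Multiply a (rows x cols) weight matrix by a hidden-state vector."""
    return [sum(w * h for w, h in zip(row, hidden)) for row in weight]


def split_rows(weight, num_splits):
    """Split the weight into num_splits contiguous, equally sized row blocks."""
    rows = len(weight)
    if rows % num_splits != 0:
        raise ValueError("vocab size must be divisible by num_splits")
    step = rows // num_splits
    return [weight[i:i + step] for i in range(0, rows, step)]


def split_lm_head(weight, hidden, num_splits):
    """Compute logits block by block and concatenate them in vocabulary order."""
    logits = []
    for block in split_rows(weight, num_splits):
        logits.extend(matvec(block, hidden))
    return logits


# Toy example: vocabulary of 4 tokens, hidden size of 2.
weight = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
hidden = [0.5, -1.0]

full_logits = matvec(weight, hidden)
split_logits = split_lm_head(weight, hidden, num_splits=2)
```

Each block here is a smaller, independent matrix multiply, and concatenating the per-block results in order reproduces the full logits.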

@Oscilloscope98 changed the title from "[NPU] Support split lm_head for Qwen2 for Python (cpp backend)" to "[NPU] Support split lm_head for Qwen2 with Python (cpp backend)" on Dec 3, 2024
@Oscilloscope98 changed the title from "[NPU] Support split lm_head for Qwen2 with Python (cpp backend)" to "[NPU] Support split lm_head for Qwen2" on Dec 4, 2024

@plusbang left a comment

Otherwise LGTM

python/llm/dev/benchmark/all-in-one/run.py (resolved)

@rnwang04 left a comment

LGTM

@Oscilloscope98 merged commit ef4028a into intel-analytics:main on Dec 4, 2024
1 check passed