LLM: add fuse optimization for Mistral. #9184

Merged
merged 8 commits into from
Oct 16, 2023

lalalapotter (Contributor) commented Oct 16, 2023

Description

This PR adds fused RoPE and norm optimizations for the Mistral-7B model.

1. Why the change?

The fused RoPE and norm optimizations target XPU. They avoid unnecessary copies and reduce latency.
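For reference, the two operations being fused can be written out in plain Python. This is only a sketch of the math, assuming the rotate-half RoPE layout and the usual RMSNorm definition; it is not the XPU kernel from this PR, and the function names (`rms_norm`, `rope`, `rope_qk_inplace`) are hypothetical. The in-place variant illustrates the general idea behind fusing: compute cos/sin once and update query and key in a single pass instead of building intermediate copies.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Reference RMSNorm: x / sqrt(mean(x^2) + eps), scaled elementwise by weight."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [w * v * inv_rms for v, w in zip(x, weight)]


def rope(x, position, base=10000.0):
    """Reference rotary embedding for one head vector (rotate-half layout).

    Element i is paired with element i + dim/2 and the pair is rotated
    by the angle position * base^(-2i/dim). Returns a new list.
    """
    dim = len(x)
    half = dim // 2
    out = [0.0] * dim
    for i in range(half):
        angle = position * base ** (-2.0 * i / dim)
        c, s = math.cos(angle), math.sin(angle)
        a, b = x[i], x[i + half]
        out[i] = a * c - b * s
        out[i + half] = b * c + a * s
    return out


def rope_qk_inplace(q, k, position, base=10000.0):
    """Illustrative 'fused' variant (hypothetical helper, not the PR's kernel).

    Rotates q and k in place in one loop, reusing each cos/sin pair,
    so no intermediate vectors are allocated.
    """
    dim = len(q)
    half = dim // 2
    for i in range(half):
        angle = position * base ** (-2.0 * i / dim)
        c, s = math.cos(angle), math.sin(angle)
        for v in (q, k):
            a, b = v[i], v[i + half]
            v[i] = a * c - b * s
            v[i + half] = b * c + a * s
```

Both paths should agree numerically; the difference is only in how many passes and temporaries are needed, which is where a fused kernel would save time on a real device.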

2. User API changes

None.

3. How to test?

  • Unit test
  • Application test

lalalapotter requested a review from hkvision on Oct 16, 2023, 07:56
lalalapotter self-assigned this on Oct 16, 2023
@@ -348,4 +348,10 @@ def optimize(model):
            convert_forward(model,
                            module.AquilaRMSNorm,
                            llama_rms_norm_forward)
        elif model.config.model_type == "mistral":
            modeling_module_name = model.__class__.__module__
            module = importlib.import_module(modeling_module_name)
Contributor

Where's the attention forward conversion?

Contributor Author

Fixed.
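To make the exchange above concrete, here is a minimal sketch of the forward-patching pattern the hunk relies on. It uses a hypothetical `TinyModule` stand-in rather than `torch.nn.Module`, and this `convert_forward` is an illustrative re-implementation, not necessarily the project's own. It shows why converting only the norm class leaves attention (where RoPE is normally applied) on the original path until its forward is converted too.

```python
import importlib


class TinyModule:
    """Hypothetical stand-in for torch.nn.Module: only tracks named children."""

    def __init__(self, **children):
        self._children = dict(children)

    def named_children(self):
        return self._children.items()


class TinyRMSNorm(TinyModule):
    def forward(self, x):
        return f"eager norm({x})"


class TinyAttention(TinyModule):
    def forward(self, x):
        return f"eager attn({x})"


def fused_norm_forward(self, x):
    # Placeholder for an optimized norm forward.
    return f"fused norm({x})"


def fused_attention_forward(self, x):
    # Placeholder for an optimized attention forward (RoPE lives here).
    return f"fused attn({x})"


def convert_forward(module, target_cls, new_forward):
    """Recursively rebind `forward` on every child that is a target_cls instance."""
    for _, child in module.named_children():
        if isinstance(child, target_cls):
            # Bind the plain function to this instance so `self` is passed.
            child.forward = new_forward.__get__(child, child.__class__)
        convert_forward(child, target_cls, new_forward)


def defining_module(obj):
    """Import the module that defines obj's class, as the hunk does for the model."""
    return importlib.import_module(obj.__class__.__module__)


norm, attn = TinyRMSNorm(), TinyAttention()
model = TinyModule(layer0=TinyModule(norm=norm, attn=attn))

# Converting only the norm leaves attention untouched.
convert_forward(model, TinyRMSNorm, fused_norm_forward)
attn_before = attn.forward("h")

# Converting the attention class as well covers the RoPE path.
convert_forward(model, TinyAttention, fused_attention_forward)
```

Looking the classes up through the model's defining module means the patch works without importing the model-specific classes by name at the top of the file.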

Comment on lines 89 to 91
key_states,
position_ids,
"mistral")
Contributor

Will this cause a style error?

Contributor Author

Will check this.

lalalapotter merged commit 991a747 into intel-analytics:main on Oct 16, 2023
liu-shaojun pushed a commit that referenced this pull request Mar 25, 2024
* add fuse optimization for mistral.

* fix.

* fix

* fix style.

* fix.

* fix error.

* fix style.

* fix style.

2 participants