LLM: add fuse optimization for Mistral. #9184

lalalapotter · 2023-10-16T07:56:55Z

Description

This PR adds fused RoPE (rotary position embedding) and norm optimizations for the Mistral-7B model.

1. Why the change?

Fusing the RoPE and norm operations on XPU avoids unnecessary copies and reduces latency.
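
For readers unfamiliar with what is being fused, here is a minimal reference sketch of unfused RoPE in plain Python. It is not the code from this PR: `rope_reference` and `rotate_half` are illustrative names, the base of 10000 is an assumed default, and lists stand in for tensors. Each intermediate list (`angles`, `cos`, `sin`, `rotated`) corresponds to a temporary that a fused kernel would not need to materialize.

```python
import math


def rotate_half(x):
    """Map the halves [x1, x2] to [-x2, x1]."""
    half = len(x) // 2
    return [-v for v in x[half:]] + x[:half]


def rope_reference(x, position, base=10000.0):
    """Unfused RoPE for a single head vector (illustrative only).

    `base` is an assumed default, not a value taken from this PR.
    """
    dim = len(x)
    half = dim // 2
    # One inverse frequency per pair of dimensions.
    inv_freq = [base ** (-2.0 * i / dim) for i in range(half)]
    # Angles are duplicated so both halves share the same frequencies.
    angles = [position * f for f in inv_freq] * 2
    cos = [math.cos(a) for a in angles]   # temporary 1
    sin = [math.sin(a) for a in angles]   # temporary 2
    rotated = rotate_half(x)              # temporary 3
    return [xi * c + ri * s for xi, c, ri, s in zip(x, cos, rotated, sin)]


query = [1.0, 2.0, 3.0, 4.0]
rotated_query = rope_reference(query, position=5)
```

The rotation leaves the vector's length unchanged, which is a convenient sanity check when comparing a fused implementation against a reference like this one.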

2. User API changes

None.

3. How to test?

- Unit test
- Application test

hkvision · 2023-10-16T08:25:06Z

python/llm/src/bigdl/llm/transformers/convert.py

@@ -348,4 +348,10 @@ def optimize(model):
        convert_forward(model,
                        module.AquilaRMSNorm,
                        llama_rms_norm_forward)
+    elif model.config.model_type == "mistral":
+        modeling_module_name = model.__class__.__module__
+        module = importlib.import_module(modeling_module_name)


Where's the attention forward?
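
For context, the hunk above relies on swapping the `forward` of matching submodules. The sketch below shows that general pattern in plain Python. It is not BigDL's `convert_forward`: `patch_forward`, `Module`, `RMSNorm`, `Attention`, and `fused_norm_forward` are hypothetical stand-ins chosen so the example runs without any framework.

```python
import types


class Module:
    """Tiny stand-in for a framework module that owns child modules."""

    def __init__(self, **children):
        self._children = dict(children)

    def children(self):
        return self._children.values()


class RMSNorm(Module):
    def forward(self, x):
        return "reference norm"


class Attention(Module):
    def forward(self, x):
        return "reference attention"


def patch_forward(model, target_cls, new_forward):
    """Recursively rebind `forward` on every submodule of type `target_cls`."""
    for child in model.children():
        if isinstance(child, target_cls):
            # Bind the replacement so it receives the instance as `self`.
            child.forward = types.MethodType(new_forward, child)
        patch_forward(child, target_cls, new_forward)


def fused_norm_forward(self, x):
    # Placeholder for an optimized implementation.
    return "fused norm"


model = Module(attn=Attention(), norm=RMSNorm())
patch_forward(model, RMSNorm, fused_norm_forward)
```

Only the class passed as `target_cls` is affected, so in this pattern the norm and the attention module would each need their own call.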

hkvision · 2023-10-16T08:36:44Z

python/llm/src/bigdl/llm/transformers/models/mistral.py

+                                                                    key_states,
+                                                                    position_ids,
+                                                                    "mistral")


Will this cause a style error?

Will check this.

lalalapotter added 3 commits October 16, 2023 15:51

add fuse optimization for mistral.

5a88dd9

fix.

2dc9606

fix

77e8b2c

lalalapotter added the llm label Oct 16, 2023

lalalapotter requested a review from hkvision October 16, 2023 07:56

lalalapotter self-assigned this Oct 16, 2023

fix style.

400aabc

hkvision reviewed Oct 16, 2023

lalalapotter added 2 commits October 16, 2023 16:25

fix.

983b375

fix error.

764afeb

hkvision approved these changes Oct 16, 2023

hkvision reviewed Oct 16, 2023

lalalapotter added 2 commits October 16, 2023 16:39

fix style.

d563650

fix style.

f468efb

lalalapotter merged commit 991a747 into intel-analytics:main Oct 16, 2023

liu-shaojun pushed a commit that referenced this pull request Mar 25, 2024

LLM: add fuse optimization for Mistral. (#9184)

5ca8a85

* add fuse optimization for mistral.
* fix.
* fix
* fix style.
* fix.
* fix error.
* fix style.
* fix style.
