optimize phi3 memory usage again #11867

MeouSker77 · 2024-08-20T08:51:29Z

Description

optimize phi3 memory usage again

1. Why the change?

2. User API changes

3. Summary of the change

4. How to test?


MeouSker77 · 2024-08-20T08:52:39Z

PR validation: https://github.com/intel-analytics/ipex-llm-workflow/actions/runs/10468415385

* feat: update readme for ppl test
* fix: textual adjustments
* fix: textual adjustments
* Add ipex-llm npu option in setup.py (#11858)
* add ipex-llm npu release
* update example doc
* meet latest release changes
* optimize phi3 memory usage (#11867)
* Update `ipex-llm` default transformers version to 4.37.0 (#11859)
* Update default transformers version to 4.37.0
* Add dependency requirements for qwen and qwen-vl
* Temp fix transformers version for these not yet verified models
* Skip qwen test in UT for now as it requires transformers<4.37.0
* Update performance test regarding updated default `transformers==4.37.0` (#11869)
* Update igpu performance from transformers 4.36.2 to 4.37.0 (#11841)
* upgrade arc perf test to transformers 4.37 (#11842)
* fix load low bit com dtype (#11832)
* feat: add mixed_precision argument on ppl longbench evaluation
* fix: delete extra code
* feat: upgrade arc perf test to transformers 4.37
* fix: add missing codes
* fix: keep perf test for qwen-vl-chat in transformers 4.36
* fix: remove extra space
* fix: resolve pr comment
* fix: add empty line
* fix: add pip install for spr and core test
* fix: delete extra comments
* fix: remove python -m for pip
* Revert "fix load low bit com dtype (#11832)"
  This reverts commit 6841a9a.
---------
Co-authored-by: Zhao Changmin
Co-authored-by: Jinhe Tang
* add transformers==4.36 for qwen vl in igpu-perf (#11846)
* add transformers==4.36.2 for qwen-vl
* Small update
---------
Co-authored-by: Yuwen Hu
* fix: remove qwen-7b on core test (#11851)
* fix: remove qwen-7b on core test
* fix: change delete to comment
---------
Co-authored-by: Jinhe Tang
* replce filename (#11854)
* fix: remove qwen-7b on core test
* fix: change delete to comment
* fix: replace filename
---------
Co-authored-by: Jinhe Tang
* fix: delete extra comments (#11863)
* Remove transformers installation for temp test purposes
* Small fix
* Small update
---------
Co-authored-by: Chu,Youcheng
Co-authored-by: Zhao Changmin
Co-authored-by: Jinhe Tang
Co-authored-by: Zijie Li
Co-authored-by: Chu,Youcheng
* Pytorch models transformers version update (#11860)
* yi sync
* delete 4.34 constraint
* delete 4.34 constraint
* delete 4.31 constraint
* delete 4.34 constraint
* delete 4.35 constraint
* added <=4.33.3 constraint
* added <=4.33.3 constraint
* switched to chinese prompt
* Update compresskv model forward type logic (#11868)
* update
* fix
* Update local import for ppl (#11866)
Co-authored-by: jenniew
* fix: textual adjustment
---------
Co-authored-by: SONG Ge
Co-authored-by: Yishuo Wang
Co-authored-by: Yuwen Hu
Co-authored-by: Zhao Changmin
Co-authored-by: Jinhe Tang
Co-authored-by: Zijie Li
Co-authored-by: Yina Chen
Co-authored-by: RyuKosei
Co-authored-by: jenniew

optimize phi3 memory usage

6153be0

MeouSker77 requested a review from rnwang04 August 20, 2024 09:06

rnwang04 approved these changes Aug 20, 2024

MeouSker77 merged commit d4ee0a8 into intel-analytics:main Aug 20, 2024
1 check passed

MeouSker77 deleted the optimize-phi3-memory-usage branch August 20, 2024 09:32
