
Test transformers 41 #12

Open · wants to merge 194 commits into base: test_transformers_41

Conversation

@SANKHA1 (Owner) commented Nov 4, 2024

Description

1. Why the change?

2. User API changes

3. Summary of the change

4. How to test?

  • N/A
  • Unit test: Please manually trigger the PR Validation here by entering the PR number (e.g., 1234), and paste your action link here once it has finished successfully.
  • Application test
  • Document test
  • ...

5. New dependencies

  • New Python dependencies
    - Dependency1
    - Dependency2
    - ...
  • New Java/Scala dependencies and their license
    - Dependency1 and license1
    - Dependency2 and license2
    - ...

lzivan and others added 30 commits August 8, 2024 12:32
* Add openai-whisper pytorch gpu

* Update README.md

* Update README.md

* fix typo

* fix names update readme

* Update README.md
* updated qwen1.5B to all transformer==4.37 yaml

* updated qwen1.5B to all transformer==4.37 yaml
…11747)

mistral-7B-instruct-v0.2 and mistral-7B-instruct-v0.1 use different rope_theta values. Pass self.config.rope_theta to apply_rotary_pos_emb_no_cache_xpu to avoid output differences.
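The rope_theta base directly controls the rotary-embedding frequencies, so hard-coding one base while the model config specifies another changes the output. A minimal sketch of the effect (plain Python, hypothetical helper name, not the IPEX-LLM kernel):

```python
import math

def rope_angles(position, dim, theta):
    # Inverse frequencies for rotary position embedding:
    # inv_freq[i] = theta ** (-2*i / dim), one per pair of channels.
    inv_freq = [theta ** (-2.0 * i / dim) for i in range(dim // 2)]
    # Rotation angle applied to each channel pair at this token position.
    return [position * f for f in inv_freq]

# Two different bases give different angles for all but the first pair,
# which is why theta must be read from self.config rather than assumed.
a = rope_angles(10, 8, 1e4)
b = rope_angles(10, 8, 1e6)
assert a[0] == b[0] == 10.0   # first pair: theta ** 0 == 1
assert a[1] != b[1]           # later pairs diverge with the base
```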
* phi3 support compresskv

* fix phi3 mtl error

* fix conflict with quant kv

* fix abnormal on mtl

* fix style

* use sliding window size to compress kv

* support sliding window

* fix style

* fix style

* temp: partial support quant kv

* support quant kv with compress kv, todo: model check

* temp

* fix style

* fix style

* remove prepare

* address comment

* default -> 1.8k
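The sliding-window commits above amount to trimming the KV cache to the model's window length so memory stays bounded during long generations. A rough sketch of the idea (hypothetical function name, plain lists standing in for tensors, not the actual compresskv implementation):

```python
def compress_kv(keys, values, window_size):
    """Keep only the most recent `window_size` cache entries.

    `keys` and `values` are per-token cache entries; anything older
    than the sliding window is dropped before the next decode step.
    """
    if len(keys) <= window_size:
        return keys, values
    return keys[-window_size:], values[-window_size:]

k, v = compress_kv(list(range(10)), list(range(10)), window_size=4)
assert k == [6, 7, 8, 9]
assert v == [6, 7, 8, 9]
```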
* fix gptq of llama

* small fix
* support compress kv with lookahead

* enough kv miss param
* add perf mode

* update

* fix style
* Revert to use out-of-tree GPU driver since the performance with the out-of-tree driver is better than upstream's

* add spaces

* add troubleshooting case

* update Troubleshooting
* set mistral fuse rope to false except fp6 & fp16

* lint

* lint

---------

Co-authored-by: ATMxsp01 <[email protected]>
…#11760)

* All use 8192.txt for prompt preparation for now

* Small fix

* Fix text encoding mode to utf-8

* Small update
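The encoding fix above is the standard pitfall of relying on the platform-default text encoding when reading prompt files; results then differ between Linux and Windows. A minimal sketch (hypothetical file contents, not the benchmark harness code):

```python
import os
import tempfile

# Write and read a prompt file with an explicit encoding so the bytes
# round-trip identically regardless of the platform default.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "8192.txt")  # prompt file named in the commits
    with open(path, "w", encoding="utf-8") as f:
        f.write("prompt text with non-ASCII: é")
    with open(path, encoding="utf-8") as f:
        text = f.read()

assert text.endswith("é")
```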
* fix compresskv + lookahead attn_mask qwen2

* support llama chatglm

* support mistral & chatglm

* address comments

* revert run.py
* Reduce Mistral softmax memory only in low memory mode
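Reducing softmax memory in a low-memory mode typically means processing the attention-score matrix in chunks instead of exponentiating it all at once. A rough sketch of that trade-off (pure Python, illustrative only, not the Mistral kernel):

```python
import math

def softmax(row):
    m = max(row)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in row]
    s = sum(exps)
    return [e / s for e in exps]

def chunked_softmax(rows, chunk_size):
    # Process the score matrix a few rows at a time so only one chunk's
    # worth of exponentials is alive at once, trading speed for memory.
    out = []
    for start in range(0, len(rows), chunk_size):
        for row in rows[start:start + chunk_size]:
            out.append(softmax(row))
    return out

scores = [[0.0, 1.0, 2.0], [3.0, 3.0, 3.0]]
probs = chunked_softmax(scores, chunk_size=1)
assert all(abs(sum(p) - 1.0) < 1e-9 for p in probs)
assert probs[1] == [1/3, 1/3, 1/3]
```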
…est (#11778)

* add yaml and modify `concat_csv.py` for `transformers` 4.43.1 (#11758)

* add yaml and modify `concat_csv.py` for `transformers` 4.43.1

* remove 4.43 for arc; fix;

* remove 4096-512 for 4.43

* comment some models

* Small fix

* uncomment models (#11777)

---------

Co-authored-by: Ch1y0q <[email protected]>
* deepspeed zero3 QLoRA finetuning

* Update convert.py

* Update low_bit_linear.py

* Update utils.py

* Update qlora_finetune_llama2_13b_arch_2_card.sh

* Update low_bit_linear.py

* Update alpaca_qlora_finetuning.py

* Update low_bit_linear.py

* Update utils.py

* Update convert.py

* Update alpaca_qlora_finetuning.py

* Update alpaca_qlora_finetuning.py

* Update low_bit_linear.py

* Update deepspeed_zero3.json

* Update qlora_finetune_llama2_13b_arch_2_card.sh

* Update low_bit_linear.py

* Update low_bit_linear.py

* Update utils.py

* fix style

* fix style

* Update alpaca_qlora_finetuning.py

* Update qlora_finetune_llama2_13b_arch_2_card.sh

* Update convert.py

* Update low_bit_linear.py

* Update model.py

* Update alpaca_qlora_finetuning.py

* Update low_bit_linear.py

* Update low_bit_linear.py

* Update low_bit_linear.py
* Fix mistral forward_qkv without self.rotary_emb.base in q4_0.
* Replace apply_rotary_pos_emb_no_cache_xpu with rotary_half_inplaced.
* Revert #11765
* fix check error

* fix other models

* remove print
hzjane and others added 30 commits August 30, 2024 09:50
* update on readme after ipex-llm update

* update on readme after ipex-llm update

* rebase & delete redundancy

* revise

* add numbers for troubleshooting
* feat:add gptq for ppl

* fix: add an empty line

* fix: add an empty line

* fix: remove an empty line

* Resolve comments

* Resolve comments

* Resolve comments
* add initial support for minicpm-llama-v2.5

* update impl

* add minicpm-llama3-v2.5 example
* initial pr

* update npu model

* fix

* fix kv cache type

* fix

* small fix

* fix style

* fix model id

* change inter_pp=4

* address comment

* fix

* fix style

* fix

* rebase
* fix

* fix

* fix

* fix style

* fix style

* fix style
* fix

* meet comment
* update npu readme of multimodal

* small fix

* meet comment
* Add MiniCPM-V cpu example

* fix

* fix

* fix

* fix
…s during lookup generation (#11989)

* Fix garbage output for input_embeds inputs during lookup generation

* Fix on sliding windows

* Simplify code
* Update GraphRAG QuickStart

* Further updates

* Small fixes

* Small fix
* add save &  load support

* fix style
* fix dependabot alerts

* update