
Created new changes in the functions #7

Open
wants to merge 585 commits into base: ipex-vllm-mainline

Conversation

SANKHA1
Owner

@SANKHA1 SANKHA1 commented Oct 14, 2024

Description

1. Why the change?

2. User API changes

3. Summary of the change

4. How to test?

  • N/A
  • Unit test: Please manually trigger the PR Validation here by inputting the PR number (e.g., 1234). Paste the action link here once it has finished successfully.
  • Application test
  • Document test
  • ...

5. New dependencies

  • New Python dependencies
    - Dependency1
    - Dependency2
    - ...
  • New Java/Scala dependencies and their license
    - Dependency1 and license1
    - Dependency2 and license2
    - ...

rnwang04 and others added 30 commits October 28, 2024 16:05
* first commit

* update example

* fix style

* update example

* embedding as const

* fix generate

* code refactor

* meet code review

* fix style

* change max_output_len to max_context_len

* fix all-in-one

* fix example

* add check for new tokens
* except lm_head

* remove

* support gw lm_head

* update

* fix

* remove run.bat

* fix style

* support llama3
* update layernorm & code refactor

* fix style

* add common utils

* change to Pool()

* remove print
* Add ollama_quickstart.zh-CN.md

Add ollama_quickstart.zh-CN.md

* Update ollama_quickstart.zh-CN.md

Add Chinese and English switching

* Update ollama_quickstart.md

Add Chinese and English switching

* Update README.zh-CN.md

Modify the related link to ollama_quickstart.zh-CN.md

* Update ollama_quickstart.zh-CN.md

Modified based on comments.

* Update ollama_quickstart.zh-CN.md

Modified based on comments
…_size=0` (#12282)

* Initial support for quantized forward on CPU when quantization_group_size=0

* Style fix

* Style fix

* Small fix

* Small fix
* support save & load, update llama examples

* update baichuan2 example

* update readme
* except lm_head

* remove

* support gw lm_head

* update

* fix

* remove run.bat

* fix style

* support llama3

* slice -> split

* remove debug

* fix style

* add dpu
* bugfix for qlora 100 step error

* indent fix

* annotation fix
* qwen2 gw performance opt

* remove debug
* feat: change oneccl

* fix: restore llama-70b

* fix: remove tab

* fix: remove extra blank

* small fix

* add comments

* fix: add a blank space
bitsandbytes multi-backend is now available and is required; otherwise it errors out saying that no CUDA is available
* support qwen pipeline

* update error msg

* style

* meet review

* minor
* new codegeex attn

* use kv cache

* add compress/quantize kv

* remove compress/quantize kv

* fix style check

* fix style

* fix codegeex
* fix graphrag quickstart

* fix axolotl quickstart

* fix ragflow quickstart

* fix ragflow quickstart

* fix graphrag toc

* fix comments

* fix comment

* fix comments
* prefill use sdp

* add param

* update

* fix style

* fix style

* meet comments
MeouSker77 and others added 30 commits December 19, 2024 13:40
…12564)

* Add --modelscope for more models

* minicpm

---------

Co-authored-by: ATMxsp01 <[email protected]>
* Update Dockerfile

* Update Dockerfile

* Update start-vllm-service.sh
…l2 (#12583)

* Add --modelscope option for glm-v4 and MiniCPM-V-2_6

* glm-edge

* minicpm-v-2_6: don't use model_hub=modelscope when using lowbit; internvl2

---------

Co-authored-by: ATMxsp01 <[email protected]>
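The commit above notes that `model_hub=modelscope` should not be passed when loading a low-bit checkpoint. A minimal sketch of that selection rule follows; the helper name `choose_model_hub` and its signature are hypothetical illustrations, not part of the actual codebase:

```python
# Hypothetical helper illustrating the rule from the commit message:
# select ModelScope as the download hub only for a fresh (non low-bit)
# model load; an already-converted low-bit checkpoint is read from a
# local path, so no hub argument is needed.
def choose_model_hub(use_modelscope: bool, load_low_bit: bool):
    """Return the model_hub value to pass, or None to use the default."""
    if use_modelscope and not load_low_bit:
        return "modelscope"
    return None

print(choose_model_hub(True, False))  # → modelscope
print(choose_model_hub(True, True))   # → None
```

The point of the guard is that `--modelscope` only affects where the original weights are downloaded from; once a low-bit copy exists locally, the hub setting is irrelevant.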
* add npu support for baichuan

* Update baichuan_mp.py

* Update baichuan_mp.py
* add compresskv back for mistral

* fix

* fix
* Update open webui doc

* Resolve comments
* fix npu save

* update
* Update baichuan2.py

* style fix