Update dockerfile for build test image #11838

gc-fu · 2024-08-19T02:12:12Z

Description

Update the Dockerfile to build the vllm-0.5.4 Docker image.

* Enable single card sync engine
* enable ipex-llm optimizations for vllm
* enable optimizations for lm_head
* Fix chatglm multi-reference problem
* Remove duplicate layer
* LLM: Update vLLM to v0.5.4 (#11746)
* Enable single card sync engine
* enable ipex-llm optimizations for vllm
* enable optimizations for lm_head
* Fix chatglm multi-reference problem
* update 0.5.4 api_server
* add dockerfile
* fix
* fix
* refine
* fix

---------

Co-authored-by: gc-fu <[email protected]>

* Add vllm-0.5.4 Dockerfile (#11838)
* Update BIGDL_LLM_SDP_IGNORE_MASK in start-vllm-service.sh (#11957)
* Fix vLLM not convert issues (#11817) (#11918)
* Fix not convert issues
* refine

Co-authored-by: Guancheng Fu <[email protected]>

* Fix glm4-9b-chat nan error on vllm 0.5.4 (#11969)
* init
* update mlp forward
* fix minicpm error in vllm 0.5.4
* fix dependabot alerts (#12008)
* Update 0.5.4 dockerfile (#12021)
* Add vllm awq loading logic (#11987)
* [ADD] Add vllm awq loading logic
* [FIX] fix the module.linear_method path
* [FIX] fix quant_config path error
* Enable Qwen padding mlp to 256 to support batch_forward (#12030)
* Enable padding mlp
* padding to 256
* update style
* Install 27191 runtime in 0.5.4 docker image (#12040)
* fix rebase error
* fix rebase error
* vLLM: format for 0.5.4 rebase (#12043)
* format
* Update model_convert.py
* Fix serving docker related modifications (#12046)
* Fix undesired modifications (#12048)
* fix
* Refine offline_inference arguments

---------

Co-authored-by: Xiangyu Tian <[email protected]>
Co-authored-by: Jun Wang <[email protected]>
Co-authored-by: Wang, Jian4 <[email protected]>
Co-authored-by: liu-shaojun <[email protected]>
Co-authored-by: Shaojun Liu <[email protected]>
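
For readers unfamiliar with the "padding mlp to 256" item (#12030), here is a minimal sketch of what such padding could look like. It is not the code from that change: it assumes the item means rounding the MLP's intermediate size up to the next multiple of 256 and filling the extra rows with zeros. The helper names `round_up` and `pad_rows` and the sizes used are hypothetical, chosen only for illustration.

```python
def round_up(n: int, multiple: int = 256) -> int:
    """Return the smallest multiple of `multiple` that is >= n."""
    return ((n + multiple - 1) // multiple) * multiple


def pad_rows(weight: list[list[float]], multiple: int = 256) -> list[list[float]]:
    """Append all-zero rows until the row count is a multiple of `multiple`.

    `weight` stands in for a projection matrix stored as rows; the real
    change presumably operates on tensors, which is an assumption here.
    """
    cols = len(weight[0]) if weight else 0
    target = round_up(len(weight), multiple)
    return weight + [[0.0] * cols for _ in range(target - len(weight))]


# Hypothetical intermediate size of 1000, padded up to a multiple of 256.
padded_size = round_up(1000)
```

Zero rows contribute nothing to the projection output, which is why zero-filling is the usual choice when a dimension is padded purely for alignment.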

Add dockerfile

c910b42

gc-fu merged commit aaac1cf into intel-analytics:ipex-vllm-mainline Aug 19, 2024

gc-fu added a commit to gc-fu/BigDL that referenced this pull request Aug 19, 2024

Add vllm-0.5.4 Dockerfile (intel-analytics#11838)

d34f9a8

gc-fu added a commit that referenced this pull request Sep 10, 2024

Add vllm-0.5.4 Dockerfile (#11838)

fd502f2
