[DO NOT MERGE] Upstream codebase diff #470

Draft · wants to merge 634 commits into base: main
Conversation

@kzawora-intel commented Nov 6, 2024

Scope of changes:

  • Contiguous PA
  • Multi-step scheduling
  • Automatic prefix caching
  • Padding-aware scheduling/max_num_prefill_seqs
  • Guided decoding fixes
  • FP8 support (INC/w8a8/weights_load_device)
  • ApplyToppTopkScalar sampler optimization
  • LoRA/MultiLoRA support
  • FusedMoE support
  • Model changes (adding mark_steps)
  • Tests
  • FakeHPU mode
  • CI stuff (.jenkins, .github)
  • Lots of minor stuff (RNG, FSDPA flag, reduced block fragmentation)

kzawora-intel and others added 30 commits October 28, 2024 15:52
To repro:

start server:
`VLLM_SKIP_WARMUP=true python -m vllm.entrypoints.openai.api_server`

send a request (this works fine):
```
 curl -v http://localhost:8000/v1/completions   -H "Content-Type: application/json"   -d '{"model": "facebook/opt-125m","prompt": "The future of AI is ","max_tokens": 100,"temperature": 0}'
```

If the request has a seed, it fails:
```
curl -v http://localhost:8000/v1/completions   -H "Content-Type: application/json"   -d '{"model": "facebook/opt-125m","prompt": "The future of AI is ","max_tokens": 100,"temperature": 0, "seed" : 37}'
```

Failure happens here:

[vllm-fork/vllm/model_executor/sampling_metadata.py at habana_main · HabanaAI/vllm-fork](https://github.com/HabanaAI/vllm-fork/blob/habana_main/vllm/model_executor/sampling_metadata.py#L220)

```
if sampling_params.seed is not None:
    seq_group_metadata.state.generator = torch.Generator(
        device=device).manual_seed(sampling_params.seed)
```

`RuntimeError: Device type HPU is not supported for torch.Generator() api.`

This PR fixes the above issue by using htrandom: [Intel Gaudi PyTorch Python API (habana_frameworks.torch) — Gaudi Documentation 1.17.1 documentation](https://docs.habana.ai/en/latest/PyTorch/Reference/Python_Packages.html?highlight=htrandom#random-number-generator-apis)
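For illustration, a minimal sketch of the general idea: guard the generator creation so the HPU path never calls `torch.Generator(device="hpu")`. The actual patch wires in the htrandom APIs linked above rather than this CPU fallback.

```
import torch

def make_seeded_generator(device: str, seed: int) -> torch.Generator:
    # Sketch only: HPU does not support torch.Generator(device=...), so fall
    # back to a seeded CPU generator (the real fix uses htrandom instead).
    if str(device).startswith("hpu"):
        return torch.Generator().manual_seed(seed)
    return torch.Generator(device=device).manual_seed(seed)
```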
Fix one_hot bug in torch compile mode
```
>           block_mapping = torch.nn.functional.one_hot(metadata.block_mapping,
                                                        num_classes=batch_size)
E           RuntimeError: Class values must be non-negative.

../../vllm/worker/hpu_model_runner.py:311: RuntimeError
```
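A hedged sketch of one way to handle such negative (padding) indices before `one_hot`; the actual fix in this commit may differ.

```
import torch

def safe_one_hot(block_mapping: torch.Tensor, num_classes: int) -> torch.Tensor:
    valid = block_mapping >= 0
    # Clamp padded (negative) entries to 0 so one_hot accepts them...
    clamped = torch.where(valid, block_mapping, torch.zeros_like(block_mapping))
    one_hot = torch.nn.functional.one_hot(clamped, num_classes=num_classes)
    # ...then zero out the rows that were padding.
    return one_hot * valid.unsqueeze(-1)
```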
Due to the high dynamicity of logits processing, it is better to offload it completely to the CPU instead of computing it on the HPU.
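A hedged sketch of that offload pattern (not this PR's code), assuming the usual vLLM logits-processor convention of `processor(token_ids, logits) -> logits`:

```
from typing import Callable, List
import torch

LogitsProcessor = Callable[[List[int], torch.Tensor], torch.Tensor]

def apply_processors_on_cpu(logits: torch.Tensor,
                            processors: List[LogitsProcessor],
                            output_token_ids: List[int]) -> torch.Tensor:
    cpu_logits = logits.to("cpu")            # single device-to-host copy
    for proc in processors:
        # Dynamic, per-request control flow stays on the CPU...
        cpu_logits = proc(output_token_ids, cpu_logits)
    return cpu_logits.to(logits.device)      # ...and the result is copied back once
```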
This PR supports the unit test test_layers with a LoraMask-based approach.
This PR enables automatic prefix caching on Intel Gaudi HPUs.
Please refer to this
[RFC](vllm-project#2614) for detailed
information about prefix caching.
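As a hedged usage example, assuming the standard `enable_prefix_caching` engine argument is honored on HPU (an assumption, not confirmed by this PR; the model name is just the one from the repro above):

```
from vllm import LLM, SamplingParams

# Hypothetical offline-API usage enabling automatic prefix caching.
llm = LLM(model="facebook/opt-125m", enable_prefix_caching=True)
outputs = llm.generate(["The future of AI is "], SamplingParams(max_tokens=32))
```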
Implementation of multi-step scheduling. To use the feature, pass
--num_scheduler_steps=[n] as a server parameter. In my tests, best
results were achieved with n==64, but this will vary depending on the
model.

---------

Co-authored-by: Karol Damaszke <[email protected]>
Co-authored-by: jmaksymczuk <[email protected]>
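A hedged offline-API equivalent of the --num_scheduler_steps server flag described above, assuming the engine argument of the same name is forwarded through `vllm.LLM` (an assumption, not confirmed by this PR):

```
from vllm import LLM, SamplingParams

# Hypothetical offline usage with multi-step scheduling set to 64 steps.
llm = LLM(model="facebook/opt-125m", num_scheduler_steps=64)
outputs = llm.generate(["The future of AI is "], SamplingParams(max_tokens=64))
```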
This removes the need to pass the VLLM_PROMPT_USE_FUSEDSDPA environment
variable in order to enable FusedSDPA attention. Fallback attention can
still be used if VLLM_PROMPT_USE_FUSEDSDPA=0 is provided.
Contiguous cache fetching to avoid the costly gather operation on
Gaudi3. Requires changes in vllm-hpu-extension
(HabanaAI/vllm-hpu-extension#17) to work.

It introduces redundant calculations in the decoding phase, but improves
the performance of all tested workloads over the entire benchmark
(5-12%) on Gaudi3. PR #426 further improves the performance of this
feature (9-22%). It is only compatible with the v2 block manager and
negatively impacts performance on Gaudi2.

Use the VLLM_CONTIGUOUS_PA=true environment variable to enable it.
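A minimal usage note: the flag is read from the environment, so set it before the engine is created (exporting it in the shell works just as well).

```
import os

# Enable contiguous PA before constructing the vLLM engine (Gaudi3 only).
os.environ["VLLM_CONTIGUOUS_PA"] = "true"
```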
This change fixes the performance issue I introduced in PR
#414: due to the usage of `torch.where`, both functions were being
called. Now only the selected one runs.
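A hedged illustration of that pitfall (not this PR's code): `torch.where` evaluates both branch tensors eagerly, so dispatching on a host-side flag avoids the wasted work.

```
import torch

def branch_a(x: torch.Tensor) -> torch.Tensor:
    return x * 2

def branch_b(x: torch.Tensor) -> torch.Tensor:
    return x + 1

def slow(x: torch.Tensor, use_a: bool) -> torch.Tensor:
    # Both branch_a(x) and branch_b(x) are computed; only one result is kept.
    return torch.where(torch.tensor(use_a), branch_a(x), branch_b(x))

def fast(x: torch.Tensor, use_a: bool) -> torch.Tensor:
    # Dispatch in Python so only the selected function runs.
    return branch_a(x) if use_a else branch_b(x)
```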
Change `NaiveBlockAllocator` to use a priority queue so that we always
allocate the lowest block id first.

This further increases the performance of contiguous paged attention.

- [ ] Add an option or env variable to enable/disable this behavior.
(Not sure if this is necessary)

---------

Co-authored-by: Yang Wang <[email protected]>
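A hedged standalone sketch of the lowest-id-first behavior described in the `NaiveBlockAllocator` change above; the real allocator is more involved, this only shows the priority-queue idea.

```
import heapq

class MinIdBlockPool:
    """Free-block pool that always hands out the lowest free block id."""

    def __init__(self, num_blocks: int) -> None:
        self._free = list(range(num_blocks))
        heapq.heapify(self._free)            # min-heap keyed on block id

    def allocate(self) -> int:
        return heapq.heappop(self._free)     # lowest free id first

    def free(self, block_id: int) -> None:
        heapq.heappush(self._free, block_id)
```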
Adds calculation of the OpenSSF Scorecard. Note: the badge (visible on the repo main page) will be disabled for now.
The max_num_prefill_seqs parameter is used only when
use_padding_aware_scheduling is True.

The default value of use_padding_aware_scheduling is False, so
max_num_prefill_seqs shouldn't be required every time
SchedulerConfig is initialized.

Dozens of tests in tests/core are failing due to this parameter issue.
This PR implements tensor parallelism for multi-step scheduling.
0.20.2 had some changes that break the lm_eval API
Signed-off-by: Jee Jee Li <[email protected]>
Signed-off-by: B-201 <[email protected]>
Co-authored-by: B-201 <[email protected]>
Signed-off-by: Max de Bayser <[email protected]>
Signed-off-by: Max de Bayser <[email protected]>
Signed-off-by: Joe Runde <[email protected]>
Signed-off-by: Russell Bryant <[email protected]>
Signed-off-by: Thomas Parnell <[email protected]>
Signed-off-by: Russell Bryant <[email protected]>
Signed-off-by: Varad Ahirwadkar <[email protected]>
Signed-off-by: Wallas Santos <[email protected]>
Signed-off-by: Travis Johnson <[email protected]>
Signed-off-by: Rafael Vasquez <[email protected]>
Signed-off-by: Yuan Zhou <[email protected]>
Signed-off-by: luka <[email protected]>
Signed-off-by: Alex-Brooks <[email protected]>
Signed-off-by: youkaichao <[email protected]>
Signed-off-by: Tyler Michael Smith <[email protected]>
Signed-off-by: mgoin <[email protected]>
Signed-off-by: Vinay Damodaran <[email protected]>
Signed-off-by: Woosuk Kwon <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]>
Signed-off-by: Harry Mellor <[email protected]>
Signed-off-by: charlifu <[email protected]>
Signed-off-by: Sam Stoelinga <[email protected]>
Signed-off-by: Vasily Alexeev <[email protected]>
Signed-off-by: Kevin-Yang <[email protected]>
Signed-off-by: Abatom <[email protected]>
Signed-off-by: Bill Nell <[email protected]>
Signed-off-by: wangshuai09 <[email protected]>
Signed-off-by: Qishuai <[email protected]>
Signed-off-by: yuze.zyz <[email protected]>
Signed-off-by: Yannick Schnider <[email protected]>
Signed-off-by: Kunjan Patel <[email protected]>
Signed-off-by: simon-mo <[email protected]>
Signed-off-by: kevin <[email protected]>
Signed-off-by: YiSheng5 <[email protected]>
Signed-off-by: yan ma <[email protected]>
Signed-off-by: Went-Liang <[email protected]>
Signed-off-by: Roger Wang <[email protected]>
Signed-off-by: sasha0552 <[email protected]>
Signed-off-by: mzusman <[email protected]>
Signed-off-by: Prashant Gupta <[email protected]>
Signed-off-by: André Jonasson <[email protected]>
Signed-off-by: Gene Su <[email protected]>
Signed-off-by: dependabot[bot] <[email protected]>
Signed-off-by: Peter Salas <[email protected]>
Signed-off-by: Nick Hill <[email protected]>
Signed-off-by: Nick Hill <[email protected]>
Signed-off-by: Michael Green <[email protected]>
Signed-off-by: Shanshan Wang <[email protected]>
Signed-off-by: Gregory Shtrasberg <[email protected]>
Signed-off-by: daitran2k1 <[email protected]>
Signed-off-by: MengqingCao <[email protected]>
Signed-off-by: chaunceyjiang <[email protected]>
Signed-off-by: Robert Shaw <[email protected]>
Signed-off-by: Hissu Hyvarinen <[email protected]>
Signed-off-by: [email protected] <[email protected]>
Signed-off-by: Linkun Chen <[email protected]>
Signed-off-by: Tomer Asida <[email protected]>
Signed-off-by: DarkLight1337 <[email protected]>
Co-authored-by: sasha0552 <[email protected]>
Co-authored-by: Woosuk Kwon <[email protected]>
Co-authored-by: Li, Jiang <[email protected]>
Co-authored-by: Kuntai Du <[email protected]>
Co-authored-by: Daniele <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>
Co-authored-by: Luka Govedič <[email protected]>
Co-authored-by: bnellnm <[email protected]>
Co-authored-by: Kai Wu <[email protected]>
Co-authored-by: Isotr0py <[email protected]>
Co-authored-by: Shashwat Srijan <[email protected]>
Co-authored-by: Robert Shaw <[email protected]>
Co-authored-by: Andrew Feldman <[email protected]>
Co-authored-by: afeldman-nm <[email protected]>
Co-authored-by: laishzh <[email protected]>
Co-authored-by: Max de Bayser <[email protected]>
Co-authored-by: Max de Bayser <[email protected]>
Co-authored-by: Dipika Sikka <[email protected]>
Co-authored-by: Joe Runde <[email protected]>
Co-authored-by: Haoyu Wang <[email protected]>
Co-authored-by: Russell Bryant <[email protected]>
Co-authored-by: Nick Hill <[email protected]>
Co-authored-by: tomeras91 <[email protected]>
Co-authored-by: Tyler Michael Smith <[email protected]>
Co-authored-by: Michael Goin <[email protected]>
Co-authored-by: Kunjan <[email protected]>
Co-authored-by: Kunjan Patel <kunjanp_google_com@vllm.us-central1-a.c.kunjanp-gke-dev-2.internal>
Co-authored-by: Cody Yu <[email protected]>
Co-authored-by: Thomas Parnell <[email protected]>
Co-authored-by: Chih-Chieh Yang <[email protected]>
Co-authored-by: Yue Zhang <[email protected]>
Co-authored-by: Chen Zhang <[email protected]>
Co-authored-by: Andy Dai <[email protected]>
Co-authored-by: Dhia Eddine Rhaiem <[email protected]>
Co-authored-by: yudian0504 <[email protected]>
Co-authored-by: Varad Ahirwadkar <[email protected]>
Co-authored-by: youkaichao <[email protected]>
Co-authored-by: Baoyuan Qi <[email protected]>
Co-authored-by: Wallas Henrique <[email protected]>
Co-authored-by: Travis Johnson <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>
Co-authored-by: ngrozae <[email protected]>
Co-authored-by: Falko1 <[email protected]>
Co-authored-by: Rafael Vasquez <[email protected]>
Co-authored-by: chenqianfzh <[email protected]>
Co-authored-by: wangshuai09 <[email protected]>
Co-authored-by: Jee Jee Li <[email protected]>
Co-authored-by: xendo <[email protected]>
Co-authored-by: Jerzy Zagorski <[email protected]>
Co-authored-by: gopalsarda <[email protected]>
Co-authored-by: Yuan <[email protected]>
Co-authored-by: Gubrud, Aaron D <[email protected]>
Co-authored-by: adgubrud <[email protected]>
Co-authored-by: Yuhong Guo <[email protected]>
Co-authored-by: Yuhong Guo <[email protected]>
Co-authored-by: Ronen Schaffer <[email protected]>
Co-authored-by: Aurick Qiao <[email protected]>
Co-authored-by: Jeremy Arnold <[email protected]>
Co-authored-by: Lucas Wilkinson <[email protected]>
Co-authored-by: yulei <[email protected]>
Co-authored-by: Seth Kimmel <[email protected]>
Co-authored-by: Kaunil Dhruv <[email protected]>
Co-authored-by: Flex Wang <[email protected]>
Co-authored-by: Mengqing Cao <[email protected]>
Co-authored-by: Alex Brooks <[email protected]>
Co-authored-by: Yongzao <[email protected]>
Co-authored-by: Yunfei Chu <[email protected]>
Co-authored-by: Vinay R Damodaran <[email protected]>
Co-authored-by: Yan Ma <[email protected]>
Co-authored-by: Zhuohan Li <[email protected]>
Co-authored-by: litianjian <[email protected]>
Co-authored-by: Harry Mellor <[email protected]>
Co-authored-by: Charlie Fu <[email protected]>
Co-authored-by: Kevin H. Luu <[email protected]>
Co-authored-by: Will Johnson <[email protected]>
Co-authored-by: pavlo-ruban <[email protected]>
Co-authored-by: Sam Stoelinga <[email protected]>
Co-authored-by: ErkinSagiroglu <[email protected]>
Co-authored-by: Vasiliy Alekseev <[email protected]>
Co-authored-by: kakao-kevin-us <[email protected]>
Co-authored-by: Kevin-Yang <[email protected]>
Co-authored-by: 科英 <[email protected]>
Co-authored-by: madt2709 <[email protected]>
Co-authored-by: litianjian <[email protected]>
Co-authored-by: Zhong Qishuai <[email protected]>
Co-authored-by: tastelikefeet <[email protected]>
Co-authored-by: Sven Seeberg <[email protected]>
Co-authored-by: yannicks1 <[email protected]>
Co-authored-by: Junichi Sato <[email protected]>
Co-authored-by: Kunjan <[email protected]>
Co-authored-by: Will Eaton <[email protected]>
Co-authored-by: Simon Mo <[email protected]>
Co-authored-by: Lily Liu <[email protected]>
Co-authored-by: YiSheng5 <[email protected]>
Co-authored-by: Went-Liang <[email protected]>
Co-authored-by: Elfie Guo <[email protected]>
Co-authored-by: Harsha vardhan manoj Bikki <[email protected]>
Co-authored-by: Guillaume Calmettes <[email protected]>
Co-authored-by: Roger Wang <[email protected]>
Co-authored-by: Alexei-V-Ivanov-AMD <[email protected]>
Co-authored-by: Mor Zusman <[email protected]>
Co-authored-by: Prashant Gupta <[email protected]>
Co-authored-by: Patrick von Platen <[email protected]>
Co-authored-by: André Jonasson <[email protected]>
Co-authored-by: Pavani Majety <[email protected]>
Co-authored-by: Gene Der Su <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Peter Salas <[email protected]>
Co-authored-by: sroy745 <[email protected]>
Co-authored-by: Michael Green <[email protected]>
Co-authored-by: Nick Hill <[email protected]>
Co-authored-by: Nikita Furin <[email protected]>
Co-authored-by: shanshan wang <[email protected]>
Co-authored-by: Roger Wang <[email protected]>
Co-authored-by: Gregory Shtrasberg <[email protected]>
Co-authored-by: Yang Zheng <[email protected]>
Co-authored-by: Yang Zheng(SW)(Alex) <[email protected]>
Co-authored-by: Tran Quang Dai <[email protected]>
Co-authored-by: Chauncey <[email protected]>
Co-authored-by: hissu-hyvarinen <[email protected]>
Co-authored-by: lkchen <[email protected]>
Co-authored-by: Linkun Chen <[email protected]>
Co-authored-by: Linkun Chen <[email protected]>
Co-authored-by: Gene Der Su <[email protected]>
imkero and others added 26 commits November 17, 2024 02:10
Set vllm-hpu-extension to 2542c18
This fixes a bug introduced by the last spec_decode PR's formatting commit.
Fix here
This PR introduces async copying into _prepare_prompt and
_prepare_decode, which makes copying faster.
It also moves the precompute_indices_and_offsets function into forward to
avoid unnecessary H2D copying.
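A hedged sketch of the async host-to-device copy pattern referenced above; the actual _prepare_prompt/_prepare_decode changes may differ in detail.

```
import torch

def copy_to_device_async(host_tensor: torch.Tensor, device: str = "hpu") -> torch.Tensor:
    # non_blocking=True lets the H2D copy overlap with subsequent host-side
    # work; the copy is only truly asynchronous from pinned host memory.
    return host_tensor.to(device, non_blocking=True)
```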
```
@@ -0,0 +1,45 @@
name: codespell
```

Check failure: Code scanning / Scorecard, Token-Permissions (High)

score is 0: no top-level permission defined. Remediation: restrict the workflow's GITHUB_TOKEN by declaring a top-level `permissions:` block (see https://app.stepsecurity.io/secureworkflow, or https://app.stepsecurity.io/securerepo to resolve multiple issues at once).
```
def test_stateless_process_group(worker):
    port1 = get_open_port()
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("", port1))
```

Check warning: Code scanning / CodeQL, Binding a socket to all network interfaces (Medium, test)

'' binds a socket to all interfaces.

Copilot Autofix suggestion: to fix the problem, bind the socket to a specific interface instead of all interfaces by replacing the empty string ('') with a specific IP address, such as 127.0.0.1, which binds the socket to the localhost interface. This change ensures that the socket only accepts connections from the local machine, mitigating the security risk.

Suggested changeset for tests/distributed/test_utils.py (apply locally with `git apply`):

```
diff --git a/tests/distributed/test_utils.py b/tests/distributed/test_utils.py
--- a/tests/distributed/test_utils.py
+++ b/tests/distributed/test_utils.py
@@ -126,3 +126,3 @@
     with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
-        s.bind(("", port1))
+        s.bind(("127.0.0.1", port1))
         port2 = get_open_port()
```
xuechendi and others added 2 commits November 18, 2024 16:06
…rror] (#502)

Fix argument incompatibility issue for FP8

```
ERROR 11-11 04:29:13 engine.py:143]   File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1556, in _wrapped_call_impl
ERROR 11-11 04:29:13 engine.py:143]     return self._call_impl(*args, **kwargs)
ERROR 11-11 04:29:13 engine.py:143]   File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1606, in _call_impl
ERROR 11-11 04:29:13 engine.py:143]     result = forward_call(*args, **kwargs)
ERROR 11-11 04:29:13 engine.py:143] TypeError: PatchedVLLMKVCache.forward() missing 2 required positional arguments: 'block_indices' and 'block_offset'
```

FIX #453
https://github.com/HabanaAI/vllm-fork/blob/habana_main/README_GAUDI.md#troubleshooting-tweaking-hpu-graphs
Labels: habana (Issues or PRs submitted by Habana Labs)