Dynamic KV cache allocation #6099
Triggered via pull request
December 24, 2024 13:30
Status
Success
Total duration
35m 38s
Artifacts
–
causal_lm_cpp.yml
on: pull_request
Matrix: cpp-beam_search_causal_lm-ubuntu
cpp-multinomial-greedy_causal_lm-ubuntu
14m 10s
cpp-greedy_causal_lm-windows
34m 44s
cpp-greedy_causal_lm-Qwen-7B-Chat
10m 30s
cpp-beam_search_causal_lm-Qwen1_5-7B-Chat
32m 45s
cpp-beam_search_causal_lm-Phi-2
16m 35s
cpp-beam_search_causal_lm-notus-7b-v1
30m 57s
cpp-speculative_decoding_lm-ubuntu
14m 19s
cpp-prompt_lookup_decoding_lm-ubuntu
8m 49s
cpp-Phi-1_5
8m 20s
cpp-greedy_causal_lm-redpajama-3b-chat
11m 6s
cpp-chat_sample-ubuntu
15m 23s
visual_language_chat_sample-ubuntu-minicpm_v2_6
7m 29s
visual_language_chat_sample-ubuntu-llava_1_5 / visual_language_chat_sample-ubuntu-llava
30m 55s
visual_language_chat_sample-ubuntu-llava_next / visual_language_chat_sample-ubuntu-llava
33m 32s
visual_language_chat_sample-ubuntu-internvl2
24m 38s
cpp-continuous-batching-ubuntu
15m 31s
cpp-continuous-batching-windows
23m 36s
cpp-continuous-batching-macos
21m 25s
ci/gha_overall_status_causal_lm
0s
Annotations
1 warning
ci/gha_overall_status_causal_lm
ubuntu-latest pipelines will use ubuntu-24.04 soon. For more details, see https://github.com/actions/runner-images/issues/10636