Prefix caching. #1541
causal_lm_cpp.yml
on: pull_request
cpp-multinomial-greedy_causal_lm-ubuntu    8m 16s
cpp-greedy_causal_lm-windows               17m 9s
cpp-beam_search_causal_lm-Qwen-7B-Chat     9m 58s
cpp-beam_search_causal_lm-Qwen1_5-7B-Chat  7m 19s
cpp-beam_search_causal_lm-Phi-2            6m 12s
cpp-beam_search_causal_lm-notus-7b-v1      7m 33s
cpp-speculative_decoding_lm-ubuntu         10m 2s
cpp-prompt_lookup_decoding_lm-ubuntu       5m 39s
cpp-Phi-1_5                                8m 0s
cpp-greedy_causal_lm-redpajama-3b-chat     12m 14s
cpp-chat_sample-ubuntu                     9m 59s
cpp-continuous-batching-ubuntu             12m 7s
cpp-continuous-batching-windows            20m 9s
cpp-continuous-batching-macos              16m 36s
Matrix: cpp-beam_search_causal_lm-ubuntu
Annotations: 16 warnings