CUDA: Faster Mixtral prompt processing #6962

Job Run time
5m 45s
1m 38s
3m 22s
1m 34s
1m 48s
4m 30s
5m 52s
3m 28s
3m 50s
15m 43s
2m 8s
6m 20s
5m 30s
1m 15s
4m 20s
3m 5s
5m 10s
2m 56s
4m 48s
3m 11s
3m 0s
19m 45s
9m 7s
24m 31s
4m 45s
2m 47s
0s
Total: 2h 30m 8s