CUDA: Faster Mixtral prompt processing #6962
| Job | Run time |
|---|---|
| | 5m 45s |
| | 1m 38s |
| | 3m 22s |
| | 1m 34s |
| | 1m 48s |
| | 4m 30s |
| | 5m 52s |
| | 3m 28s |
| | 3m 50s |
| | 15m 43s |
| | 2m 8s |
| | 6m 20s |
| | 5m 30s |
| | 1m 15s |
| | 4m 20s |
| | 3m 5s |
| | 5m 10s |
| | 2m 56s |
| | 4m 48s |
| | 3m 11s |
| | 3m 0s |
| | 19m 45s |
| | 9m 7s |
| | 24m 31s |
| | 4m 45s |
| | 2m 47s |
| | 0s |
| Total | 2h 30m 8s |