[perf] improve next token latency when (#threads >= 2 * #heads) by sharding the head into multiple splits #111
Triggered via pull request
November 22, 2023 07:03
Status
Success
Total duration
20m 36s
Artifacts
–
xft_PR.yml
on: pull_request
build_and_simple_test
20m 25s