Remove CPU sync before Sampler (#414)
Currently there is a CPU sync before each Sampler call, which causes a host gap:

<img width="226" alt="image" src="https://github.com/user-attachments/assets/4509e69b-0f16-4ac9-812e-a2a9bc43a6ad">

This PR removes that sync, so the host gap is no longer visible:

<img width="133" alt="image" src="https://github.com/user-attachments/assets/66c19e4b-d832-4955-848d-8ae4acd8d264">

NOTE: The `ApplyToppTopkScalar` class still contains some CPU syncs. As a result, the biggest gain will be observed in scenarios that do not use the `top_p` or `top_k` parameters. I think it is worth investigating whether the syncs can be removed from this class as well.
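For intuition on why a sync before the Sampler shows up as a gap, here is a toy timeline model. It is not code from this PR: the function `simulate`, its parameters, and the cost values are all hypothetical, and it assumes a host that enqueues work on an asynchronous device with fixed launch and run times per op.

```python
def simulate(steps, host_launch, device_run, sync_before_sampler):
    """Toy model (hypothetical): host enqueues ops on an asynchronous device.

    Returns (total_time, device_idle_time) in arbitrary time units.
    """
    host_t = 0.0       # time at which the host is free to launch the next op
    device_t = 0.0     # time at which the device finishes all queued work
    device_idle = 0.0  # accumulated time the device spent waiting for work

    for _ in range(steps):
        for op in ("model_forward", "sampler"):
            if op == "sampler" and sync_before_sampler:
                # A CPU sync blocks the host until the device has drained
                # everything queued so far.
                host_t = max(host_t, device_t)
            host_t += host_launch           # host time spent enqueuing the op
            start = max(device_t, host_t)   # op starts once launched and device is free
            device_idle += start - device_t
            device_t = start + device_run
    return device_t, device_idle


with_sync = simulate(steps=3, host_launch=1, device_run=4, sync_before_sampler=True)
without_sync = simulate(steps=3, host_launch=1, device_run=4, sync_before_sampler=False)
print("with sync:   ", with_sync)
print("without sync:", without_sync)
```

In this model, without the sync the host runs ahead and the device only idles during the very first launch. With the sync, the host cannot start launching the Sampler until the device is done, so the device idles for one launch interval on every step.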