
Remove early stopping from LLaMA end-to-end benchmarking #20033

Conversation

kunal-vaishnavi (Contributor)

Description

This PR removes early stopping from the end-to-end LLaMA-2 benchmark script.

Motivation and Context

This ensures that the models always generate the requested number of new tokens.
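
As a rough illustration (not the PR's actual benchmark code), the sketch below shows a greedy decoding loop that runs for exactly `max_new_tokens` steps with no early-stopping check. The model name, prompt, and timing approach are illustrative assumptions only.

```python
# Minimal sketch of fixed-length generation for benchmarking, assuming a
# Hugging Face causal LM; the PR's script may differ in structure and backend.
import time

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # hypothetical model choice
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).eval()

prompt = "Benchmarking prompt"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
max_new_tokens = 128

start = time.perf_counter()
with torch.no_grad():
    for _ in range(max_new_tokens):
        logits = model(input_ids).logits
        next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        input_ids = torch.cat([input_ids, next_token], dim=-1)
        # No early-stopping check here: even if next_token is EOS, the loop
        # continues until max_new_tokens have been generated, so every run
        # measures the same amount of generation work.
elapsed = time.perf_counter() - start
print(f"Generated {max_new_tokens} tokens in {elapsed:.2f} s")
```

With an EOS-based break in the loop, shorter completions would skew latency and throughput numbers; removing it keeps the measured workload constant across runs.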

@kunal-vaishnavi kunal-vaishnavi merged commit f9cddd2 into microsoft:main Mar 22, 2024
83 of 90 checks passed
YUNQIUGUO pushed a commit that referenced this pull request Mar 25, 2024
TedThemistokleous pushed a commit to TedThemistokleous/onnxruntime that referenced this pull request May 7, 2024