
Increasing the token-length based on available memory for GPT models #2280

Conversation

RezaYazdaniAminabadi (Contributor)

No description provided.
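No description was provided with the PR, so as context for the title, here is a rough, purely illustrative sketch of the general idea of deriving a token budget from free GPU memory. The formula, function name, and constants below are assumptions, not the PR's actual implementation:

```python
import torch

def estimate_max_tokens(num_layers, hidden_size, dtype_bytes=2, batch_size=1, safety=0.8):
    """Hypothetical helper (not from this PR): estimate how many tokens of
    KV cache fit in the GPU memory that is currently free."""
    free_bytes, _total_bytes = torch.cuda.mem_get_info()
    # K and V each store a [batch, seq_len, hidden_size] tensor per transformer layer.
    bytes_per_token = 2 * num_layers * hidden_size * dtype_bytes * batch_size
    return int(free_bytes * safety) // bytes_per_token

# Example: a GPT-style model with 24 layers and hidden size 2048 in fp16.
print(estimate_max_tokens(num_layers=24, hidden_size=2048))
```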

@mayank31398 (Contributor)

Running the new PR with `queries = ["cat " * 2000] * 4`:
for max_new_tokens = 10, generated_tokens = [10, 10, 10, 10]
for max_new_tokens = 100, generated_tokens = [100, 100, 100, 99]
for max_new_tokens = 300, generated_tokens = [299, 300, 299, 297]
for max_new_tokens = 1500, generated_tokens = [1500, 1276, 1500, 369]
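For anyone trying to reproduce numbers like the ones above, a minimal sketch of this kind of harness, assuming a Hugging Face causal LM wrapped with `deepspeed.init_inference`; the model name, dtype, and padding setup are assumptions rather than details taken from this PR:

```python
# Repro sketch (assumptions: HF transformers + DeepSpeed kernel injection;
# the model name below is a placeholder, not the model used in this PR).
import deepspeed
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; long prompts need a model whose context window allows them
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token      # GPT-style tokenizers have no pad token
tokenizer.padding_side = "left"                # left-pad for decoder-only generation
model = AutoModelForCausalLM.from_pretrained(model_name)

# Inject DeepSpeed's fused inference kernels and move the model to GPU.
ds_engine = deepspeed.init_inference(model, dtype=torch.float16, replace_with_kernel_inject=True)
model = ds_engine.module

queries = ["cat " * 2000] * 4  # long prompts, as in the comment above

for max_new_tokens in (10, 100, 300, 1500):
    inputs = tokenizer(queries, return_tensors="pt", padding=True).to("cuda")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens,
                             pad_token_id=tokenizer.eos_token_id)
    # Count tokens actually produced after the prompt, ignoring padding.
    new_tokens = outputs[:, inputs["input_ids"].shape[1]:]
    generated = [(row != tokenizer.pad_token_id).sum().item() for row in new_tokens]
    print(f"max_new_tokens = {max_new_tokens}, generated_tokens = {generated}")
```

With this kind of setup the per-sequence counts should all equal max_new_tokens unless a sequence stops early or runs out of room, which is what the drop to 1276 and 369 above appears to show.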

@mayank31398 (Contributor)

Hi, any update on the above @RezaYazdaniAminabadi ^^?
Were you able to find the error?

@RezaYazdaniAminabadi (Contributor, Author)

Hi @mayank31398,

I'm looking into it right now; let me first merge this into another PR. I will let you know.
Thanks,
Reza

RezaYazdaniAminabadi changed the base branch from master to cholmes/fix-long-seq-len-inference on September 12, 2022.
RezaYazdaniAminabadi merged commit d8f5203 into cholmes/fix-long-seq-len-inference on September 22, 2022.