
Fixed segfault when prompt was too long #3661

Closed
wants to merge 1 commit

Conversation

@coezbek (Contributor) commented Oct 18, 2023

New fix for #3550, after it turned out that the previous attempt (#3639) caused a performance regression with prompt caching.

Essentially it is an off-by-one error: the bounds check did not take into account that nextToken needs at least one free slot in the context to start generating embeddings.

@z80maniac (Contributor)

To apply this patch I had to rename `last_n_tokens` to `ctx_sampling->prev`.

After applying this patch, I can confirm that the original crash does not happen and there's no performance regression anymore.

Thank you!

ggerganov added a commit that referenced this pull request Oct 20, 2023
@ggerganov (Owner)

Could you run the same tests on #3696? I think I've applied the same patch there, and if it works as expected I'd like to merge that PR and close this one.

@z80maniac (Contributor)

For some reason, the latest commit no longer gives me the crash, even without any patches. I then tried #3696 just in case: no crashes and no performance regressions.

ggerganov added a commit that referenced this pull request Oct 20, 2023
* sampling : refactor init to use llama_sampling_params

* llama : combine repetition, frequency and presence penalties in 1 call

* examples : remove embd-input and gptneox-wip

* sampling : rename penalty params + reduce size of "prev" vector

* sampling : add llama_sampling_print helper

* sampling : hide prev behind API and apply #3661

ggml-ci
@ggerganov (Owner)

Thanks, I've merged the other PR.

@ggerganov closed this Oct 21, 2023