[Feature] `min_p` sampling parameter #1745

josephrocca · 2024-06-08T13:19:43Z

Motivation

The min_p sampling parameter is becoming quite popular. It's conceptually simple and "makes sense", and (at least anecdotally, according to many model fine-tuners and users in the LocalLlama community) it tends to perform better than the usual top_p + top_k approach. The READMEs of many new fine-tunes/merges on HF recommend using min_p instead of top_p and top_k.

Related resources

vLLM: https://github.com/vllm-project/vllm/blob/8ea5e44a435e8731fd6f5ba4c329dd112752532a/vllm/sampling_params.py#L64C9-L66C57

min_p: Float that represents the minimum probability for a token to be considered, relative to the probability of the most likely token. Must be in [0, 1]. Set to 0 to disable this.

So e.g. a min_p of 0.07 means that if a token's probability is less than 7% of the probability of the most likely token, it is disqualified. A min_p of 0.5 means that a token is disqualified unless its probability is at least half that of the most likely token. Said another way, min_p sets a minimum fraction of the most likely token's probability that a token must reach, else it cannot be sampled.
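To make the rule above concrete, here is a minimal sketch in plain Python. It is not vLLM's or lmdeploy's implementation; the function names `softmax`, `min_p_filter`, and `sample_min_p` are made up for illustration, and it assumes the filter is applied to the softmax probabilities and the survivors are renormalized before sampling.

```python
import math
import random


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def min_p_filter(probs, min_p):
    """Zero out tokens below min_p * max(probs), then renormalize.

    min_p = 0 disables the filter. The most likely token always
    survives, so the renormalization never divides by zero.
    """
    if not 0.0 <= min_p <= 1.0:
        raise ValueError("min_p must be in [0, 1]")
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


def sample_min_p(logits, min_p, rng=random):
    """Sample one token index after applying the min_p filter."""
    probs = min_p_filter(softmax(logits), min_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    probs = [0.5, 0.3, 0.15, 0.04, 0.01]
    # Threshold is 0.07 * 0.5 = 0.035, so only the 0.01 token is dropped.
    print(min_p_filter(probs, 0.07))
    # Threshold is 0.5 * 0.5 = 0.25, so only the 0.5 and 0.3 tokens remain.
    print(min_p_filter(probs, 0.5))
```

Because the threshold scales with the top token's probability, the filter is strict when the model is confident and permissive when the distribution is flat.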

Please see the above links for more info.


lvhan028 · 2024-06-12T04:15:50Z

@irexyc Please put this feature in the work list

josephrocca · 2024-08-31T15:34:02Z

Wondering if there's any chance this could get implemented soon? It's currently supported in vLLM, SGLang, llama.cpp, and text-generation-webui, with increasing usage across the community. It seems to basically be a "free" intelligence boost without hurting creativity.

It basically lets you drop the tokens that come after any sudden/large "cliff" in the probability distribution. To be clear, this isn't a small improvement - it has a non-trivial impact on output quality.

lvhan028 · 2024-09-02T13:44:25Z

Hi, @josephrocca
@irexyc worked on this in PR #1966.
But that PR also involves other features and improvements, making it hard to review.
So @irexyc is splitting it into smaller PRs.
I think min_p will be supported soon.
Stay tuned.

josephrocca · 2024-09-14T23:46:40Z

Thanks! I've tested this, and there are no issues as far as I can tell, so I think this can now be closed. If I come across any issues I'll re-open. To test it via the official docker image, I had to add min_p as an additional field here:

lmdeploy/lmdeploy/serve/openai/api_server.py

Line 610 in e2aa4bd

top_p=request.top_p,

And here:

lmdeploy/lmdeploy/serve/openai/protocol.py

Line 254 in e2aa4bd

seed: Optional[int] = None
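For anyone wanting to do the same, the change has roughly the shape sketched below. This is a stand-in and not lmdeploy's actual code: `ChatRequestSketch` and `build_sampling_kwargs` are hypothetical names, the defaults are assumptions, and only the `top_p=request.top_p` and `seed: Optional[int] = None` lines mirror the snippets referenced above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChatRequestSketch:
    """Hypothetical stand-in for the request model in protocol.py."""
    top_p: float = 1.0          # assumed default, for illustration only
    seed: Optional[int] = None
    min_p: float = 0.0          # the added field; 0 disables min_p


def build_sampling_kwargs(request):
    """Hypothetical stand-in for the spot in api_server.py where request
    fields are forwarded to the generation config."""
    return dict(
        top_p=request.top_p,
        min_p=request.min_p,    # the added pass-through
        seed=request.seed,
    )


if __name__ == "__main__":
    print(build_sampling_kwargs(ChatRequestSketch(min_p=0.05)))
```

The point is just that the field has to be both declared on the request and forwarded to the sampler.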

lvhan028 assigned irexyc Jun 12, 2024

irexyc mentioned this issue Jul 9, 2024

support min_p sampling & do_sample setting #1966

Closed

irexyc mentioned this issue Sep 4, 2024

support min_p sampling parameter #2420

Merged

josephrocca closed this as completed Sep 14, 2024
