[Feature]: Beam search: top_p, min_p and logit processors #10754
Hey @denadai2, thanks for reporting. I think this is a resource-prioritization problem, since in theory you could pipe anything through to the internal SamplingParams created in beam_search (lines 97 to 101 in 4c05edb). Could you please narrow down the list of potential parameters so someone can prioritize the meaningful ones? Contributions are welcome as well!
Can you join the Slack (https://slack.vllm.ai) for collaboration? I want to re-implement beam search in the flavor of parallel sampling, but do not have the bandwidth. See line 1357 in 4c05edb.
Yes, IMO we should deprecate/remove the new beam_search method. We can make the original API work with the "external" implementation. There is no need for a separate BeamSearchParams; most SamplingParams can be used as-is, we just need to adjust those that are themselves used by the beam search logic (like the number of logprobs) ... #9427 (comment)
Done! Is there a middle ground in which we do not have to do a complete refactor?
🚀 The feature, motivation and pitch
Dear vLLM community, we recently deprecated beam search from the core library in favour of a new method called beam_search. However, the new method is far less powerful than before, and it restricts the use cases beam search can serve: for example, controlling the generation (top_p etc.) or doing constrained beam search (e.g. https://huggingface.co/blog/constrained-beam-search).

For this reason we, at Spotify, are staying on 0.6.1, and I am sure many others are doing the same. However, we would like to move to PyTorch 2.5 to fully use our H100s, FSDP2, etc., and we would also like to stay up to date with vLLM.

Could we consider moving these parameters into the new method as well? Thaaaaank you!
ref #6226
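For context, a logits processor of the kind requested here is just a function that rewrites the next-token logits before sampling. Below is a minimal, self-contained sketch (plain Python, no vLLM dependency) of the min_p rule: tokens whose probability falls below min_p times the top token's probability are masked out. This is only an illustration of the filtering logic, not vLLM's actual implementation.

```python
import math

def min_p_filter(logits, min_p=0.1):
    """Mask logits of tokens whose probability is below min_p * max_prob.

    This is an illustrative standalone version of the min_p rule; a real
    engine would apply the same idea per beam on tensors of logits.
    """
    # Softmax over the raw logits (numerically stabilized).
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep tokens with prob >= min_p * p_max; mask the rest to -inf
    # so they can never be sampled or extended by a beam.
    threshold = min_p * max(probs)
    return [l if p >= threshold else float("-inf")
            for l, p in zip(logits, probs)]

# With min_p=0.2 the third token (prob ~0.005) is masked out.
filtered = min_p_filter([2.0, 1.0, -3.0], min_p=0.2)
```

Allowing beam_search to accept a list of such callables (as SamplingParams already does for regular generation) would cover both the sampling controls and the constrained-decoding use case described above.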
Alternatives
Huggingface
Additional context
No response