[Feature]: Support for Seq classification/Reward models #8700
Comments
Contributions are welcome! vLLM already supports embedding models, and I think reward models are quite similar. I don't know what the obstacle would be to running reward models with the vLLM code. We can pretend they are embedding models.
Seems like a lot of the reward models use a different architecture than embedding models.
Here are two examples:
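To make the architectural difference concrete, here is a toy numpy sketch (not vLLM code; all names and shapes are illustrative): an embedding model pools the backbone's hidden states into a vector, while a sequence-classification reward model applies a small linear "score" head (typically `num_labels = 1`) to the last token's hidden state to produce a scalar reward.

```python
import numpy as np

# Toy illustration of the two head types on top of a decoder backbone.
# hidden_states stands in for the backbone output; shapes are made up.
rng = np.random.default_rng(0)
seq_len, hidden_size = 5, 8
hidden_states = rng.standard_normal((seq_len, hidden_size))

# Embedding model: pool hidden states into one vector (mean pooling here;
# real models may use last-token or CLS pooling instead).
embedding = hidden_states.mean(axis=0)      # shape: (hidden_size,)

# Reward model: a linear score head maps the last token's hidden state
# to a single scalar (num_labels = 1).
score_weight = rng.standard_normal((1, hidden_size))
reward = score_weight @ hidden_states[-1]   # shape: (1,)

print(embedding.shape, reward.shape)
```

The point of the sketch is that the backbone is shared; only the final head (pooling vs. a learned linear projection to one logit) differs, which is why reward models can almost, but not quite, be treated as embedding models.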
@youkaichao I added #8740, which tries to convert it to the vLLM format, but I'm running into some tensor shape issues in compute_logits. Let me know if the conversion generally looks right.
Looks like #8896 already implements it.
Hey, I've been working with reward models substantially in the open ecosystem while building RewardBench. In reality, most of the open models have subtly different architectures. The easiest is
🚀 The feature, motivation and pitch
Verifier/reward models are going to be very important moving forward.
Could we add support for sequence classification models like Skywork/Skywork-Reward-Llama-3.1-8B?
Alternatives
No response
Additional context
No response