Evaluate QRM reward models #195
Conversation
Hey @Nicolinho, which specific arg is causing an issue with it? I was wondering if we can do this in a more general way, or by adding a model config. Also, I'll have some comments on the Skywork dataset soon; it seems like there is some contamination.
@natolambert Both `torch_dtype` and `device_map` caused problems for me.
@Nicolinho have you tried other models too? Just trying to understand the device-map issue on your setup. I do know that handling multi-GPU better would help. Second, if the other code is no longer needed, can you remove it? Third, can you run […]?
To evaluate the model trained with the Skywork dataset, you can run:
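A sketch of what this invocation might look like, assuming reward-bench's `scripts/run_rm.py` entry point and its `--model` and `--trust_remote_code` options:

```bash
# Sketch only: entry point and flags are assumptions, not the verbatim command.
python scripts/run_rm.py \
    --model nicolinho/QRM-Llama3.1-8B \
    --trust_remote_code
```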
To evaluate the model trained without the Skywork dataset, using Llama 3 as the base model, you can run:
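Again a sketch under the same assumptions; the Hugging Face id of this second model is not given here, so a placeholder stands in for it:

```bash
# Sketch only: replace the placeholder with the actual model id.
python scripts/run_rm.py \
    --model <id-of-the-QRM-model-without-Skywork> \
    --trust_remote_code
```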
Thanks @Nicolinho! Looks good, should be able to merge this shortly :)
@Nicolinho do I need the new argument?
@natolambert The argument is needed, as the quantile regression head is trained in fp32. Using bfloat16 degrades the performance somewhat.
Hi, could you please evaluate the QRM reward model: https://huggingface.co/nicolinho/QRM-Llama3.1-8B
I had to add an argument to the script so that no model kwargs are passed to the model_builder, since they otherwise interfere with the datatypes.
You can run the evaluation with the following command:
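A sketch of a possible invocation, under the same `scripts/run_rm.py` assumption; `--no_model_kwargs` is a hypothetical stand-in for the argument added here, whose real name is not shown:

```bash
# Sketch only: --no_model_kwargs is a placeholder for the new argument that
# stops model kwargs from being passed to model_builder, keeping the
# quantile regression head in fp32.
python scripts/run_rm.py \
    --model nicolinho/QRM-Llama3.1-8B \
    --trust_remote_code \
    --no_model_kwargs
```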
Thank you!