Armo-rm env set-up and data processing #35

MaxwellJryao · 2024-09-21T04:42:40Z

Hi,

I plan to reproduce the armo-rm results but haven't found the environment requirements. Are they the same as for the BT model?
I'm currently using the BT model env, but the data processing is quite slow: it is estimated to finish in 13 hours. Any suggestions?

Thanks.

WeiXiongUST · 2024-09-21T23:51:19Z

The environment for the BT reward model should be enough, with the addition of the sklearn package.
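
A quick way to confirm the extra dependency is present in an existing BT environment is sketched below. Note that the module imported as `sklearn` is published on PyPI as `scikit-learn`. The helper name `missing_packages` is hypothetical and not part of the repository.

```python
import importlib.util


def missing_packages(import_names):
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in import_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "sklearn" is the import name; the pip package is "scikit-learn".
    missing = missing_packages(["sklearn"])
    if missing:
        print("Missing:", missing, "- try: pip install scikit-learn")
    else:
        print("sklearn is available")
```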

We need to run inference on the ~600K samples with the BT reward model, so it might take a few hours.
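
As a rough sanity check on the numbers in this thread, the sketch below converts between sample count, throughput, and wall-clock time. It assumes the 13-hour estimate covers the same ~600K samples, which the thread does not state explicitly; the function names are illustrative only.

```python
def implied_throughput(num_samples, hours):
    """Samples per second implied by finishing num_samples in the given hours."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return num_samples / (hours * 3600)


def estimated_hours(num_samples, samples_per_second):
    """Hours needed to process num_samples at a measured throughput."""
    if samples_per_second <= 0:
        raise ValueError("samples_per_second must be positive")
    return num_samples / samples_per_second / 3600


if __name__ == "__main__":
    # Assumption: ~600K samples and a 13-hour ETA refer to the same step.
    rate = implied_throughput(600_000, 13)
    print(f"implied throughput: {rate:.1f} samples/s")
    # Doubling throughput (e.g. with a larger batch size) would halve the time.
    print(f"at 2x throughput: {estimated_hours(600_000, 2 * rate):.1f} hours")
```

Measuring the actual samples per second over the first few minutes and plugging it into `estimated_hours` shows whether the run is closer to "a few hours" or to the 13-hour estimate.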
