Citation for OpenRLHF in relation to the XTuner RLHF code and architecture #770
Hi, @hijkzzz

Thank you for your interest and feedback on xtuner-rlhf. We have indeed noticed that the design of open-rlhf accelerates the RLHF process by using separate training and inference engines simultaneously; projects such as atorch and colossalchat adopt similar ideas. Beyond this widely adopted separate-engine design, our core project goals are twofold. First, we aim to support RLHF algorithm innovation and exploration beyond standard PPO by decoupling algorithms from execution in our architecture. Second, we strive to flexibly support various advanced training and inference technologies, including our self-developed training and inference engines. These two aspects are our primary objectives.

Regarding the "refactoring" you mentioned, our open-source version is an iterative evolution of a long-term internally used RLHF framework, which is built on Ray, has been in use since May 2023, and was long based on internevo internally. For the convenience of the open-source community, we first released a version supporting Hugging Face. The project does not directly use code from Open-RLHF, Atorch, or ColossalChat, and we can provide more technical details and explanations as needed. In our analysis and evaluations, we referenced test cases from projects like open-RLHF and deepspeed-chat; these cases were selected to facilitate better public discussion and analysis.

Currently, our PR is still under review. We will publicly disclose more detailed information in our design documents, usage instructions, and README, fully referencing and analyzing the designs of open-RLHF, colossalchat, atorch, and other projects. We will express our gratitude and clearly acknowledge the inspiration and references from all open-source projects in our public documents. Please stay tuned and feel free to discuss any questions with us.

The reference links for the related projects mentioned above are as follows:
I carefully reviewed your MR, and essentially 70%+ of the core code is copied from OpenRLHF and then refactored, including the Reward Model, Critic Model, Ray Actor Group, and vLLM wrapper related code, even the eos_indices code. Some reference links: The Ray module of ColossalChat was also developed by OpenLLMAI members. Thanks
Thank you for your follow-up comments, and we would like to further clarify our position with the following response:
I think you should double-check. There are a lot of copied instances, such as the Ray actor group, and the same code appears elsewhere, only refactored. Overall, XTuner RLHF is basically a refactor based on OpenRLHF.
Such large-scale copying and refactoring without citing the copyright of OpenRLHF goes against the spirit of open source and the Apache license. Especially since it was previously stated that no XTuner RLHF code came from OpenRLHF and that the framework has been used internally since 2023. This is lying.
Hi, @hijkzzz

I am the maintainer of XTuner. Thanks a lot for your feedback on #736 and #764. I have carefully compared #736 with OpenRLHF's code today, and recognize that there are indeed many highly similar parts, yet the source project was not acknowledged. First, I would like to offer my sincere apologies to you, as this PR has brought trouble to you and the developers of OpenRLHF.

XTuner aims to provide an efficient and easy-to-use toolkit for LLM finetuning, and welcomes contributions from internal teams and the community. Some first-time contributors to open-source projects did not follow the license and made this mistake. I have discussed the issue with the developers of this PR, and we agreed to suspend its development and start a strict internal review. PRs that are not compliant with the Apache 2.0 license will definitely not be merged.

XTuner benefits from excellent open-source projects like DeepSpeed and OpenRLHF, and we will also make our efforts to create a healthier open-source ecosystem.
Hi, XTuner Team
Could you please add a citation for the source of the Ray+vLLM-based RLHF architecture - OpenRLHF, such as in the README.md file: https://github.com/InternLM/xtuner?tab=readme-ov-file#%EF%B8%8F-acknowledgement.
We noticed that most RLHF-related code, particularly the Ray RLHF architecture in XTuner, is refactored from OpenRLHF. According to the Apache License 2.0 of OpenRLHF, the original copyright statement must be included.
An example:
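For illustration only, a minimal sketch of what such an acknowledgement entry could look like in XTuner's README, assuming it follows the existing Acknowledgement section format; the exact wording, section title, and repository link below are assumptions, not text requested verbatim in this thread:

```markdown
## 🎖️ Acknowledgement

<!-- Illustrative entry only; the upstream repository link and wording should
     be confirmed against OpenRLHF's own README and LICENSE. -->
- [OpenRLHF](https://github.com/OpenLLMAI/OpenRLHF): the Ray + vLLM based RLHF
  training architecture in XTuner's RLHF components is adapted from OpenRLHF,
  which is distributed under the Apache License 2.0. The original copyright
  and license notices are retained in the adapted source files.
```

In addition, the Apache License 2.0 requires that copied or derived source files retain the upstream copyright and attribution notices and carry prominent notices of any changes made, so a per-file header crediting OpenRLHF (with the copyright holder name taken from the upstream files) would also be appropriate.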
Related MR:
#736
#764
Thank you