Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples

Official code for "Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples". Also check our [Project Page].

Training & Inference

FoR formulates multi-step reasoning as a flow from an initial state to terminal states. Training proceeds in three steps:

Design a reward $R(s_n)$ on terminal states for each task.
Collect trajectories with a local search technique.
Train the LLM policy $P_{F}$ with the trajectory balance loss.
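As a rough illustration of the third step, the sketch below computes the standard trajectory balance residual from the GFlowNet literature for a single trajectory: $(\log Z + \sum_t \log P_F(s_t \mid s_{t-1}) - \log R(s_n) - \sum_t \log P_B(s_{t-1} \mid s_t))^2$. It is not taken from this repository. The function name, its arguments, and the default of treating the backward policy $P_B$ as 1 (which holds when every state has a single parent) are assumptions made for the example.

```python
import math


def trajectory_balance_loss(log_z, log_pf_steps, log_reward, log_pb_steps=None):
    """Squared trajectory balance residual for one trajectory.

    Hypothetical helper for illustration only (not the repository's API).
    log_z        : log of the estimated partition function Z
    log_pf_steps : per-step log P_F(s_t | s_{t-1}) along the trajectory
    log_reward   : log R(s_n) of the terminal state
    log_pb_steps : per-step log P_B(s_{t-1} | s_t); assumed 0 (P_B = 1) if omitted
    """
    if log_pb_steps is None:
        # Assumption: each state has a unique parent, so log P_B = 0.
        log_pb_steps = [0.0] * len(log_pf_steps)
    residual = log_z + sum(log_pf_steps) - log_reward - sum(log_pb_steps)
    return residual ** 2


# Toy trajectory with two steps taken with probabilities 0.5 and 0.25.
log_pf = [math.log(0.5), math.log(0.25)]

# With Z = 8 and R(s_n) = 1 the flow is balanced, so the loss is (numerically) zero.
balanced = trajectory_balance_loss(math.log(8.0), log_pf, math.log(1.0))

# With a mismatched Z the residual is nonzero and would drive a gradient update.
unbalanced = trajectory_balance_loss(math.log(2.0), log_pf, math.log(1.0))
```

In actual training the per-step log-probabilities would come from the LLM policy and the loss would be minimized by gradient descent over sampled trajectories; the plain-float version here only shows how the terms combine.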

Code

1) Clone this repository

git clone https://github.com/Yu-Fangxu/FoR.git

2) Prepare the environment

We recommend conda for setting up a reproducible experiment environment. The repository includes environment.yaml, which describes the working environment, and an install.sh script:

bash install.sh

3) Choose 1 of 5 tasks to run

cd BlocksWorld   # or Game24, prontoqa, 1D-ARC, "Rubik's_Cube"

See each task directory for more detailed instructions.

Citation

@article{yu2024flow,
  title={Flow of Reasoning: Efficient Training of LLM Policy with Divergent Thinking},
  author={Yu, Fangxu and Jiang, Lai and Kang, Haoqiang and Hao, Shibo and Qin, Lianhui},
  journal={arXiv preprint arXiv:2406.05673},
  year={2024}
}
