This repository contains the code for the paper "Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process".
The code is based on the `eric-mitchell/direct-preference-optimization` repository.
Install the dependencies and launch IFT training with:

```bash
pip install -r requirements.txt
bash commands/run_mistral_ift.sh
```
- **Temporal Residual Connection**:
  - `lambda_schedule`: The schedule mode of `lambda`. The default is `null`, which means the static mode; a `linear` mode is also provided for the dynamic mode.
  - `min_lambda` & `max_lambda`: The minimum and maximum values of `lambda`. Both default to 0.2, which corresponds to the static mode. If `lambda_schedule` is set to `linear`, `min_lambda` and `max_lambda` control the start and end values of `lambda` during training.
  - `lambda_disturb`: The disturbance distribution of `lambda`. The default is `null`, which means no disturbance; a `normal` mode is also provided.
  - `disturb_std`: The standard deviation of the `lambda_disturb` noise. This hyperparameter only takes effect when `lambda_disturb` is not `null`. A minimal sketch of how these options might interact is given after the hyperparameter lists below.
- **Relation Propagation**:
  - `gamma`: The decay factor of the Relation Propagation. The default is 0.95.
  - `propagation_type`: The variable that the Relation Propagation is applied to. The default is `loss`; `mask` and `logps` are also provided.
  - `propagation_side`: The side of the Relation Propagation. The default is `left`; `right` is also provided.
  - `propagation_norm`: The normalization mode of the Relation Propagation. The default is `L1`; `L2`, `softmax`, and `log` are also provided.
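As a rough illustration of the Temporal Residual Connection options, the sketch below shows one way `lambda` could be computed per training step. The function name `get_lambda`, the `step`/`total_steps` arguments, and the `disturb_std` default are illustrative assumptions, not the repository's actual implementation.

```python
import random

def get_lambda(step, total_steps, lambda_schedule=None, min_lambda=0.2, max_lambda=0.2,
               lambda_disturb=None, disturb_std=0.0):
    """Illustrative sketch: pick the lambda value for one training step."""
    if lambda_schedule == "linear":
        # Dynamic mode: interpolate linearly from min_lambda to max_lambda over training.
        progress = step / max(total_steps - 1, 1)
        lam = min_lambda + (max_lambda - min_lambda) * progress
    else:
        # Static mode (lambda_schedule = null): lambda stays at its default value.
        lam = min_lambda
    if lambda_disturb == "normal":
        # Perturb lambda with zero-mean Gaussian noise of standard deviation disturb_std.
        lam += random.gauss(0.0, disturb_std)
    return lam
```

Similarly, the Relation Propagation options can be read as a gamma-decayed, one-sided, normalized mixing of per-token quantities (the loss, the mask, or the log-probabilities, depending on `propagation_type`). The sketch below is only one possible reading: the `propagate` function, its kernel construction, and the omission of the `log` normalization mode are assumptions rather than the repository's implementation.

```python
import torch

def propagate(values, gamma=0.95, propagation_side="left", propagation_norm="L1"):
    """Illustrative sketch: mix per-token values (shape [batch, seq_len]) with a
    gamma-decayed kernel over one side of the sequence, normalizing the kernel rows."""
    seq_len = values.shape[-1]
    idx = torch.arange(seq_len, dtype=values.dtype, device=values.device)
    dist = idx.unsqueeze(0) - idx.unsqueeze(1)  # dist[i, j] = j - i
    decay = gamma ** dist.abs()                 # gamma^{|i - j|}

    if propagation_side == "left":
        kernel = torch.where(dist <= 0, decay, torch.zeros_like(decay))  # positions j <= i
    else:  # "right"
        kernel = torch.where(dist >= 0, decay, torch.zeros_like(decay))  # positions j >= i

    if propagation_norm == "L1":
        kernel = kernel / kernel.sum(dim=-1, keepdim=True)
    elif propagation_norm == "L2":
        kernel = kernel / kernel.norm(dim=-1, keepdim=True)
    elif propagation_norm == "softmax":
        kernel = torch.softmax(kernel.masked_fill(kernel == 0, float("-inf")), dim=-1)

    # Each output position becomes a decayed, normalized mixture of its neighbours.
    return values @ kernel.transpose(-2, -1)
```

For example, `propagate(per_token_loss, gamma=0.95, propagation_side="left", propagation_norm="L1")` would make each position's value a decaying, L1-normalized mixture of the values at positions to its left (including itself).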
If you find IFT useful in your research, please consider citing the following paper:
```bibtex
@article{hua2024intuitive,
  title={Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process},
  author={Hua, Ermo and Qi, Biqing and Zhang, Kaiyan and Yu, Yue and Ding, Ning and Lv, Xingtai and Tian, Kai and Zhou, Bowen},
  journal={arXiv preprint arXiv:2405.11870},
  year={2024}
}
```