deepspeed-chat: train v_head when only optimizing lora #758

Merged
2 commits merged into microsoft:master on Oct 16, 2023

Conversation

mosheisland
Contributor

When using only_optimize_lora, we still need to train the v_head parameter.

Change-Id: I252c3ee69819997bf336482c6779b070f2e76df8
Signed-off-by: Moshe Island <[email protected]>

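For context, the only_optimize_lora option in deepspeed-chat freezes every parameter except the LoRA weights. For the critic/reward model this would also freeze the freshly initialized value head (v_head), which has no pretrained weights and therefore must remain trainable. The sketch below illustrates the intended behavior; the helper name only_optimize_lora_parameters, the force_optimize_params argument, and the LoRA parameter names are assumptions for illustration, not necessarily the exact code in the merged patch.

```python
import torch.nn as nn


def only_optimize_lora_parameters(model: nn.Module, force_optimize_params=None):
    """Freeze all parameters except LoRA weights and any explicitly
    forced parameters (e.g. the reward model's v_head)."""
    force_optimize_params = force_optimize_params or []
    for name, param in model.named_parameters():
        # LoRA parameter names assumed to follow deepspeed-chat's LoRA layer.
        is_lora = "lora_right_weight" in name or "lora_left_weight" in name
        # Keep any explicitly listed parameters trainable as well.
        is_forced = any(key in name for key in force_optimize_params)
        param.requires_grad = is_lora or is_forced
    return model


# Hypothetical usage for the critic/reward model: keep v_head trainable
# even when only LoRA parameters are being optimized.
# critic_model = only_optimize_lora_parameters(
#     critic_model, force_optimize_params=["v_head.weight"])
```

Without such an escape hatch for forced parameters, the v_head of the critic/reward model would never receive gradient updates when only_optimize_lora is enabled.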
@tjruwase merged commit 5161c0f into microsoft:master on Oct 16, 2023
2 checks passed
@mosheisland deleted the 8_train_v_head_lora branch on October 17, 2023 06:40
3 participants