Fix SFT for VLM example #1865

qgallouedec · 2024-07-23T19:14:36Z

Closes #1786

For some reason, when running sft with llava 1.5 without PEFT, you get this annoying errors:

loss explosion because of high grad norm in the visual encoder; you need to specify a low max_grad_norm (0.1 at least)
when training with multiple GPUs, you get OOM error (even with batch size = 1). Not sure where it comes from

Consequently,

I removed the example script without peft
fixed the command args, in particular the use of bf16

TODO:

Add a section in SFT documentation with link to run, command, a few remarks, and evaluations results

HuggingFaceDocBuilderDev · 2024-07-23T19:19:27Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

qgallouedec · 2024-07-23T20:27:23Z

Not related to this PR:
CI breaks with transformers-4.43.1
(passes with 4.39.2)

commit 8bd2ab8 Author: Quentin Gallouédec <[email protected]> Date: Sun Jul 28 14:06:19 2024 +0200 Refactor judges (#1856) * BaseJudge -> BasePairwiseJudge * hf judge asyncio * refactor judges * doc * doc * doc * memeber judge * :inherited-members: * :inherited-members: * doc * give up * judge tldr with judge class * fix rank in multithread * format * improve doc * update doc * typo doc * doc online dpo * Update judge_tldr.py --------- Co-authored-by: Quentin Gallouédec <[email protected]> commit 82b07d6 Author: Quentin Gallouédec <[email protected]> Date: Fri Jul 26 11:43:48 2024 +0200 Llama in modelling value head tests (#1878) commit 72bf6c2 Author: Quentin Gallouédec <[email protected]> Date: Fri Jul 26 11:33:07 2024 +0200 Skip BigBird save and load test until next transformers version (#1874) commit 74e54b5 Author: Edward Beeching <[email protected]> Date: Fri Jul 26 09:36:25 2024 +0200 fix online dpo example (#1879) commit 3930973 Author: Rishav Dash <[email protected]> Date: Thu Jul 25 14:17:37 2024 +0530 Bug Fix while training using SFTTrainer with DataCollatorForCompletionOnlyLM (#1861) * Bug Fix while training using SFTTrainer with DataCollatorForCompletionOnlyLM Added ```dataset_text_field``` in the SFTConfig while training * Update docs/source/sft_trainer.mdx --------- Co-authored-by: Kashif Rasul <[email protected]> commit db8e09e Author: Rishav Dash <[email protected]> Date: Thu Jul 25 14:06:57 2024 +0530 Import missing ```setup_chat_format``` (#1862) commit 1dae55f Author: elie <[email protected]> Date: Thu Jul 25 10:27:34 2024 +0200 add fsdp_qlora config and bnb_4bit_quant_storage (#1863) commit c8cef79 Author: Quentin Gallouédec <[email protected]> Date: Wed Jul 24 21:06:57 2024 +0200 arXiv to HF Papers (#1870) commit 7dcf437 Author: Kashif Rasul <[email protected]> Date: Wed Jul 24 12:27:50 2024 +0200 [online-DPO] online dpo cleanups (#1864) * online dpo cleanups * remove unused self.policy * add OnlineDPOTrainer and config to __init__.py * import from trainer * online dpo test * rename policy to model and ref_policy to ref_model * renamed internally * formatting commit 4e85bd7 Author: Costa Huang <[email protected]> Date: Thu Jul 18 14:35:31 2024 -0400 Online DPO and Online trainer refactor (#1809) * online dpo trainer based on rloo trainer * push changes * refactor * use `batch_generation` method * precommit * remove breakpoint() * quick refactor * push the current changes * quick change * refactor * use the config name as the experiment name * fix logging * update online DPO docs * push docs * increment global step so tensorboard works again. * precommit * remove unused common online trainer * add online DPO docs * quick refactor * push changes * Update docs/source/online_dpo_trainer.md Co-authored-by: Quentin Gallouédec <[email protected]> --------- Co-authored-by: Michael Noukhovitch <[email protected]> Co-authored-by: Quentin Gallouédec <[email protected]> commit c9d5636 Author: Quentin Gallouédec <[email protected]> Date: Thu Jul 18 18:28:49 2024 +0200 rm token (#1852)

commit 890232f Author: Quentin Gallouédec <[email protected]> Date: Tue Jul 30 14:29:47 2024 +0200 update example overview (#1883) Co-authored-by: Quentin Gallouédec <[email protected]> commit 9929370 Author: Clara Pohland <[email protected]> Date: Sun Jul 28 21:10:08 2024 +0200 Move BCO to separate BCOTrainer with fixes (#1869) * kto_trainer: skip KL data for BCO * kto_trainer: BCO allow no positives or no negatives in batch * kto_trainer: make RunningMoments object serializable * add BCOTrainer * fix BCO UDM for not interleaved data * kto_trainer: remove unused UDM part * bco_trainer: add tests and docs, minor fixes * code style fixes * Update docs/source/bco_trainer.mdx Co-authored-by: Kashif Rasul <[email protected]> * fix BCO UDM for bfloat16 * Update trl/trainer/bco_config.py * Update trl/trainer/bco_config.py Co-authored-by: Seungjae Jung <[email protected]> * Update trl/trainer/utils.py Co-authored-by: Seungjae Jung <[email protected]> * Update trl/trainer/bco_trainer.py Co-authored-by: Seungjae Jung <[email protected]> * Update trl/trainer/bco_config.py * Update _toctree.yml * Update trl/trainer/bco_config.py * Update trl/trainer/bco_trainer.py * RunningMoments, fix multi GPU serialization * fix tests --------- Co-authored-by: Clara Luise Pohland <[email protected]> Co-authored-by: Kashif Rasul <[email protected]> Co-authored-by: Seungjae Jung <[email protected]> commit 6171cdd Author: Quentin Gallouédec <[email protected]> Date: Sun Jul 28 15:51:38 2024 +0200 Re-add BigBird Pegasus save/load test (#1882) Co-authored-by: Quentin Gallouédec <[email protected]> commit 33d2151 Author: Quentin Gallouédec <[email protected]> Date: Sun Jul 28 15:07:10 2024 +0200 Re-add BigBird Pegasus save/load test (#1876) * skip bigbird in ci * readd big bird test * pytest parametrize * dont check the version * rm model name * re add big bird * Merge branch 'main' into readd-bigbird-save-load-test --------- Co-authored-by: Quentin Gallouédec <[email protected]> commit 8bd2ab8 Author: Quentin Gallouédec <[email protected]> Date: Sun Jul 28 14:06:19 2024 +0200 Refactor judges (#1856) * BaseJudge -> BasePairwiseJudge * hf judge asyncio * refactor judges * doc * doc * doc * memeber judge * :inherited-members: * :inherited-members: * doc * give up * judge tldr with judge class * fix rank in multithread * format * improve doc * update doc * typo doc * doc online dpo * Update judge_tldr.py --------- Co-authored-by: Quentin Gallouédec <[email protected]> commit 82b07d6 Author: Quentin Gallouédec <[email protected]> Date: Fri Jul 26 11:43:48 2024 +0200 Llama in modelling value head tests (#1878) commit 72bf6c2 Author: Quentin Gallouédec <[email protected]> Date: Fri Jul 26 11:33:07 2024 +0200 Skip BigBird save and load test until next transformers version (#1874) commit 74e54b5 Author: Edward Beeching <[email protected]> Date: Fri Jul 26 09:36:25 2024 +0200 fix online dpo example (#1879) commit 3930973 Author: Rishav Dash <[email protected]> Date: Thu Jul 25 14:17:37 2024 +0530 Bug Fix while training using SFTTrainer with DataCollatorForCompletionOnlyLM (#1861) * Bug Fix while training using SFTTrainer with DataCollatorForCompletionOnlyLM Added ```dataset_text_field``` in the SFTConfig while training * Update docs/source/sft_trainer.mdx --------- Co-authored-by: Kashif Rasul <[email protected]> commit db8e09e Author: Rishav Dash <[email protected]> Date: Thu Jul 25 14:06:57 2024 +0530 Import missing ```setup_chat_format``` (#1862) commit 1dae55f Author: elie <[email protected]> Date: Thu Jul 25 10:27:34 2024 +0200 add fsdp_qlora config and bnb_4bit_quant_storage (#1863) commit c8cef79 Author: Quentin Gallouédec <[email protected]> Date: Wed Jul 24 21:06:57 2024 +0200 arXiv to HF Papers (#1870) commit 7dcf437 Author: Kashif Rasul <[email protected]> Date: Wed Jul 24 12:27:50 2024 +0200 [online-DPO] online dpo cleanups (#1864) * online dpo cleanups * remove unused self.policy * add OnlineDPOTrainer and config to __init__.py * import from trainer * online dpo test * rename policy to model and ref_policy to ref_model * renamed internally * formatting commit 4e85bd7 Author: Costa Huang <[email protected]> Date: Thu Jul 18 14:35:31 2024 -0400 Online DPO and Online trainer refactor (#1809) * online dpo trainer based on rloo trainer * push changes * refactor * use `batch_generation` method * precommit * remove breakpoint() * quick refactor * push the current changes * quick change * refactor * use the config name as the experiment name * fix logging * update online DPO docs * push docs * increment global step so tensorboard works again. * precommit * remove unused common online trainer * add online DPO docs * quick refactor * push changes * Update docs/source/online_dpo_trainer.md Co-authored-by: Quentin Gallouédec <[email protected]> --------- Co-authored-by: Michael Noukhovitch <[email protected]> Co-authored-by: Quentin Gallouédec <[email protected]> commit c9d5636 Author: Quentin Gallouédec <[email protected]> Date: Thu Jul 18 18:28:49 2024 +0200 rm token (#1852)

examples/scripts/vsft_llava.py

qgallouedec · 2024-08-01T06:50:11Z

I'm still not sure how to evaluate, given that open-compass/VLMEvalKit is not compatible with llava 1.5. I'll do this later.
And the previous code for evaluation is not working:

## Evaluation:

To evaluate, first install the lmms-eval framework: pip install git+https://github.com/EvolvingLMMs-Lab/lmms-eval.git
then run:
accelerate launch --num_processes=8 -m lmms_eval \
    --model llava_hf \
    --model_args pretrained llava-hf/llava-1.5-7b-hf \
    --tasks mmbench \
    --batch_size 1 \
    --output_path ./logs/ \
    --log_sample

No module named 'llava.data'

Removing it until we find a working point

fix vsft example commands

9e892dd

fix use_cache and get tokenizer from processor

4517def

kashif approved these changes Jul 23, 2024

View reviewed changes

qgallouedec added 7 commits July 28, 2024 12:12

rm unused AutoTokenizer

907f9f5

add section in doc

6a71da2

Merge branch 'main' into sft_vlm_fix

181861f

simplify script

ded2386

doc

8d61214

kashif reviewed Jul 31, 2024

View reviewed changes

examples/scripts/vsft_llava.py Outdated Show resolved Hide resolved

qgallouedec and others added 5 commits July 31, 2024 15:46

use traning args

1eddc08

args instead of trianing args

512466e

fix doc

adc8a81

drop eval

767743a

Merge branch 'main' into sft_vlm_fix

e3e2b0e

rm eval section

d7b7b97

qgallouedec marked this pull request as ready for review August 1, 2024 07:02

re-add bigbirg

4f5cdd9

qgallouedec merged commit df12913 into main Aug 2, 2024
5 of 10 checks passed

qgallouedec deleted the sft_vlm_fix branch August 2, 2024 08:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix SFT for VLM example #1865

Fix SFT for VLM example #1865

qgallouedec commented Jul 23, 2024 •

edited

Loading

HuggingFaceDocBuilderDev commented Jul 23, 2024

qgallouedec commented Jul 23, 2024 •

edited

Loading

qgallouedec commented Aug 1, 2024 •

edited

Loading

Fix SFT for VLM example #1865

Fix SFT for VLM example #1865

Conversation

qgallouedec commented Jul 23, 2024 • edited Loading

HuggingFaceDocBuilderDev commented Jul 23, 2024

qgallouedec commented Jul 23, 2024 • edited Loading

qgallouedec commented Aug 1, 2024 • edited Loading

qgallouedec commented Jul 23, 2024 •

edited

Loading

qgallouedec commented Jul 23, 2024 •

edited

Loading

qgallouedec commented Aug 1, 2024 •

edited

Loading