Saving error finetuning stable diffusion LoRA #2548

Closed
sr5434 opened this issue Mar 3, 2023 · 17 comments

@sr5434

sr5434 commented Mar 3, 2023

Describe the bug

After the Stable Diffusion model finishes training, an error occurs when the LoRA weights are saved:

Traceback (most recent call last):
  File "train_text_to_image_lora.py", line 872, in <module>
    main()
  File "train_text_to_image_lora.py", line 825, in main
    unet.save_attn_procs(args.output_dir)
  File "/usr/local/lib/python3.8/dist-packages/diffusers/loaders.py", line 273, in save_attn_procs
    weights_no_suffix = weights_name.replace(".bin", "")
AttributeError: 'NoneType' object has no attribute 'replace'

After this error occurs, the weights for the model aren't saved.

Reproduction

I ran the below command in Google Colab:

accelerate launch train_text_to_image_lora.py --mixed_precision="fp16" --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" --dataset_name="Norod78/microsoft-fluentui-emoji-512-whitebg" --caption_column="text" --resolution=512 --train_batch_size=2 --num_train_epochs=1 --output_dir="./sd-model-finetuned-lora" --max_train_steps=1000 --checkpointing_steps=500 --learning_rate=1e-04 --lr_scheduler="constant" --lr_warmup_steps=0 --seed=42

Logs

Traceback (most recent call last):
  File "train_text_to_image_lora.py", line 872, in <module>
    main()
  File "train_text_to_image_lora.py", line 825, in main
    unet.save_attn_procs(args.output_dir)
  File "/usr/local/lib/python3.8/dist-packages/diffusers/loaders.py", line 273, in save_attn_procs
    weights_no_suffix = weights_name.replace(".bin", "")
AttributeError: 'NoneType' object has no attribute 'replace'

System Info

Google Colab, running the latest version of diffusers.

Is this a bug, or is there something I have to do on my end to fix this?

@sr5434 sr5434 added the bug Something isn't working label Mar 3, 2023
@StarrickLiu

I have the same issue.
How can I fix it, please?

@wangjuns

wangjuns commented Mar 6, 2023

Me too.
It seems that moving the code below so that it runs before the clean-up code in diffusers/loaders.py will fix the bug.

if weights_name is None:
    if safe_serialization:
        weights_name = LORA_WEIGHT_NAME_SAFE
    else:
        weights_name = LORA_WEIGHT_NAME
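
A minimal sketch of the ordering suggested above, not the actual diffusers code: the default file name is resolved first, so the later `.replace(".bin", "")` call from the traceback never runs on `None`. The function names `resolve_weights_name` and `checkpoint_stem` are hypothetical, and the two constant values are assumptions (the `.bin` name matches the file path that appears later in this thread).

```python
from typing import Optional

# Assumed default file names; only the .bin name is visible in this thread.
LORA_WEIGHT_NAME = "pytorch_lora_weights.bin"
LORA_WEIGHT_NAME_SAFE = "pytorch_lora_weights.safetensors"


def resolve_weights_name(weights_name: Optional[str] = None,
                         safe_serialization: bool = False) -> str:
    """Return a concrete file name, falling back to a default when none is given."""
    if weights_name is None:
        if safe_serialization:
            weights_name = LORA_WEIGHT_NAME_SAFE
        else:
            weights_name = LORA_WEIGHT_NAME
    return weights_name


def checkpoint_stem(weights_name: Optional[str] = None,
                    safe_serialization: bool = False) -> str:
    """Hypothetical stand-in for the clean-up step that strips the ".bin" suffix."""
    # Resolve the default BEFORE touching the string, so None is never dereferenced.
    name = resolve_weights_name(weights_name, safe_serialization)
    return name.replace(".bin", "")
```

With this ordering, calling `checkpoint_stem()` without arguments works the same way as passing an explicit name such as `"xyz.bin"`.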

@wangjuns

wangjuns commented Mar 6, 2023

I have the same issue. How can I fix it, please?

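You can bypass this bug by modifying the training script. The diff below is against train_dreambooth_lora.py; the same save_attn_procs call appears in train_text_to_image_lora.py.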

diff --git a/examples/dreambooth/train_dreambooth_lora.py b/examples/dreambooth/train_dreambooth_lora.py
index c9321982..db26a1bf 100644
--- a/examples/dreambooth/train_dreambooth_lora.py
+++ b/examples/dreambooth/train_dreambooth_lora.py
@@ -987,7 +987,7 @@ def main(args):
     accelerator.wait_for_everyone()
     if accelerator.is_main_process:
         unet = unet.to(torch.float32)
-        unet.save_attn_procs(args.output_dir)
+        unet.save_attn_procs(args.output_dir, weights_name='xyz.bin')

         # Final inference
         # Load previous pipeline
@@ -998,7 +998,7 @@ def main(args):
         pipeline = pipeline.to(accelerator.device)

         # load attention processors
-        pipeline.unet.load_attn_procs(args.output_dir)
+        pipeline.unet.load_attn_procs(args.output_dir, weights_name='xyz.bin')

         # run inference
         if args.validation_prompt and args.num_validation_images > 0:

@sr5434
Author

sr5434 commented Mar 6, 2023

Thank you, that seems to have fixed the problem.

@douo

douo commented Mar 8, 2023

pipeline.unet.load_attn_procs(args.output_dir, weights_name='xyz.bin')

For load_attn_procs, the keyword seems to be weight_name (ref), not weights_name.
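
To illustrate why the spelling matters: the `load_attn_procs` signature shown in the traceback further down accepts `**kwargs`, so a misspelled keyword raises no error and is simply not picked up. The sketch below uses a hypothetical stand-in function, `load_sketch`, and assumes the loader reads its file name from a `weight_name` key; it is not the library's code.

```python
from typing import Any, Optional


def load_sketch(path: str, **kwargs: Any) -> Optional[str]:
    """Hypothetical loader that returns the file name it would try to open."""
    # Assumption: the file name is read from the "weight_name" key only.
    weight_name = kwargs.pop("weight_name", None)
    # Any other key, including the misspelled "weights_name", is left unused.
    return weight_name


# The misspelled keyword is accepted without complaint, but the name stays unset.
ignored = load_sketch("./sd-model-finetuned-lora", weights_name="xyz.bin")
# The correctly spelled keyword is picked up.
used = load_sketch("./sd-model-finetuned-lora", weight_name="xyz.bin")
```

Here `ignored` ends up as `None` while `used` holds `"xyz.bin"`, which is consistent with the load step still failing after the workaround.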

@sr5434
Author

sr5434 commented Mar 8, 2023

@wangjuns you should consider opening a PR with those changes.

@sr5434
Author

sr5434 commented Mar 11, 2023

The error also occurs when loading weights:

AttributeError                            Traceback (most recent call last)
<ipython-input-2-417d63d04d54> in <module>
      4 model_path = "/content/sd-model-finetuned-lora/pytorch_lora_weights.bin"
      5 pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
----> 6 pipe.unet.load_attn_procs(model_path)
      7 pipe.to("cuda")
      8 

/usr/local/lib/python3.9/dist-packages/diffusers/loaders.py in load_attn_procs(self, pretrained_model_name_or_path_or_dict, **kwargs)
    151         model_file = None
    152         if not isinstance(pretrained_model_name_or_path_or_dict, dict):
--> 153             if (is_safetensors_available() and weight_name is None) or weight_name.endswith(".safetensors"):
    154                 if weight_name is None:
    155                     weight_name = LORA_WEIGHT_NAME_SAFE

AttributeError: 'NoneType' object has no attribute 'endswith'
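
The condition on line 153 of the traceback evaluates `weight_name.endswith(...)` whenever the first clause is false, which includes the case where safetensors is not installed and `weight_name` is `None`. A None-safe version of that check might look like the sketch below. This is an assumption about one possible fix, not the change that was actually merged; `pick_weight_file` and its `safetensors_available` parameter are hypothetical names, and the constant values are assumed.

```python
from typing import Optional

# Assumed default file names; only the .bin name is visible in this thread.
LORA_WEIGHT_NAME = "pytorch_lora_weights.bin"
LORA_WEIGHT_NAME_SAFE = "pytorch_lora_weights.safetensors"


def pick_weight_file(weight_name: Optional[str], safetensors_available: bool) -> str:
    """Choose which weight file to load without dereferencing None."""
    wants_safetensors = (
        (safetensors_available and weight_name is None)
        # Guard the method call so it only runs on an actual string.
        or (weight_name is not None and weight_name.endswith(".safetensors"))
    )
    if wants_safetensors:
        return weight_name or LORA_WEIGHT_NAME_SAFE
    return weight_name or LORA_WEIGHT_NAME
```

With the guard in place, `pick_weight_file(None, False)` falls through to the `.bin` default instead of raising `AttributeError`.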

@patrickvonplaten
Contributor

Looks like we need to fix something here! @sr5434 could you add a link to the trained LoRA checkpoints? Also cc @sayakpaul, could you take a look at this? :-)

@sayakpaul
Member

This is related to #2616.

@wfng92 proposed a nice suggestion here: #2616 (comment).

The easiest workaround for now seems to be #2616 (comment).

@patrickvonplaten
Contributor

patrickvonplaten commented Mar 13, 2023

Does this fix the problem: https://github.com/huggingface/diffusers/pull/2655/files ?

@sr5434
Author

sr5434 commented Mar 13, 2023

@sayakpaul I tried the workaround, but now the error occurs at load time. @patrickvonplaten I will send the LoRA weights as soon as I can.

@0xdigiscore

@patrickvonplaten

!pip install git+https://github.com/huggingface/diffusers@correct_lora_saving_loading
!accelerate launch --mixed_precision="fp16" train_text_to_image_lora.py \
  --pretrained_model_name_or_path="Linaqruf/anything-v3.0" \
  --dataset_name="ethers/azuki-datasets" --caption_column="text" \
  --resolution=512 --random_flip \
  --train_batch_size=1 \
  --max_train_steps=1 \
  --checkpointing_steps=1000 \
  --validation_epochs=100 \
  --num_validation_images=0 \
  --output_dir="/content/drive/MyDrive/differ-out" \
  --learning_rate=1e-04 --lr_scheduler="constant" --lr_warmup_steps=0 \
  --seed=42 \
  --max_train_samples=10

An error is still reported when the training result is saved in colab.
image

@0xdigiscore

0xdigiscore commented Mar 14, 2023

@sayakpaul
Member

@0xhelloweb3 might have been an improper dependency resolution problem in your Colab.

Mine worked fine: https://colab.research.google.com/gist/sayakpaul/a6640268902e3e75aef58b5c9f06f042/scratchpad.ipynb

@0xdigiscore

@0xhelloweb3 might have been an improper dependency resolution problem in your Colab.

Mine worked fine: https://colab.research.google.com/gist/sayakpaul/a6640268902e3e75aef58b5c9f06f042/scratchpad.ipynb

Thanks, the problem has been solved and it works normally.

@sayakpaul
Member

@sr5434 could you check if #2655 solves your issue and close this issue accordingly?

@sr5434
Author

sr5434 commented Mar 17, 2023

Yes, #2655 fixes the issue. Thank you @patrickvonplaten for creating the PR.

@sr5434 sr5434 closed this as completed Mar 17, 2023