AttributeError: 'NoneType' object has no attribute 'replace' #2669

Closed
killermyth opened this issue Mar 14, 2023 · 8 comments
Labels
bug Something isn't working

Comments


killermyth commented Mar 14, 2023

Describe the bug

I followed the DreamBooth LoRA training guide at https://github.com/huggingface/diffusers/tree/main/examples/dreambooth#training-with-low-rank-adaptation-of-large-language-models-lora.
When I run accelerate launch train_dreambooth_lora.py, I get the error below:

Steps: 100%|█████████████████████████████████████████| 500/500 [03:29<00:00, 3.37it/s, loss=0.178, lr=0.0001]Traceback (most recent call last):
  File "/home/jovyan/lora/diffusers/examples/dreambooth/train_dreambooth_lora.py", line 1045, in <module>
    main(args)
  File "/home/jovyan/lora/diffusers/examples/dreambooth/train_dreambooth_lora.py", line 995, in main
    unet.save_attn_procs(args.output_dir)
  File "/home/jovyan/lora/diffusers/src/diffusers/loaders.py", line 273, in save_attn_procs
    weights_no_suffix = weights_name.replace(".bin", "")
AttributeError: 'NoneType' object has no attribute 'replace'
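
For context, the failure is that weights_name is still None inside save_attn_procs when the code reaches the .replace(".bin", "") call. A minimal workaround sketch, assuming save_attn_procs at this diffusers commit accepts a weights_name keyword (an assumption, to be verified against your checkout of src/diffusers/loaders.py), is to pass the file name explicitly:

    # Workaround sketch (hedged): give save_attn_procs an explicit file name so it
    # never calls .replace() on a None default.
    # NOTE: the weights_name keyword is assumed, not verified, for this diffusers commit.
    unet.save_attn_procs(args.output_dir, weights_name="pytorch_lora_weights.bin")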

Reproduction

accelerate launch train_dreambooth_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="a photo of sks dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --checkpointing_steps=100 \
  --learning_rate=1e-4 \
  --report_to="wandb" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=500 \
  --validation_prompt="A photo of sks dog in a bucket" \
  --validation_epochs=50 \
  --seed="0"

Logs

03/14/2023 19:25:32 - INFO - __main__ - ***** Running training *****
03/14/2023 19:25:32 - INFO - __main__ -   Num examples = 5
03/14/2023 19:25:32 - INFO - __main__ -   Num batches each epoch = 5
03/14/2023 19:25:32 - INFO - __main__ -   Num Epochs = 100
03/14/2023 19:25:32 - INFO - __main__ -   Instantaneous batch size per device = 1
03/14/2023 19:25:32 - INFO - __main__ -   Total train batch size (w. parallel, distributed & accumulation) = 1
03/14/2023 19:25:32 - INFO - __main__ -   Gradient Accumulation steps = 1
03/14/2023 19:25:32 - INFO - __main__ -   Total optimization steps = 500
Steps:   1%|| 5/500 [00:03<03:32,  2.33it/s, loss=0.0734, lr=0.0001]03/14/2023 19:25:36 - INFO - __main__ - Running validation...
 Generating 4 images with prompt: A photo of sks dog in a bucket.
{'requires_safety_checker'} was not found in config. Values will be initialized to default values.
/home/jovyan/my-conda-envs/dreambooth/lib/python3.10/site-packages/transformers/models/clip/feature_extraction_clip.py:28: FutureWarning: The class CLIPFeatureExtractor is deprecated and will be removed in version 5 of Transformers. Please use CLIPImageProcessor instead.
  warnings.warn(
{'scaling_factor'} was not found in config. Values will be initialized to default values.
{'prediction_type'} was not found in config. Values will be initialized to default values.
{'sample_max_value', 'solver_type', 'lower_order_final', 'thresholding', 'algorithm_type', 'solver_order', 'dynamic_thresholding_ratio'} was not found in config. Values will be initialized to default values.
Steps:  20%|████████                                | 100/500 [00:50<01:57,  3.39it/s, loss=0.0164, lr=0.0001]03/14/2023 19:26:23 - INFO - accelerate.accelerator - Saving current state to /home/jovyan/lora/output-model/checkpoint-100
03/14/2023 19:26:23 - INFO - accelerate.checkpointing - Model weights saved in /home/jovyan/lora/output-model/checkpoint-100/pytorch_model.bin
03/14/2023 19:26:23 - INFO - accelerate.checkpointing - Optimizer state saved in /home/jovyan/lora/output-model/checkpoint-100/optimizer.bin
03/14/2023 19:26:23 - INFO - accelerate.checkpointing - Scheduler state saved in /home/jovyan/lora/output-model/checkpoint-100/scheduler.bin
03/14/2023 19:26:23 - INFO - accelerate.checkpointing - Gradient scaler state saved in /home/jovyan/lora/output-model/checkpoint-100/scaler.pt
03/14/2023 19:26:23 - INFO - accelerate.checkpointing - Random states saved in /home/jovyan/lora/output-model/checkpoint-100/random_states_0.pkl
03/14/2023 19:26:23 - INFO - accelerate.checkpointing - Saving the state of AttnProcsLayers to /home/jovyan/lora/output-model/checkpoint-100/custom_checkpoint_0.pkl
03/14/2023 19:26:24 - INFO - __main__ - Saved state to /home/jovyan/lora/output-model/checkpoint-100
Steps:  40%|████████████████▍                        | 200/500 [01:24<01:27,  3.43it/s, loss=0.014, lr=0.0001]03/14/2023 19:26:57 - INFO - accelerate.accelerator - Saving current state to /home/jovyan/lora/output-model/checkpoint-200
03/14/2023 19:26:57 - INFO - accelerate.checkpointing - Model weights saved in /home/jovyan/lora/output-model/checkpoint-200/pytorch_model.bin
03/14/2023 19:26:57 - INFO - accelerate.checkpointing - Optimizer state saved in /home/jovyan/lora/output-model/checkpoint-200/optimizer.bin
03/14/2023 19:26:57 - INFO - accelerate.checkpointing - Scheduler state saved in /home/jovyan/lora/output-model/checkpoint-200/scheduler.bin
03/14/2023 19:26:57 - INFO - accelerate.checkpointing - Gradient scaler state saved in /home/jovyan/lora/output-model/checkpoint-200/scaler.pt
03/14/2023 19:26:57 - INFO - accelerate.checkpointing - Random states saved in /home/jovyan/lora/output-model/checkpoint-200/random_states_0.pkl
03/14/2023 19:26:57 - INFO - accelerate.checkpointing - Saving the state of AttnProcsLayers to /home/jovyan/lora/output-model/checkpoint-200/custom_checkpoint_0.pkl
03/14/2023 19:26:58 - INFO - __main__ - Saved state to /home/jovyan/lora/output-model/checkpoint-200
Steps:  51%|████████████████████▉                    | 255/500 [01:45<01:20,  3.04it/s, loss=0.193, lr=0.0001]03/14/2023 19:27:18 - INFO - __main__ - Running validation...
 Generating 4 images with prompt: A photo of sks dog in a bucket.
{'requires_safety_checker'} was not found in config. Values will be initialized to default values.
{'scaling_factor'} was not found in config. Values will be initialized to default values.
{'prediction_type'} was not found in config. Values will be initialized to default values.
{'sample_max_value', 'solver_type', 'lower_order_final', 'thresholding', 'algorithm_type', 'solver_order', 'dynamic_thresholding_ratio'} was not found in config. Values will be initialized to default values.
Steps:  60%|███████████████████████▍               | 300/500 [02:17<00:58,  3.39it/s, loss=0.00226, lr=0.0001]03/14/2023 19:27:50 - INFO - accelerate.accelerator - Saving current state to /home/jovyan/lora/output-model/checkpoint-300
03/14/2023 19:27:50 - INFO - accelerate.checkpointing - Model weights saved in /home/jovyan/lora/output-model/checkpoint-300/pytorch_model.bin
03/14/2023 19:27:50 - INFO - accelerate.checkpointing - Optimizer state saved in /home/jovyan/lora/output-model/checkpoint-300/optimizer.bin
03/14/2023 19:27:50 - INFO - accelerate.checkpointing - Scheduler state saved in /home/jovyan/lora/output-model/checkpoint-300/scheduler.bin
03/14/2023 19:27:50 - INFO - accelerate.checkpointing - Gradient scaler state saved in /home/jovyan/lora/output-model/checkpoint-300/scaler.pt
03/14/2023 19:27:50 - INFO - accelerate.checkpointing - Random states saved in /home/jovyan/lora/output-model/checkpoint-300/random_states_0.pkl
03/14/2023 19:27:50 - INFO - accelerate.checkpointing - Saving the state of AttnProcsLayers to /home/jovyan/lora/output-model/checkpoint-300/custom_checkpoint_0.pkl
03/14/2023 19:27:51 - INFO - __main__ - Saved state to /home/jovyan/lora/output-model/checkpoint-300
Steps:  80%|████████████████████████████████▊        | 400/500 [02:51<00:29,  3.35it/s, loss=0.139, lr=0.0001]03/14/2023 19:28:24 - INFO - accelerate.accelerator - Saving current state to /home/jovyan/lora/output-model/checkpoint-400
03/14/2023 19:28:24 - INFO - accelerate.checkpointing - Model weights saved in /home/jovyan/lora/output-model/checkpoint-400/pytorch_model.bin
03/14/2023 19:28:24 - INFO - accelerate.checkpointing - Optimizer state saved in /home/jovyan/lora/output-model/checkpoint-400/optimizer.bin
03/14/2023 19:28:24 - INFO - accelerate.checkpointing - Scheduler state saved in /home/jovyan/lora/output-model/checkpoint-400/scheduler.bin
03/14/2023 19:28:24 - INFO - accelerate.checkpointing - Gradient scaler state saved in /home/jovyan/lora/output-model/checkpoint-400/scaler.pt
03/14/2023 19:28:24 - INFO - accelerate.checkpointing - Random states saved in /home/jovyan/lora/output-model/checkpoint-400/random_states_0.pkl
03/14/2023 19:28:24 - INFO - accelerate.checkpointing - Saving the state of AttnProcsLayers to /home/jovyan/lora/output-model/checkpoint-400/custom_checkpoint_0.pkl
03/14/2023 19:28:25 - INFO - __main__ - Saved state to /home/jovyan/lora/output-model/checkpoint-400
Steps: 100%|████████████████████████████████████████| 500/500 [03:25<00:00,  3.41it/s, loss=0.0355, lr=0.0001]03/14/2023 19:28:58 - INFO - accelerate.accelerator - Saving current state to /home/jovyan/lora/output-model/checkpoint-500
03/14/2023 19:28:58 - INFO - accelerate.checkpointing - Model weights saved in /home/jovyan/lora/output-model/checkpoint-500/pytorch_model.bin
03/14/2023 19:28:58 - INFO - accelerate.checkpointing - Optimizer state saved in /home/jovyan/lora/output-model/checkpoint-500/optimizer.bin
03/14/2023 19:28:58 - INFO - accelerate.checkpointing - Scheduler state saved in /home/jovyan/lora/output-model/checkpoint-500/scheduler.bin
03/14/2023 19:28:58 - INFO - accelerate.checkpointing - Gradient scaler state saved in /home/jovyan/lora/output-model/checkpoint-500/scaler.pt
03/14/2023 19:28:58 - INFO - accelerate.checkpointing - Random states saved in /home/jovyan/lora/output-model/checkpoint-500/random_states_0.pkl
03/14/2023 19:28:58 - INFO - accelerate.checkpointing - Saving the state of AttnProcsLayers to /home/jovyan/lora/output-model/checkpoint-500/custom_checkpoint_0.pkl
03/14/2023 19:28:58 - INFO - __main__ - Saved state to /home/jovyan/lora/output-model/checkpoint-500
Steps: 100%|█████████████████████████████████████████| 500/500 [03:25<00:00,  3.41it/s, loss=0.178, lr=0.0001]Traceback (most recent call last):
  File "/home/jovyan/lora/diffusers/examples/dreambooth/train_dreambooth_lora.py", line 1045, in <module>
    main(args)
  File "/home/jovyan/lora/diffusers/examples/dreambooth/train_dreambooth_lora.py", line 995, in main
    unet.save_attn_procs(args.output_dir)
  File "/home/jovyan/lora/diffusers/src/diffusers/loaders.py", line 273, in save_attn_procs
    weights_no_suffix = weights_name.replace(".bin", "")
AttributeError: 'NoneType' object has no attribute 'replace'

System Info

  • diffusers version: 0.15.0.dev0
  • Platform: Linux-3.10.0-1160.el7.x86_64-x86_64-with-glibc2.31
  • Python version: 3.10.6
  • PyTorch version (GPU?): 1.13.1+cu117 (True)
  • Huggingface_hub version: 0.13.1
  • Transformers version: 4.26.1
  • Accelerate version: 0.17.0
  • xFormers version: not installed
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:
killermyth added the bug (Something isn't working) label on Mar 14, 2023
patrickvonplaten (Contributor) commented

Hey @killermyth,

I sadly cannot reproduce the issue. Could it be that yesterday's PR #2655 already fixes it?

patrickvonplaten (Contributor) commented

When I run the script above, the training works just fine.

killermyth (Author) commented

Thank you so much for your help!
Today I pulled the latest code, and now I get a new error, shown below:

Model weights saved in /home/jovyan/lora/output-model/pytorch_lora_weights.bin
{'requires_safety_checker'} was not found in config. Values will be initialized to default values.
{'prediction_type'} was not found in config. Values will be initialized to default values.
{'time_embedding_type', 'conv_out_kernel', 'mid_block_type', 'projection_class_embeddings_input_dim', 'dual_cross_attention', 'timestep_post_act', 'time_cond_proj_dim', 'use_linear_projection', 'upcast_attention', 'resnet_time_scale_shift', 'only_cross_attention', 'num_class_embeds', 'class_embed_type', 'conv_in_kernel'} was not found in config. Values will be initialized to default values.
{'scaling_factor'} was not found in config. Values will be initialized to default values.
{'solver_order', 'solver_type', 'lower_order_final', 'algorithm_type', 'sample_max_value', 'thresholding', 'dynamic_thresholding_ratio'} was not found in config. Values will be initialized to default values.
Traceback (most recent call last):
  File "/home/jovyan/lora/diffusers/examples/dreambooth/train_dreambooth_lora.py", line 1043, in <module>
    main(args)
  File "/home/jovyan/lora/diffusers/examples/dreambooth/train_dreambooth_lora.py", line 1004, in main
    pipeline.unet.load_attn_procs(args.output_dir)
  File "/home/jovyan/lora/diffusers/src/diffusers/loaders.py", line 171, in load_attn_procs
    state_dict = safetensors.torch.load_file(model_file, device="cpu")
  File "/home/jovyan/my-conda-envs/dreambooth/lib/python3.10/site-packages/safetensors/torch.py", line 99, in load_file
    with safe_open(filename, framework="pt", device=device) as f:
safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooLarge
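
For what it's worth, HeaderTooLarge usually means safetensors is being asked to parse a file that was written with torch.save (a plain pickle-based .bin) rather than in the safetensors format. A minimal sketch of the distinction, assuming the trainer wrote pytorch_lora_weights.bin with torch.save:

    import torch
    import safetensors.torch

    lora_path = "/home/jovyan/lora/output-model/pytorch_lora_weights.bin"

    # Works: the file is an ordinary torch-pickled state dict.
    state_dict = torch.load(lora_path, map_location="cpu")

    # Would raise SafetensorError ("Error while deserializing header: HeaderTooLarge"),
    # because the file does not start with a safetensors header.
    # safetensors.torch.load_file(lora_path, device="cpu")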


Ir1d commented Mar 20, 2023

The LoRA code is not runnable for me; I also get the HeaderTooLarge error.

killermyth (Author) commented

I pulled the latest code and reinstalled everything on a new machine, and now it works!


Ir1d commented Mar 21, 2023

@killermyth Can you test your trained model using pipe.unet.load_attn_procs? I can save but can't load when I test what I trained.


csvt32745 commented Mar 24, 2023

> Can you test your trained model using pipe.unet.load_attn_procs? I can save but can't load when I test what I trained.

Just ran into the same issue.
I uninstalled the safetensors package and it somehow works:
pip uninstall safetensors
However, you might have to install it back when you want to save/load .safetensors files.


Updated:

I finally looked into this issue.
Passing the argument use_safetensors=False solves it, since the loading function treats the file as .safetensors by default:
pipe.unet.load_attn_procs('your/lora.bin', use_safetensors=False)
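
For anyone landing here later, a fuller usage sketch of that workaround. The base model id and output directory below are placeholders for your own, and use_safetensors=False is taken from the comment above:

    import torch
    from diffusers import StableDiffusionPipeline

    # Placeholders: substitute the base model you trained against and your LoRA output dir.
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # Force load_attn_procs to read the torch-serialized .bin instead of
    # assuming a .safetensors file.
    pipe.unet.load_attn_procs("/home/jovyan/lora/output-model", use_safetensors=False)

    image = pipe("A photo of sks dog in a bucket", num_inference_steps=25).images[0]
    image.save("sks_dog.png")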

liuxz-cs commented

Sorry, but I have another question: does the warning {'prediction_type'} was not found in config. Values will be initialized to default values. affect the training process? And why does this warning appear?
