LORA training 🤍 #11

4lt3r3go opened this issue Dec 15, 2024 · 3 comments

4lt3r3go commented Dec 15, 2024

Is there any kind soul willing to write a detailed guide on how to do training on Windows for both Hunyuan and LTX?
I've been trying for three days, but unfortunately, I don't understand how to use WSL at all.


PGCRT commented Dec 20, 2024

It's pretty simple. Once WSL2 is installed, create a folder for the training, for example:

E:\HunyuanTrain

Run the Ubuntu console and navigate to this folder:

cd /mnt/e/HunyuanTrain

Info:
Windows drives are typically mounted under /mnt. For example, the E: drive is accessible at /mnt/e.
Note that all paths need to be in Linux format, including the paths inside hunyuan_video.toml and dataset.toml.
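If you're unsure how a Windows path maps into WSL, the wslpath utility that ships with WSL converts it for you:

wslpath 'E:\HunyuanTrain'
# prints: /mnt/e/HunyuanTrain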

Example dataset.toml:

[[directory]]
path = '/mnt/e/Dataset/img/1_faces'
num_repeats = 2

[[directory]]
path = '/mnt/e/Dataset/img/2_eyes'
num_repeats = 5

Example hunyuan_video.toml:

transformer_path = '/mnt/c/ComfyUI_windows_portable/ComfyUI/models/diffusion_models/hunyuan_video_720_fp8_e4m3fn.safetensors'
vae_path = '/mnt/c/ComfyUI_windows_portable/ComfyUI/models/vae/hunyuan_video_vae_bf16.safetensors'
llm_path = '/mnt/c/ComfyUI_windows_portable/ComfyUI/models/LLM/llava-llama-3-8b-text-encoder-tokenizer'
clip_path = '/mnt/c/ComfyUI_windows_portable/ComfyUI/models/clip/clip-vit-large-patch14'

Clone and install the diffusion-pipe repo. If you manage to install it and its requirements without issues, you are good to go.
If you get errors (mostly mismatched or missing dependencies), the easiest way to fix them is to copy-paste the terminal error into ChatGPT to troubleshoot it.
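A minimal install sketch, assuming the upstream tdrussell/diffusion-pipe repository and a Python venv (the URL and the --recurse-submodules flag are assumptions; check the repo's README for the exact steps):

cd /mnt/e/HunyuanTrain
git clone --recurse-submodules https://github.com/tdrussell/diffusion-pipe   # assumed repo URL
cd diffusion-pipe
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt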

Once it's properly installed, all you have to do is edit dataset.toml and hunyuan_video.toml.
Then start the training by reopening the Ubuntu console:

Navigate to your trainer path, example:
cd /mnt/e/HunyuanTrain/diffusion-pipe

Activate the venv:
source venv/bin/activate

Start training using the following command:
NCCL_P2P_DISABLE="1" NCCL_IB_DISABLE="1" deepspeed --num_gpus=1 train.py --deepspeed --config examples/hunyuan_video.toml
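To avoid retyping these steps every run, they can be wrapped in a small script (a hypothetical train.sh that only reuses the paths and command above):

#!/usr/bin/env bash
# Launch diffusion-pipe LoRA training for Hunyuan Video inside WSL2
cd /mnt/e/HunyuanTrain/diffusion-pipe
source venv/bin/activate
NCCL_P2P_DISABLE="1" NCCL_IB_DISABLE="1" deepspeed --num_gpus=1 train.py --deepspeed --config examples/hunyuan_video.toml

Run it with: bash train.sh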

@theycallmeloki

Where do we provide the trigger word? I don't see a provision for it anywhere in the config 🤔


PGCRT commented Dec 25, 2024

The trigger word is defined by the .txt captions. For example, if you have photos of a dog and the captions include the word "dog," that word will serve as the trigger word. You can also use special characters or modify the trigger word (e.g., "d0g"), placing it at the beginning of the caption, to get better accuracy relative to your dataset during inference.
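For example, a caption layout might look like this (file names are hypothetical; each image gets a .txt caption next to it with the same base name):

/mnt/e/Dataset/img/1_faces/photo001.png
/mnt/e/Dataset/img/1_faces/photo001.txt

with photo001.txt containing something like:

d0g, a close-up photo of a dog sitting in the grass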

Of course, this can be automated.

2024-12-25.19-42-38.mp4 (video attachment)

You can download the workflow here if you want:
workflow - 2024-12-25T194346 238 (attachment)
